mviereck / x11docker

Run GUI applications and desktops in docker and podman containers. Focus on security.
MIT License

JOSS: Recommended reference / citation for x11docker in academic context #92

Closed nuest closed 5 years ago

nuest commented 5 years ago

Hi! Very useful tool, thanks for your work. I'm in the process of using/recommending it in a scientific article about using containers for preservation of computational workflows. I did not find a preferred way to cite this tool, and this might not be relevant for you at all. Nevertheless, I'd be happy to help if you're open to making this tool "citable", thus making it traceable who in research uses it and writes about it.

There are two ways to achieve this:

mviereck commented 5 years ago

Thank you, I am honored that you want to cite my project in a scientific article!

I am currently trying to write a short description for readers who are not familiar with the background of Docker and X. Not an easy task, as I am blind to the perspective of someone who knows neither of them. ;-)

This description would be useful for both ways, Zenodo or JOSS, and would also be a good introduction in the README.md. If you already have an idea or suggestion for a short and general description, please let me know!

A first attempt:


x11docker allows to run graphical applications in docker linux containers.

This can help to run or deploy software that is difficult to install on several systems due to dependency issues. It is possible to run outdated versions or latest development versions side by side. Software can be installed in a deployable docker image with a rudimentary linux system inside. x11docker only supports linux containers. It is possible to run linux containers on a Windows host, too.

nuest commented 5 years ago

For newcomers, I would suggest to include more links in the description, ideally to Wikipedia pages since they are usually at a level addressing non-experts, and they tend to stay available. "Docker", "Linux containers", "kernel namespace", "X server", "graphical applications", "virtual machine", "X security leaks" ... a lot of lingo :-) Maybe (!) you know a good website giving an overview on the security risks that you could link to here?

Other than that I like it! Better to keep it clear and simple and use links for details.

Would you like me to make some suggestions on the first paragraph, or rather do a further iteration yourself?

Do you have a clear preference of JOSS vs. Zenodo?

mviereck commented 5 years ago

I would suggest to include more links in the description, ideally to Wikipedia pages

I've included several links. I was not sure links would be allowed in the paper.

Maybe (!) you know a good website giving an overview on the security risks that you could link to here?

The current link points to the security paragraph in README.md. It gives an overview of x11docker security settings and provides links to further background information.

Would you like me to make some suggestions on the first paragraph

That would be much appreciated! Don't hesitate to point out ugly use of the English language. I am well aware that all x11docker documentation would benefit from a critical spell check.

Do you have a clear preference of JOSS vs. Zenodo?

It would be nice to get into JOSS. Zenodo would be a fallback possibility if JOSS does not accept x11docker for some reason.

Second version:


x11docker allows to run graphical applications in Docker Linux containers.

Docker allows to install software in a deployable Docker image with a rudimentary Linux system inside. This can help to run or deploy software that is difficult to install on several systems due to dependency issues. It is possible to run outdated versions or latest development versions side by side. x11docker alleviates usage of Docker for end users.

x11docker runs on Linux and (with some setup) on Windows. x11docker is not adapted to run on macOS.

mviereck commented 5 years ago

I have created a paper.md for submission to JOSS. Would you mind having a look at it?

There is just one point in the submission requirements that I am not sure about:

The software should have an obvious research application

I am pretty sure x11docker can be useful in a scientific context, e.g. in development, deployment or in running quite old scientific software. But it may not be obvious from the current description. Do you have a clever idea how to build the bridge to the scientific context in paper.md so x11docker has an obvious research application?

nuest commented 5 years ago

I'll take a look - might take a few days though. Thanks for taking this up!

Re. scientific context: containerisation is broadly discussed as a means for better reproducibility, yet mostly for client/server-based applications (e.g. RStudio, Jupyter), so x11docker can close a gap there.

yxliang01 commented 5 years ago

@nuest I think your idea of publishing this in academia is cool! However, I think "containerisation is broadly discussed as a means for better reproducibility" is too general and not specific to x11docker. Maybe we can say:

Docker does not provide a display server that would allow running applications with a graphical user interface. While we can conveniently share the host display server with the container to get a GUI application in the container working, this sacrifices security and reproducibility. x11docker allows research prototypes with a GUI to be built quickly, with the display server set up in a container in a secure and reproducible fashion.

For now, I think the academic contribution of x11docker is making building research prototypes easier and faster.
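As an illustration of the plain host-sharing approach mentioned in the comment above: it is commonly done by passing `DISPLAY` into the container and bind-mounting the directory that holds the X11 Unix sockets. The sketch below only builds the `docker run` argument list and does not start anything. The helper name, the image name and the command are hypothetical placeholders, and this is the unprotected setup being criticised, not a description of what x11docker does internally.

```python
import os
import shlex

X11_SOCKET_DIR = "/tmp/.X11-unix"  # conventional location of X11 Unix sockets


def host_x_docker_args(image, command, display=None):
    """Build a `docker run` argv that shares the host X server with a container.

    Hypothetical helper for illustration only. The container gets
    unfiltered access to the host display, which is the security
    trade-off discussed in this thread.
    """
    display = display or os.environ.get("DISPLAY", ":0")
    return [
        "docker", "run", "--rm",
        "-e", f"DISPLAY={display}",                  # which display X clients should use
        "-v", f"{X11_SOCKET_DIR}:{X11_SOCKET_DIR}",  # share the host X socket directory
        image, *command,
    ]


if __name__ == "__main__":
    # Print the command instead of running it; the image name is a placeholder.
    print(shlex.join(host_x_docker_args("example/gui-image", ["xterm"])))
```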

nuest commented 5 years ago

@yxliang01 Thanks for the feedback - I tried to incorporate your perspective/statement of need in my rewrite of the article: #97

I added some context and related work. Let me know what you guys think!

eine commented 5 years ago

Hi @nuest! I have read your PR and I am concerned about the following sentence:

This allows a sandbox environment that fairly well protects the host system from possibly malicious or buggy software.

Containers in general, and docker in particular, are not sandboxes. Even if they can be set up to be less insecure, where x11docker does an amazing job, shouldn't we be cautious about using that term?

mviereck commented 5 years ago

Hi @1138-4EB , thank you for looking at this, too! You are right, "sandboxing" is something that is viewed quite critically by many people. Also it is not an important aspect for scientific use. It is probably wise to drop this sentence from paper.md. However, sandboxing is still part of README.md with some notes about sandbox weaknesses. Some thoughts of you about the sandbox chapter are quite appreciated!

Thank you @nuest for your work on this! Overall it looks good. I am still reading and thinking about it. Some points I've seen so far:

, because Docker it is originally built for server software.

It seems the "it" is a double of "Docker".

Docker does not provide a display server that would allow to run applications with graphical user interface (GUI), because Docker it is originally built for server software. The common way to provide is by providing a web server and rendering an HTML-based GUI in a common web browser, e.g. as notebooks [@jupyter2018binder].

x11docker fills this gap.

This part is less clear. After mentioning the webserver solution it is not obvious which gap x11docker is filling. Also it is not obvious who is running the webserver. I assume you mean a webserver running in container providing HTML5 access.

Maybe it is better to note alternatives to x11docker at the end. A VNC server within container is a possibility, too. Both solutions, VNC and HTML5, add some overhead within the container, and need some special docker command setup to forward TCP ports. An SSH server in container is possible, too. All those solutions need some setup skills. x11docker is easier to use and could provide some sort of unified setup.
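To make the "special docker command setup to forward TCP ports" concrete, here is a minimal sketch of how ports are typically published for a VNC or HTML5 server running inside a container. It only assembles the argument list. The helper name and image name are hypothetical, 5900 is the conventional port of VNC display 0, and binding to 127.0.0.1 is an assumption chosen to keep the server off the network.

```python
def published_gui_args(image, ports):
    """Build a `docker run` argv that publishes container TCP ports on localhost.

    `ports` is a list of (host_port, container_port) pairs. Hypothetical
    helper; the image must already contain and start the VNC/web server.
    """
    args = ["docker", "run", "--rm", "-d"]
    for host_port, container_port in ports:
        # -p HOST_IP:HOST_PORT:CONTAINER_PORT forwards one TCP port
        args += ["-p", f"127.0.0.1:{host_port}:{container_port}"]
    args.append(image)
    return args


if __name__ == "__main__":
    # Placeholder image name; 5900 is the conventional VNC port for display 0.
    print(" ".join(published_gui_args("example/vnc-image", [(5900, 5900)])))
```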

researcher's preferences (e.g. for a specific operating system),

x11docker is heavily tested on several Linux systems. For MS Windows I currently have feedback from @1138-4EB only. That means, either it causes no issues, or no one else uses it on Windows :). So far, I am not sure how reliably x11docker works on Windows. x11docker is not adapted to macOS yet. I am considering buying an old macbook just for adapting x11docker. Now after X-Mas there are several cheap old macs on the market.

I assume if x11docker runs reliably on Linux, Windows and macOS with same start command on all systems, it provides a unified and simplified access to reproducible scientific containers. This also would be true for non-GUI applications that could benefit e.g. from container user setup, shared files and printer access. (Printer access is not implemented for Windows yet. Though, I have an idea for it.)

nuest commented 5 years ago

Re. sandboxing: I did not take into account the various interpretations that readers might have with this term, so I'd be happy to see this rephrased. Suggestions?

It seems the "it" is a double of "Docker".

Correct!

I assume you mean a webserver running in container providing HTML5 access.

Yes, I do. Mentioning VNC server as an alternative is a good idea.

Re. operating systems: I think it is fair to say that there is limited Windows support, but of course be transparent. Would you like to rephrase this?

[I have an extra Windows 10 machine and could run some tests (which ones? is there a list of images that covers a good amount of features?).]

Let me know if you'd like to do the next round yourself or if I should make an update.

mviereck commented 5 years ago

[I have an extra Windows 10 machine and could run some tests (which ones? is there a list of images that covers a good amount of features?).]

Possible features to test would be --gpu and --pulseaudio. More important is to know whether x11docker works reliably at all. I don't see an urgent need for special tests. If I have an idea for important tests I'll let you know. Edit: You could set up an arbitrary running x11docker session and note at which points you have setup issues to solve, be it installation, giving permissions or whatever. I could then improve the documentation on how to set up docker-Xserver-bash-x11docker overall.
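One way to organise such a test round is to list every combination of the two options named above and try them one by one. The sketch below only generates the command lines. It assumes the invocation form `x11docker [OPTIONS] IMAGE [COMMAND]`, and the image and command names are placeholders, not a recommended test set.

```python
from itertools import combinations

FEATURE_FLAGS = ["--gpu", "--pulseaudio"]  # the options named in the comment above


def smoke_test_commands(image, command):
    """Return one x11docker argv per subset of FEATURE_FLAGS, fewest flags first.

    Assumes the form `x11docker [OPTIONS] IMAGE [COMMAND]`; image and
    command are supplied by the tester.
    """
    commands = []
    for size in range(len(FEATURE_FLAGS) + 1):
        for flags in combinations(FEATURE_FLAGS, size):
            commands.append(["x11docker", *flags, image, *command])
    return commands


if __name__ == "__main__":
    # Placeholder image and command; print instead of executing.
    for argv in smoke_test_commands("example/gui-image", ["some-gui-app"]):
        print(" ".join(argv))
```

Starting with the run that has no extra options matches the point made above: whether a basic session works at all matters more than the individual features.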

Let me know if you'd like to do the next round yourself or if I should make an update.

I'll do the next round. :)

yxliang01 commented 5 years ago

@nuest I think #97 is great! I have made some potential improvements (to me) to the paper in #98 . I think it would be helpful if we put any research prototypes or research-related open source projects as use cases into the paper if there are any. I think it would outweigh lots of paragraphs in the paper :) But, I am not aware of any yet.

Also, in the paper,

A container is similar to a virtual machine, but needs less resources.

It is common for people to use virtual machines as an analogy to containers. But I think that saying only "it is similar, but needs less resources" doesn't really capture the main differences between the two in a scientific context (fully isolated kernels vs. a shared kernel, etc.). On the other hand, if we talk about the differences too much, it might defeat the purpose of using the virtual machine as an analogy. Therefore, I propose removing this sentence. Please let me know what you think.

nuest commented 5 years ago

Regarding references to other projects: I am not aware of anything similar to x11docker, would of course be good to add connected works. Let us know if you find something.

Regarding VM comparison: I would be happy to go without that comparison, the references do explain those details.

@mviereck I forgot to add one important reference, already in the bib: The [] should be "make it a promising candidate to increase computational reproducibility and reusability of research analyses [@boettiger_introduction_2015]".

yxliang01 commented 5 years ago

@nuest Updated #98 as per https://github.com/mviereck/x11docker/issues/92#issuecomment-454087944

mviereck commented 5 years ago

Thank you, @yxliang01, for looking at this, too! I've merged your commit and made an update of paper.md, too, including your changes.

Mainly I changed the order of the content.

Regarding VM comparison: I would be happy to go without that comparison, the references do explain those details.

I have reintroduced a variation of the sentence because I think it helps to compare containers with the more familiar virtual machines. If you still think it should be removed, ok.

A container is similar in usage to a virtual machine, but needs less resources. The technical concept, however, is completely different.

I forgot to add one important reference

It is added now.

Regarding references to other projects: I am not aware of anything similar to x11docker, would of course be good to add connected works.

One I am aware of is subuser (on github). It uses Docker containers to isolate applications and uses xpra to isolate from host X. But its concept is quite different from x11docker. It is rather some sort of package manager. Also, firejail comes to mind. I think, both subuser and firejail are less useful for the intended deployment of reproducible research containers. Their design rather targets integration of regular desktop applications. I admit, I am not familiar with either of them. They might be more useful than I think.

x11docker thereby facilitates quick creation, distribution, and evaluation of research prototypes without compromising on a researcher's preferences (e.g. for a specific operating system), skills (not imposing browser-based GUI nor requiring command-line proficiency), domain (having e.g. established and widely-acknowledged GUI-based tools), security, computational reproducibility, or a scholarly review process.

This sentence is hard for me to understand. However, it sounds impressive :-). I am not sure whether it should be written more simply.

Mostly new:

Alternatives to x11docker: A common way to allow GUI applications in containers is by providing a web server within the container and rendering an HTML-based GUI in a common web browser, e.g. as notebooks [@jupyter2018binder]. Further possibilities are a VNC server, SSH server or xpra server within the container. These solutions require some specific setup and provide a rather slow interaction due to a lot of network data transfer. x11docker provides a unified setup and a fast interaction due to direct access of GUI applications to the X display server.

x11docker runs on Linux and with few limitations on MS Windows. Support for macOS is scheduled. It has its own graphical frontend, x11docker-gui, and can be configured to access the host machine's GPU, webcam, or audio system. It also allows access to files created by container applications.

This sounds a bit confusing to me, but maybe only due to my limited English:

, e.g. as notebooks [@jupyter2018binder]

Maybe writing it:

, e.g. like in notebooks [@jupyter2018binder]

eine commented 5 years ago

Hi @1138-4EB , thank you for looking at this, too!

Always glad to help!

However, sandboxing is still part of README.md with some notes about sandbox weaknesses. Some thoughts of you about the sandbox chapter are quite appreciated!

I'm a hardware guy, so my knowledge about software security is limited. However, IMHO, the best approach is to avoid using the term, and describe the features instead. Those who know what a sandbox is will see the similarities and will ask themselves why the term is avoided. At the same time, users who have heard the term but are not sure about its meaning will search for docker sandbox or x11docker sandbox and they will find proper sources (such as the README or this issue). So, we are providing information by omission, because of the context.

That is, it might be sensible to modify/remove:

Additionally, x11docker has a specific security setup to enhance container isolation from host system and to avoid X security leaks.

x11docker provides several features to enhance container isolation from the host. Although X security is one of the concerns, it is not the only one.

@1138-4EB only. That means, either it causes no issues, or no one else uses it on Windows :). So far, I am not sure how reliably x11docker works on Windows.

I've been intensively using it and so far it works pretty well. However, I've been using v5.2.0 these last months. Today I tried >=v5.3.0 and some bug has been introduced. I'll open an issue about it.

This also would be true for non-GUI applications that could benefit e.g. from container user setup, shared files and printer access. (Printer access is not implemented for Windows yet. Though, I have an idea for it.)

I think it'd be interesting to add this sentence to paper.md. Maybe after the sentence commented above (Additionally, x11docker has a specific security setup...).


x11docker is not adapted to macOS yet. I am considering buying an old macbook just for adapting x11docker. Now after X-Mas there are several cheap old macs on the market.

Have you considered adding some note to the README or some pinned issue to let users know that you are willing to accept the donation of some old macbook? You might also enable some crowdfunding if you find any interesting sale.

eine commented 5 years ago

I did the changes suggested in the previous comment and opened a PR: #101.

@nuest, @mviereck how can I compile the markdown source and the bib file to a HTML or PDF? Do you use pandoc?

mviereck commented 5 years ago

@1138-4EB Thank you very much! I'll look closer tomorrow.

how can I compile the markdown source and the bib file to a HTML or PDF?

I don't know. The files are expected as markdown source by JOSS.

Have you considered adding some note to the README or some pinned issue to let users know that you are willing to accept the donation of some old macbook? You might also enable some crowdfunding if you find any interesting sale.

That is a good idea!

nuest commented 5 years ago

I tried to render the PDF using the JOSS toolchain, and did not succeed. When you submit, a bot will provide a rendered version for the review, and I think that's enough most of the time.
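For a rough local preview outside the JOSS toolchain, pandoc can render a markdown file together with a BibTeX bibliography. The sketch below only assembles the command line. The file names paper.md and paper.bib are the ones used in this thread, the output name is an assumption, and the result will not match the JOSS layout. `--citeproc` is built into pandoc 2.11 and later; older releases used the separate `pandoc-citeproc` filter, which the `legacy_citeproc` switch selects. PDF output additionally requires a LaTeX engine to be installed.

```python
def pandoc_args(source="paper.md", bibliography="paper.bib",
                output="paper.pdf", legacy_citeproc=False):
    """Build a pandoc argv that resolves [@key] citations from a BibTeX file.

    Illustrative helper, not part of the JOSS toolchain. Use
    legacy_citeproc=True for pandoc releases older than 2.11.
    """
    citeproc = ["--filter", "pandoc-citeproc"] if legacy_citeproc else ["--citeproc"]
    return ["pandoc", source, *citeproc,
            "--bibliography", bibliography,
            "-o", output]  # output format is inferred from the file extension


if __name__ == "__main__":
    print(" ".join(pandoc_args()))                     # PDF preview
    print(" ".join(pandoc_args(output="paper.html")))  # HTML preview
```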

nuest commented 5 years ago

We should add a reference to https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0152686

See also https://github.com/WebDataScience/GUIdock

At a first glance I don't know if they are doing the same thing or where projects differ.

Citation for paper.bib


@article{hung_guidock:_2016,
    title = {{GUIdock}: {Using} {Docker} {Containers} with a {Common} {Graphics} {User} {Interface} to {Address} the {Reproducibility} of {Research}},
    volume = {11},
    issn = {1932-6203},
    shorttitle = {{GUIdock}},
    url = {http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0152686},
    doi = {10.1371/journal.pone.0152686},
    abstract = {Reproducibility is vital in science. For complex computational methods, it is often necessary, not just to recreate the code, but also the software and hardware environment to reproduce results. Virtual machines, and container software such as Docker, make it possible to reproduce the exact environment regardless of the underlying hardware and operating system. However, workflows that use Graphical User Interfaces (GUIs) remain difficult to replicate on different host systems as there is no high level graphical software layer common to all platforms. GUIdock allows for the facile distribution of a systems biology application along with its graphics environment. Complex graphics based workflows, ubiquitous in systems biology, can now be easily exported and reproduced on many different platforms. GUIdock uses Docker, an open source project that provides a container with only the absolutely necessary software dependencies and configures a common X Windows (X11) graphic interface on Linux, Macintosh and Windows platforms. As proof of concept, we present a Docker package that contains a Bioconductor application written in R and C++ called networkBMA for gene network inference. Our package also includes Cytoscape, a java-based platform with a graphical user interface for visualizing and analyzing gene networks, and the CyNetworkBMA app, a Cytoscape app that allows the use of networkBMA via the user-friendly Cytoscape interface.},
    number = {4},
    urldate = {2016-04-26},
    journal = {PLOS ONE},
    author = {Hung, Ling-Hong and Kristiyanto, Daniel and Lee, Sung Bong and Yeung, Ka Yee},
    month = may,
    year = {2016},
    keywords = {Computer software, bioinformatics, Operating Systems, Reproducibility, Software tools, Graphical user interface, Gene regulatory networks, Systems biology},
    pages = {e0152686}
}

sgyzetrov commented 5 years ago

Says here in their paper, I quote:

Containers have not been used to distribute the normal GUI based workflows that users are accustomed to.

So it looks like they are the first to come up with X Windows software layer configured Docker using the following solution:

The solution adopted by GUIdock is to pass the container X Windows commands to a host X Window emulator which renders the GUI.

Although the GitHub repo no longer seems to be active after their paper got published...

eine commented 5 years ago

At a first glance I don't know if they are doing the same thing or where projects differ.

According to section GUIdock: on Microsoft Windows operating systems:

We use a lightweight application, MobaXterm [40] for this purpose. Although MobaXterm is proprietary, a full-featured free version is available for download at http://mobaxterm.mobatek.net/download.html. MobaXterm, provides X Windows support and supports ssh (secure shell) tunneling. (...) Using MobaXterm, we set up X11 forwarding using ssh to connect the Docker container with a MobaXterm terminal. The GUI commands pass through the ssh tunnel to the MobaXterm X Windows emulator which renders the GUI on the host system.

So, their approach seems to require an SSH server/daemon to be running inside the container. It is likely that MobaXterm does not explicitly provide the ability to run a custom X server. Instead, the interface is ssh -X. But I am not sure about this, since I have not used it and the sources are not available. If so, the advantages of x11docker are i) not requiring additional dependencies in the image, and ii) better performance.

Moreover, in section GUIdock: on Mac OS they explain that they use an X server on the host, and socat in order to bind it to a TCP port in the container. Therefore, I think that they didn't know/try vcxsrv or cygwin/x. Otherwise, the solution on windows would probably have been this, which is what x11docker does.
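Both TCP-based setups described above rely on the X11 convention that display number N is reachable on TCP port 6000 + N, and that clients find the server through a `DISPLAY` value of the form `host:display.screen`. The following sketch parses such a value and derives the port. The function names are hypothetical, and bracketed IPv6 hosts are deliberately not handled.

```python
import re

X11_TCP_BASE_PORT = 6000  # X11 convention: display N listens on TCP port 6000 + N

# host part (may be empty for local displays), display number, optional screen
_DISPLAY_RE = re.compile(r"^(?P<host>[^:]*):(?P<display>\d+)(?:\.(?P<screen>\d+))?$")


def parse_display(value):
    """Split a DISPLAY value into (host, display_number, screen_number)."""
    match = _DISPLAY_RE.match(value)
    if match is None:
        raise ValueError(f"not a DISPLAY value: {value!r}")
    return match["host"], int(match["display"]), int(match["screen"] or 0)


def tcp_port_for(value):
    """Return the TCP port an X server for this DISPLAY value would use."""
    _, display, _ = parse_display(value)
    return X11_TCP_BASE_PORT + display


if __name__ == "__main__":
    # The address is a placeholder for whatever host address the container can reach.
    print(parse_display("192.168.65.2:0"), tcp_port_for("192.168.65.2:0"))
```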

Overall, I think that more effort is put in presenting the example images as complete, ready-to-use and useful products, rather than in the technical details of the alternatives and the chosen solution in each platform. Hence, the target audience are researchers in systems biology.

mviereck commented 5 years ago

About paper.md of x11docker: Overall I think it looks pretty good and we can submit it to JOSS soon. At JOSS it will be reviewed, and probably some minor changes will come up.

Two points I am not sure about:

Other similar projects are subuser and firejail. However, these rather target integration of regular desktop applications, i.e. their design is closer to some sort of package manager.

Maybe we could drop that altogether. firejail and subuser are not exactly matching the target. I would not like to cite GUIdock, see below. I might be biased, but I don't see a real alternative to x11docker except for custom setups like those described in x11docker wiki. (I might add some paragraphs for custom setups on Windows and macOS, too).

x11docker has its own (optional) graphical frontend, x11docker-gui, and runs on GNU/Linux and with few limitations on MS Windows. Support for macOS is scheduled.

Maybe it should be said that running x11docker in a Linux VM on macOS and Windows is fully supported. I cannot guarantee it for running natively on Windows (and maybe in future on macOS) as issues might come up that I cannot fix at all. Compare issues #104 and #108, where x11docker cannot do anything about the problem and weird fixes by the user are needed.


Re: GUIdock

Overall, I think that more effort is put in presenting the example images as complete, ready-to-use and useful products, rather than in the technical details of the alternatives and the chosen solution in each platform.

That is nicely put. :) I had a look at the code of GUIdock. To be honest, it is crap. The developers spent a lot of effort in designing images and writing an impressive paper. But the code is barely usable. Looking at the Linux code:

The scripts for macOS and Windows do not look better. It seems to me that the project aims to create an impressive reputation without having any substantial content. Instead, if the developers had written a short blog post explaining basic setups for X access on Linux, Mac and Windows, they would have done valuable work.

I think GUIdock should not be honored with a citation in paper.bib.

So it looks like they are the first to come up with X Windows software layer configured Docker using the following solution:

There is a thread on stackoverflow from 2013 showing some setup examples using X from host. subuser was published in 2014. x11docker 1.0 was published in 2015. GUIdock was published in 2016, shortly after an answer on stackoverflow for solutions on Windows and macOS. So they haven't been the first with this solution, but the first to publish a paper about it.

eine commented 5 years ago

Today I found katacontainers.io (https://www.youtube.com/watch?v=vK_gdy2kdPM), since it was the default runtime in a server I was using. I think that it might be worth a reference from the security/sandboxing point of view.

Furthermore, I wonder if it is worth testing x11docker with kata.

mviereck commented 5 years ago

I have submitted to JOSS and am awaiting the start of the review process. http://joss.theoj.org/papers/7ff985f2699880f77b86209b98c0d98d

I think that it might be worth a reference from the security/sandboxing point of view.

The sandbox aspect is no longer part of paper.md, and kata does not seem to support GUI applications. So I don't think a reference makes sense yet. But thanks for the hint!

mviereck commented 5 years ago

JOSS review ticket: https://github.com/openjournals/joss-reviews/issues/1346

eine commented 5 years ago

I think that it might be worth a reference from the security/sandboxing point of view.

The sandbox aspect is no longer part of paper.md, and kata does not seem to support GUI applications. So I don't think a reference makes sense yet. But thanks for the hint!

Agree. I was not thinking of adding it to paper.md explicitly, but rather do it anywhere in the repository/wiki where sandboxing is discussed. I think that the discussion is quite simple:

If you want proper sandboxing, kata illustrates the complexity that is involved. x11docker does the best it can with off-the-shelf resources in the standard docker runtime (runc).

About GUI applications and kata, did you actually try it or is there any specific reason why you think that it is not supported? I ask it because kata is expected to replace runc, so the docker run command does not change at all. Therefore, it should be possible to share an X server from the host through a TCP port. For example, sharing folders with -v works as expected.

mviereck commented 5 years ago

Let's discuss kata in #138. It is not related to the JOSS submission yet.

mviereck commented 5 years ago

The paper is finally accepted into JOSS: DOI

:-)

Much thanks @all!