NeuroDesk / neurodesktop

The plug-and-play, browser-accessible, containerised data analysis environment.
https://www.neurodesk.org
MIT License

RDP and VNC do not work after I stop and start the neurodesktop docker container #57

Closed civier closed 3 years ago

civier commented 3 years ago

Dear @aswinnarayanan ,

I had the pleasure to play with neurodesktop, and I must say that I'm delighted. You really did an impressive job of putting so much functionality into it. In a sense, putting all of VNC inside a container :-) Well done!

It works very smoothly, and maybe the only issue I found until now is that RDP and VNC do not work after I restart the neurodesktop container (running "docker stop " and then "docker start "). I'm not sure if the problem is specific to NECTAR instances, where I run Neurodesktop, so you're welcome to try it out also on other platforms. Notice that the command line interface (SSH) does work after a container restart.

We found that restarting the container is important functionality for us: sometimes things stop working (as on any computer), and the only solution that preserves the state of the container is restarting it. I also used the restart functionality when I set up VNM on NECTAR at Swinburne: after I ran some Swinburne-specific configuration inside a running VNM container, I stopped the container and created a snapshot of the NECTAR instance. When somebody then requested access to VNM from me, I simply launched a new instance from that snapshot and started the stopped VNM container. I want to do the same with Neurodesktop, but for that, I need the start/stop functionality to work.

Thanks ahead for your assistance, Oren

civier commented 3 years ago

When I say that it doesn't work, I mean that when I try to connect from the browser, I get this screen:

[Screenshot: Screen Shot 2021-10-13 at 9 59 15 am]

Pressing Home, Reconnect or Logout (and then connecting again) does not help.

stebo85 commented 3 years ago

Dear @civier

You shouldn't stop and start the container or keep the container long term. The neurodesktop container is designed to be ephemeral and needs to be updated regularly.

The issues you see are expected and are not so easy to address.

stebo85 commented 3 years ago

@aswinnarayanan, this issue could be related to the standby bug that @air2310 reported.

aswinnarayanan commented 3 years ago

@stebo85, it sounds related. I'll have a look into this.

civier commented 3 years ago

> Dear @civier
>
> You shouldn't stop and start the container or keep the container long term. The neurodesktop container is designed to be ephemeral and needs to be updated regularly.
>
> The issues you see are expected and are not so easy to address.

Did you design the neurodesktop container to be ephemeral because making it persistent would require much more investment? Or is it that the use cases you considered simply do not require persistency? At Swinburne, we have a use case that does require persistency (as I'll be managing the Neurodesktop containers for all our users, and updating them regularly might be too costly), and other sites might require persistency as well.

We also need to decide if the current version of Neurodesktop should accommodate expert users who install a lot of software inside the container (in which case reinstalling everything every time you update/run Neurodesktop might not be practical, even if you have a script that does it), or only novice users who do not install software themselves. Regarding this point, this part of the documentation is relevant as well: https://neurodesk.github.io/docs/neurodesktop/whats-next/#how-to-keep-your-modifications-in-the-container

I tag @TomEmotion, @willw1, @DavidjWhite33, so they can join the conversation.

stebo85 commented 3 years ago

Dear @civier,

Containers are not designed to be persistent - they are designed to be ephemeral. I do not see a feasible way of achieving a use-case where one would keep the container for a long time and modify settings and tools inside the container. If you do see a way of doing this you are welcome to contribute a proof of concept.

Kind regards Steffen

civier commented 3 years ago

> Dear @civier,
>
> Containers are not designed to be persistent - they are designed to be ephemeral. I do not see a feasible way of achieving a use-case where one would keep the container for a long time and modify settings and tools inside the container. If you do see a way of doing this you are welcome to contribute a proof of concept.
>
> Kind regards Steffen

Dear Steffen,

I'm not an expert in Docker at all, but I'd be happy if you could refer me to information on the notion that "Containers are not designed to be persistent". I thought that there are now persistent web services that are shipped in containers and are designed to run continuously, but please correct me if I'm wrong.

Always happy to learn :-) Oren

stebo85 commented 3 years ago

Dear @civier,

No, containers in web services have a short lifespan. They are continuously updated and run only for short periods. The illusion of a persistent web service is created by orchestration engines like Kubernetes, which manage the lifecycle of containers. A good starting point for understanding this is: https://en.wikipedia.org/wiki/Kubernetes

I hope this helps Cheers Steffen

aswinnarayanan commented 3 years ago

Hi @civier. Will have a look into the VNC/RDP sleep issues, as it could be some underlying bug.

However, I have to echo @stebo85. I think containers are by design stateless and ephemeral. I'm not sure that our team can address the various issues that arise from running a container persistently. The industry standard has trended towards ephemeral containers connected to persistent storage for statefulness, and neurodesktop is in line with this design. Also, unless it's running some job, an interactive desktop that runs persistently uses up the underlying cloud resource while the desktop is not being used.

Running a container persistently also runs counter to reproducibility. If the container can't be easily restarted, it is by definition less reproducible. If the container is easily reproducible, it doesn't need to run persistently and only needs to be started up when required.

Expert users can extend the Dockerfile or, as stebo85 pointed out, can use docker commit to save and reuse their customised container.

There are too many articles on containers and running them ephemerally to recommend any one in particular. The best bet would be to look up a few of them.

> At Swinburne, we have a use case that does require persistency (as I'll be managing the Neurodesktop containers for all our users, and updating them regularly might be too costly), and other sites might require persistency as well.

I didn't quite understand this, particularly why the container has to be persistent. Could you, for example, save a snapshot of the cloud VM with the correct data mounts and Docker pre-installed, so that the user runs the single-line command that starts up their neurodesktop when they need it? (I am aware that the container starts slowly, but this is likely just a bug which can be resolved.)
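
The pattern described above (a throwaway container attached to persistent storage, started on demand with a single command) could be sketched roughly as follows. This is only an illustration, not the project's official launcher: the function name `fresh_neurodesktop` and the `DOCKER` and `STORAGE` variables are assumptions introduced here, while the `docker run` flags are copied from the dev command posted further down in this thread.

```shell
# Sketch only: recreate the desktop container on every launch and keep
# state in the bind-mounted storage directory on the host.
DOCKER="${DOCKER:-docker}"                        # assumption: overridable for dry runs
STORAGE="${STORAGE:-$HOME/neurodesktop-storage}"  # host directory that persists

fresh_neurodesktop() {
  image="$1"   # e.g. vnmd/neurodesktop-dev:latest
  name="$2"    # e.g. neurodesktop-dev
  # Remove any previous container with this name; ignore "no such container".
  "$DOCKER" rm -f "$name" >/dev/null 2>&1 || true
  # Same flags as the command in this thread; only the volume outlives the container.
  "$DOCKER" run \
    --shm-size=1gb -it --privileged --name "$name" \
    -v "$STORAGE:/neurodesktop-storage" \
    -e HOST_UID="$(id -u)" -e HOST_GID="$(id -g)" \
    -p 8080:8080 -h "$name" \
    "$image"
}

# Usage: fresh_neurodesktop vnmd/neurodesktop-dev:latest neurodesktop-dev
```

Anything written outside `/neurodesktop-storage` is lost when the container is removed, which is the trade-off being debated in this thread.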

civier commented 3 years ago

> Dear @civier,
>
> no, containers have a short lifespan in webservices. They are continuously updated and run only for short amounts of time. The persistent webservice illusion is created using Orchestration engines like Kubernetes that manage the lifecycle of containers, a good starting point to understand this is here: https://en.wikipedia.org/wiki/Kubernetes
>
> I hope this helps Cheers Steffen

Thanks @stebo85. BTW, what is it in the technology/design of Docker that does not fit well with persistency? We used the VNM as a persistent desktop for months with hardly any problems. And if something did stop working, we just restarted the container (docker stop and start) and all was good.

civier commented 3 years ago

Hi @aswinnarayanan

Regarding your question:

> At Swinburne, we have a use case that does require persistency (as I'll be managing the Neurodesktop containers for all our users, and updating them regularly might be too costly), and other sites might require persistency as well.
>
> I didn't quite understand this, particularly why the container has to be persistent. Can you, for example, save a snapshot of the cloud VM with the correct data mounts and docker pre-installed. Then the user runs the single-line command that starts up their neurodesktop when they need it? (Am aware that the container starts slowly, but this is likely just a bug which can be resolved)

The issue is that I need to make some Swinburne-specific amendments within the container. So every time you release a new version of Neurodesktop, I plan to make them manually, test, stop the container and then take a snapshot. This way I know that everything works well. It is true that I could make my changes using a script that runs inside the container every time a user runs neurodesktop, but this would require testing the script on every new version of neurodesktop, and thus more of my time. In other words, it is not impossible for me to do what you suggest, but it would just be more costly.

stebo85 commented 3 years ago

Dear @civier,

To answer your question:

> BTW, what is it in the technology/design of docker that does not fit well with persistency?

Docker containers are designed to be built from a Dockerfile and then not changed during runtime. There are no good mechanisms in Docker to store run-time changes of a container (except docker commit, which cannot be recommended), as doing so would lead to unreproducible containers.

> We used the VNM as a persistent desktop for months with hardly any problem. And if something stopped working after all, we just restarted the container (docker stop and start) and all good.

Yes, it can work, but you NEED to be aware that you are using it outside of the specs and you will always end up with problems that we will not be able to support you in.

> The issue is that I need to do some Swinburne-specific amendments within the container. So every time you release a new version of Neurodesktop, I plan to do them manually, test, stop the container and then take a snapshot. This way I know that everything works well. It is true that I can do my changes using a script that I will run inside the container every time a user runs neurodesktop, but this will require to test the script on every new version of neurodesktop, thus, more of my time. In other words, it is not impossible for me to do what you suggest, but will just be more costly.

What are these changes?

aswinnarayanan commented 3 years ago

Hi @civier. I've applied some changes that may resolve this VNC/RDP issue. Can you please test on the latest neurodesktop-dev and let me know? You'll need to run the following commands to start up the dev version:

docker pull vnmd/neurodesktop-dev:latest
sudo docker run \
  --shm-size=1gb -it --privileged --name neurodesktop-dev \
  -v ~/neurodesktop-storage:/neurodesktop-storage \
  -e HOST_UID="$(id -u)" -e HOST_GID="$(id -g)" \
  -p 8080:8080 -h neurodesktop-dev \
  vnmd/neurodesktop-dev:latest

civier commented 3 years ago

What do you mean by the latest neurodesktop-dev? Did you upload it to Docker Hub (what is the image name exactly?), or should I build it from the Neurodesktop GitHub repository?

Oren

On Wed, 13 Oct 2021, 15:20 Aswin Narayanan wrote:

> Hi @civier. I've applied some changes that may resolve this VNC/RDP issues Can you please test on the latest neurodesktop-dev and let me know


aswinnarayanan commented 3 years ago

It's on Docker Hub. It's accessible using the command in the previous message.

civier commented 3 years ago

> Also (unless its running some job), running an interactive desktop persistently also uses up the underlying cloud resource while the desktop is not being used.

Hi @aswinnarayanan

Regarding the above comment you made -- a NECTAR instance for example can be shelved (effectively, creating a snapshot of it) and then restored. Right now it is not simple for users to do it on their own, but I guess it can be automated. This way users would be able to free the resources of their instance, and return to work on it later. But this of course will require to stop and then start Neurodesktop.

I'm pretty sure that something like this is already automated in commercial cloud services. Can you do it easily on the Oracle cloud that you're using? Maybe @stebo85 knows.

TomEmotion commented 3 years ago

Hi all. Oren: I think we can discuss our use cases in-house and see how best to manage them with the container in ephemeral mode. There are many reasons, I believe, to configure as many Swinburne-specific (or for that matter, site-specific) aspects as possible through scripts that reside on persistent storage, or through settings of the host Nectar instance.

Where we confront specific problems, that will usefully feed into development of the platform (e.g. by identifying any underlying gaps in its functionality).

One question I do have, though, Steffen and Aswin: when you say "The neurodesktop container is designed to be ephemeral and needs to be updated regularly", I assume that any such updates are flagged (in a way that users can easily be made aware of) and that users can always continue using older versions as long as they wish (an essential requirement for reproducibility, unless we are acting on the assumption that only the containers and conda environments within Neurodesktop need to be strictly reproducible).

Final note: although quick & responsive is fantastic, my input to these discussions will invariably be slower and will rarely if ever happen outside regular work hours. So for non-urgent discussions, at least, please be patient with me :-)

TomEmotion commented 3 years ago

Please ignore my question regarding the updating - I see it is addressed in the Documentation. Thanks.

aswinnarayanan commented 3 years ago

> users can always continue using older versions as long as they wish (as essential requirement for reproducibility, unless we are acting on the assumption that only the containers and conda environments within Neurodesktop need to be strictly reproducible).

Hi @TomEmotion, @civier. Just my take on this topic.

The neurocontainers offer a significant leap in reproducibility over natively installed software and libraries, and it makes sense to pin these software container versions for pipelines (unless you need the fixes or features in the updated version).

But the reproducibility of neurodesktop lies more in the ability to relaunch any historical version when this is required. For usual work (developing pipelines and workflows), I think it is recommended to update the neurodesktop container within a reasonable update window.

Pinning the neurodesktop version is better suited to publishing or sharing a neurodesktop with collaborators, to specific workflows that are very sensitive to the environment, and to rolling back for debugging and testing purposes. This level of reproducibility comes at the significant cost of losing new features, bug fixes and security updates. And because the software packages themselves are already containerised, the trade-off is less worthwhile for a general user.
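
Relaunching a historical version, as described above, comes down to pulling an explicit tag instead of `:latest`. A minimal sketch follows, assuming the releases are published under tags on Docker Hub; the helper names `pinned_image` and `pull_pinned` are made up for this illustration, and the tag in the usage line is a hypothetical placeholder, not a real release.

```shell
DOCKER="${DOCKER:-docker}"   # assumption: overridable so the command can be dry-run

# Build an image reference for a specific release instead of ":latest".
pinned_image() {
  printf '%s:%s\n' "$1" "$2"
}

# Pull the pinned release so the same desktop can be relaunched later.
pull_pinned() {
  "$DOCKER" pull "$(pinned_image "$1" "$2")"
}

# Usage (hypothetical tag): pull_pinned vnmd/neurodesktop-dev 20211013
```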

We're trying to figure out better ways of communicating information on new versions. There is an open issue https://github.com/NeuroDesk/neurodesktop/issues/50. Happy to accept help on the issue.

stebo85 commented 3 years ago

Dear all,

I fully agree with what Aswin wrote, but I can add one more little clarification: the neurodesktop container is not essential for reproducing results, as everything science-critical is in Singularity subcontainers. These subcontainers do not change with neurodesktop versions. The desktop container has to be updated very often to make sure that new containers get added and to ensure that security patches are applied. Therefore the desktop container should not be changed during runtime. If there is anything you do need to adjust, please let us know and we can think about a good way of achieving this.

Cheers Steffen

civier commented 3 years ago

> The desktop container has to be updated very often to make sure that new containers get added

Dear @stebo85

Can you clarify why a neurodesktop update is necessary for new containers to be added? Can't we come up with a mechanism that adds them dynamically? I think that already now there is no issue with using new containers through the command line (correct me if I'm wrong), so we can probably also find a solution for the menu entries. Can't we?

Best, Oren

stebo85 commented 3 years ago

@civier - we HAD a dynamic way of updating the containers, you argued against it, and we agreed that it's better for reproducibility to fix the set of containers per neurodesktop version. Now this update has to be run manually, because an automatic update would lead to an unreproducible desktop container, so we made it explicit.

civier commented 3 years ago

Wow. We had so many iterations, and I cannot even recollect all our discussions, so sorry if I contradicted myself. Let me do my homework on that.

civier commented 3 years ago

Thanks so much @aswinnarayanan for fixing it so quickly. It now works fine when I run neurodesktop-dev. We just need to recommend that people use the -a flag when running "docker start". By default, stdout and stderr are not attached to your shell when you run docker start, so you cannot know when the container is actually ready for access again.
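
As a small sketch of the restart recommended here: `docker start -a` attaches the container's stdout/stderr to the terminal, so the startup log is visible. The wrapper name `restart_attached` and the `DOCKER` variable are assumptions for illustration; the container name in the usage line is the one given by `--name` in the dev command earlier in this thread.

```shell
DOCKER="${DOCKER:-docker}"   # assumption: overridable so the commands can be dry-run

# Stop a container, then start it again attached (-a) so that its
# stdout/stderr are shown and you can see when startup has finished.
restart_attached() {
  "$DOCKER" stop "$1" && "$DOCKER" start -a "$1"
}

# Usage: restart_attached neurodesktop-dev
```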

Just note that when restarting, there is an error from mkdir inside "Mounting CVMFS". I don't think it's a problem, but it would be better to catch it to keep the log clean (it will be easier to detect real errors).

Mounting CVMFS
mkdir: cannot create directory ‘/cvmfs/neurodesk.ardc.edu.au’: File exists
CernVM-FS: running with credentials 109:115
CernVM-FS: loading Fuse module... done
CernVM-FS: mounted cvmfs on /cvmfs/neurodesk.ardc.edu.au
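
One common way to silence a "File exists" error like this on restart is to create the mount point with `mkdir -p`, which succeeds if the directory is already there. The sketch below is an assumption about how the startup step could be written, not the actual neurodesktop startup script; `ensure_mountpoint` is a made-up helper name.

```shell
# Create the mount point only if it is missing; -p makes a repeated
# call succeed instead of failing with "File exists".
ensure_mountpoint() {
  mkdir -p "$1"
}

# Usage: ensure_mountpoint /cvmfs/neurodesk.ardc.edu.au
```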

Last thing on your comment:

> It's on dockerhub. It accessible using the command in the previous message.

I think that you added the command when editing the original post, and unfortunately, GitHub does not resend an email with the edited post. Do you know if there is any option to make GitHub re-email edited posts? Given that GitHub allows replying directly from the email (which I do sometimes), it can become very confusing.

civier commented 3 years ago

> Dear @civier,
>
> Containers are not designed to be persistent - they are designed to be ephemeral. I do not see a feasible way of achieving a use-case where one would keep the container for a long time and modify settings and tools inside the container. If you do see a way of doing this you are welcome to contribute a proof of concept.
>
> Kind regards Steffen

Hi @stebo85

I have some further clarification questions on the subject. We will hopefully discuss it in SNI tomorrow (Thursday) at 10am, so if you get to that by then, would love to have your responses.

Unpacking what you're saying, there are three points that can stand on their own (I think):

1) Containers are not designed for users to modify settings and tools inside
This is quite clear. As Tom said, we will discuss it in SNI to see how it affects our workflows. I just have one question: is this something specific to Docker or Singularity, or do you see it as a general issue with container technology?

2) Containers are not designed to be run for a long time
Some of the users in SNI would like to run Neurodesktop for a long time, even if they don't change anything permanent inside. This is for the mere convenience of not having to open all the graphical applications again, for example, if you were in the middle of a complex analysis with many windows open, or if you changed many temporary settings through the GUI. How long can a container be run for safely? Can you give an approximate number of days?

3) Starting and stopping a container
From what you're saying, I'm not sure if "docker stop" and "docker start" are problematic by themselves. Do you think that containers are less stable or perform worse after restarting them in this way?

Thanks for taking me (us) through that, Oren

stebo85 commented 3 years ago

> Do you know if there is any option to make Github re-email edited posts?

No, I am not aware of a feature like this.

> Containers are not designed for users to modify settings and tools inside
> This is quite clear. As Tom said, we will discuss it in SNI to see how it affects our workflows. I just have one question: is this something specific to Docker, Singularity, or do you see it as a general issue with container technology?

It is a general aspect of container technology. Containers are designed to be stateless.

> Containers are not designed to be run for a long time
> Some of the users in SNI would like to run Neurodesktop for a long time, even if they don't change any permanent things inside. This is for the mere convenience of not having to open all the graphical applications again, for example, if you were in the middle of a complex analysis with many windows open, or if you changed many temporary settings through the GUI. How long can a container be run for safely? Can you give an approx. number of days?

You can run it until something crashes inside the container. How long that takes will depend on what the users do and on how lightweight we keep the desktop container (the lighter it is, the fewer things can go wrong). A restart of the container would then be necessary, and GUI changes will be gone.

> Starting and stopping a container
> From what you're saying, I'm not sure if "docker stop" and "docker start" are problematic by themselves? Do you think that containers are less stable or perform worse after restarting them in this way?

docker stop and start should work now with the desktop container; the reason it wasn't working earlier was a bug. Since we currently do not have the resources to support this workflow, I am sure there are more bugs that would need to be fixed. The question we need to ask is #58 - it is possible to support this, but it will cost time that we cannot spend on other things. Currently there is no contribution to the code from Swinburne or Sydney, and we have to accept that Aswin and I have limited capacity for implementing and supporting features.