nadeemlab / SPT

Spatial profiling toolbox for spatial characterization of tumor immune microenvironment in multiplex images
https://oncopathtk.org
Other
21 stars 2 forks source link

Dockerize `spt cggnn train` #276

Closed CarlinLiao closed 10 months ago

CarlinLiao commented 11 months ago

Closes #275. This PR will remain a draft until #273 is accepted and merged, a new SPT version is released using it, and the functionality of this branch and the Docker image it uses is testable.

CarlinLiao commented 11 months ago

I am hard stuck on trying to run Docker (the spt-cg-gnn plugin) inside Nextflow inside Docker (the test suite harness).

When I have Nextflow run docker run, it finds the correct docker image (nadeemlab/spt-cg-gnn:latest) but the volume I passed from Nextflow is nowhere to be found, even if I copy the files in the Nextflow process script before running docker.

When I have use the Nextflow container field inside Docker, the files Nextflow prepares from inputs are easily found, but it's in the wrong Docker container. \$(docker inspect --format='{{.Config.Image}}' \$(cat /etc/hostname)) returns nadeemlab-development/spt-development:latest, not nadeemlab/spt-cg-gnn:latest.

Is it actually possible to run docker-in-nextflow-in-docker properly? Do we have to do this? The standard use case would stop at Nextflow running Docker, without the need to wrap another Docker layer around it all.

jimmymathews commented 11 months ago

I assume you don't mean literally using docker run inside a Nextflow process. That is not likely to work. It is not possible as far as I know to run docker inside a docker-contained Nextflow process, or any other docker-contained process.

Nextflow is itself an orchestration software. It is capable of doing allocation in a variety of environments, and it is responsible for summoning containers when asked. Docker compose is a lower-tier orchestration software, meant for prototypes in a simpler environment.

Our plan for a plugin system was to inform Nextflow of the correct (Singularity) container image to use for a specific process. The tests of such a system don't fit the pattern of our current integration tests, since these existing tests are based on a docker composition. We will have to write new tests that are shell scripts invoking Nextflow. Some Nextflow docs about this can be found here.

CarlinLiao commented 11 months ago

It is very much possible to run Docker-in-Docker. Docker in Nextflow in Docker may be a different story.

I've tried

None of the above work, but they don't break in anexpected way. Docker run does run, it just errors when trying to find the file I'm trying to provide it because I can't find a way to pass data from the Nextflow context into the Docker context.

I'll push my sketch implementations for review tomorrow.

CarlinLiao commented 11 months ago

To avoid a Docker-in-Docker situation, we decided that we would adjust the cggnn workflow test to run outside of the Dockerized test harness. It'll still query the (Dockerized) small test database, but its test script will now run outside of Docker. This will require changing the Makefile that runs the test suite.

Once this test is implemented and all tests pass, this PR will be ready for merge.

CarlinLiao commented 11 months ago

FYI from the Nextflow docs

When combined with the container directive, the beforeScript will be executed outside the specified container. In other words, the beforeScript is always executed in the host environment.

jimmymathews commented 11 months ago

Nice, that it is the first part that is needed from that line. Next what is needed is to ensure that the result of that script is actually exported to the level of the process definition (including container directive). Since this note indicates support in precisely the condition we face ourselves with, I'm willing to bet that it does work as expected. Most likely a configurable container is the reason for this behavior.

CarlinLiao commented 11 months ago

Something that I'm not a fan of is this.

I can confirm that (a) it is not possible to have only some of the processes running within Docker containers and that (b) this has been discussed among the developers and there are no plans to have this "feature".

So for cggnn.nf I have to specify a Docker container to run every single process in, I can't only have the train process run in a container and the rest on native.

I could use the development Dockerfile as a default, but then that isolated Docker container doesn't have access to the database server listening in my host environment. Why would they implement it like this???

I guess a workaround could be running Docker inside the Nextflow script and not wrapping the entire container around it?

CarlinLiao commented 11 months ago

Figured that issue out. Since each process needs a Docker container, I can just pass the development image to all non-train files, and provide the network that's already built for workflow testing to all of them.

This fixes everything, all that's left is to debug the cg-gnn image.

CarlinLiao commented 11 months ago

All tests pass. But the new non-Dockerized test's output is a little wonky.

workflow ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ Setup env.     (19s)     
┃ proximity pipeline ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ Passed.        (35s)     
┃ umap plot creation ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ Passed.        (37s)     
┃ cggnn ┈┈┈cat: status_code: No such file or directory
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ /home/carlin/projects/SPT/build/build_scripts/verbose_command_wrapper.sh: line 88: [: -gt: unary operator expected
Passed.        (46s)     
┗━ ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ Down.          (1s) 
CarlinLiao commented 11 months ago

Thank you for the notes on the build and test process. I finished your changes and now all tests pass without error in the tests or the scaffolding that runs them. I believe this is now fully ready to merge.

CarlinLiao commented 10 months ago

I accidentally merged this pull request just now, and there doesn't appear to be an easy way to unmerge it without reverting history. I'll leave it be for now pending an advisory on what to do.

jimmymathews commented 10 months ago

Manually squash merged into main.