BodenmillerGroup / steinbock

A toolkit for processing multiplexed tissue images
https://bodenmillergroup.github.io/steinbock
MIT License

Can't run with singularity #155

Closed paulrbuckley-kcl closed 1 year ago

paulrbuckley-kcl commented 1 year ago

Hi,

I'm very interested in using steinbock. In an HPC environment I cannot use Docker, so I am trying to use Singularity instead. Do you have any guidance on how (or whether) steinbock can be used with Singularity? I have converted the image to a .sif file, but if I run steinbock in an interactive shell to preprocess, I get "command cannot be found" errors. If I run 'singularity run *.sif', I receive the output below.

I am wondering if you have had any success with steinbock on Singularity.

[Image: Screenshot 2022-12-01 at 18 52 21]

mjseignon commented 1 year ago

I am successfully running mine with Singularity. Have you tried switching Singularity versions? In an interactive session I run it like this:

srun --pty --time=08:00:00 bash -i
module load singularity/3.8.5
steinbock="singularity run --env STEINBOCK_MASK_DTYPE=uint32 docker://ghcr.io/bodenmillergroup/steinbock:0.14.2"

mjseignon commented 1 year ago

You might also need to bind your working directory with the -B option of singularity run.
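A minimal sketch of that suggestion, assuming the directory to bind is the current working directory. The names WORKDIR and SINGULARITY_BIN are illustrative and not part of steinbock or Singularity; SINGULARITY_BIN only exists so the command can be swapped out (for example with echo) to inspect what would be run.

```shell
#!/usr/bin/env bash
# Sketch: wrap `singularity run` in a shell function that binds a host directory.
# WORKDIR and SINGULARITY_BIN are hypothetical helper variables for this example.
WORKDIR="${WORKDIR:-$PWD}"
SINGULARITY_BIN="${SINGULARITY_BIN:-singularity}"
IMAGE="docker://ghcr.io/bodenmillergroup/steinbock:0.14.2"

steinbock() {
    # -B with a single path makes that host path visible at the same
    # location inside the container.
    "$SINGULARITY_BIN" run \
        -B "$WORKDIR" \
        --env STEINBOCK_MASK_DTYPE=uint32 \
        "$IMAGE" "$@"
}
```

With this in place, `steinbock preprocess imc panel` forwards its arguments to the container in the same way as the `$steinbock` variable used elsewhere in this thread.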

jwindhager commented 1 year ago

Hi @paulrbuckley-kcl, thanks for reaching out! Also, thanks @mjseignon for jumping in!

Indeed, the steinbock image uses fixuid in the entrypoint to change the container's user/group and file permissions to the UID/GID that the Docker container was started with at runtime. Singularity does not have a user/group abstraction, so this will fail with the warning you reported.

As far as I'm aware, you should be able to just ignore that warning, and everything should be working fine. If that isn't the case, you could try to override the entrypoint using Singularity's %runscript. I haven't tested this myself, but something like the following Singularity file should work:

Bootstrap: docker
From: ghcr.io/bodenmillergroup/steinbock:0.15.0

%runscript
    exec python -m steinbock "$@"

Sorry for not having an off-the-shelf solution ready for you at this point. If you decide to give the above option a try, I'd appreciate it if you could report back whether this worked for you!
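For anyone wanting to try the definition file above, here is a rough sketch of how it might be written to disk and built. Like the definition file itself, it is untested against steinbock. The file names steinbock.def and steinbock.sif and the helper build_steinbock_sif are placeholders chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch: save the definition file from this comment, then build an image from it.
# DEF_FILE, SIF_FILE and build_steinbock_sif are hypothetical names.
DEF_FILE="${DEF_FILE:-steinbock.def}"
SIF_FILE="${SIF_FILE:-steinbock.sif}"

# The quoted 'EOF' keeps "$@" literal so it ends up unchanged in the file.
cat > "$DEF_FILE" <<'EOF'
Bootstrap: docker
From: ghcr.io/bodenmillergroup/steinbock:0.15.0

%runscript
    exec python -m steinbock "$@"
EOF

build_steinbock_sif() {
    # Building from a definition file usually needs root or extra build
    # options; anything passed to this function (e.g. --fakeroot) is
    # forwarded to `singularity build`.
    singularity build "$@" "$SIF_FILE" "$DEF_FILE"
}
```

After a successful `build_steinbock_sif`, running `singularity run steinbock.sif --help` should invoke the overridden runscript instead of the Docker entrypoint.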

paulrbuckley-kcl commented 1 year ago

Hi @jwindhager and @mjseignon. Thanks both for your rapid responses.

First, @mjseignon, thanks a lot! This has fixed the particular issue, in that 'preprocess' has started running. I was being dim, to be honest: I had pulled the container and was trying to run the generated .sif directly. For whatever reason that wasn't working. Using your 'singularity run' command has got me up and running.

@jwindhager thanks very much, you are right, I am able to ignore this warning. For some reason, when I pulled the .sif, the container wouldn't run properly. Not an issue now.

I am now struggling to use data in the container. I have bound a folder with the structure data/raw/, containing .mcd and .txt files. I have also tried data/.mcd, as well as data/FOLDERX/.mcd. Each time I am given a warning saying no panel/.mcd/.txt found.

My command to run and bind is below. I would really appreciate some insight, and apologise if I am being dim again! I am running this in Slurm with batch scripts.

steinbock="singularity run --bind /PATHX/test_data/:/data --env STEINBOCK_MASK_DTYPE=uint32 docker://ghcr.io/bodenmillergroup/steinbock:0.14.2"

then

$steinbock preprocess imc panel

paulrbuckley-kcl commented 1 year ago

To clarify, the folder 'test_data' contains the structure test_data/raw/.MCD, which I'm expecting to map to /data/raw/.MCD in the container.

jwindhager commented 1 year ago

Great, @paulrbuckley-kcl, thanks for reporting back!

Regarding the problem with the data: steinbock excludes files that start with a dot (.) to ensure compatibility with macOS systems ("hidden files"). The glob pattern responsible for this may also apply that filter to subdirectory names and may therefore exclude your .MCD folder entirely (I would need to verify this, though, and will open a separate issue). Could you try renaming .MCD to e.g. MCD and see whether this works?

paulrbuckley-kcl commented 1 year ago

Sorry, that was me being unclear. I was referring to the .mcd files. The '.txt' or '.mcd' files are stored within test_data/raw. I am binding test_data to /data in the container.

I've also tried removing whitespace from the filenames, to no avail.

mjseignon commented 1 year ago

I think you have to point to the directory where the MCDs are, not to the MCD files themselves. So try passing the path to that MCD directory with the --mcd option:

$steinbock preprocess imc panel --mcd path/to/mcd_dir

Also, with bind paths, you just need to bind your top directory. For me the path defaults to HOME without binding, so I would bind /fastscratch if that's where I wanted steinbock to run.
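Putting the two points together, a sketch along these lines might help when the "no .mcd found" warning appears. It assumes the layout described earlier in the thread (a host folder with a raw/ subfolder, bound to /data). The helpers count_mcd and run_panel and the variable SINGULARITY_BIN are made up for this example and are not steinbock features.

```shell
#!/usr/bin/env bash
# Sketch: check on the host that raw/ really holds .mcd files, then pass the
# in-container directory explicitly via --mcd.
# count_mcd, run_panel and SINGULARITY_BIN are hypothetical helper names.
SINGULARITY_BIN="${SINGULARITY_BIN:-singularity}"
IMAGE="docker://ghcr.io/bodenmillergroup/steinbock:0.14.2"

# Count .mcd files (any letter case) directly inside a host directory.
count_mcd() {
    find "$1" -maxdepth 1 -type f -iname '*.mcd' 2>/dev/null | wc -l | tr -d ' '
}

# Bind <host_dir> to /data and run the panel step on /data/raw.
run_panel() {
    local host_dir="$1"
    if [ "$(count_mcd "$host_dir/raw")" -eq 0 ]; then
        echo "no .mcd files found in $host_dir/raw" >&2
        return 1
    fi
    "$SINGULARITY_BIN" run --bind "$host_dir:/data" "$IMAGE" \
        preprocess imc panel --mcd /data/raw
}
```

For the folder from this thread the call would be `run_panel /PATHX/test_data`. The host-side check only rules out an empty or misnamed folder; it does not reproduce steinbock's own file filtering.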

paulrbuckley-kcl commented 1 year ago

@mjseignon amazing, thank you. The --mcd option has got it working now. Thanks also for the advice on binding.

I was struggling to get --help to work, but I've just figured that out too. Sorry for bothering you both with simple questions!

jwindhager commented 1 year ago

No worries, glad it works now and thanks a bunch @mjseignon for jumping in better than I could!

Also, please don't hesitate to reach out again if you have questions @paulrbuckley-kcl!

mjseignon commented 1 year ago

First contribution on GitHub. I'm glad I could help.