satijalab / azimuth-references


running Seurat+Azimuth workflow on off-line cluster (due to European Life Sciences GDPR laws) #8

Open DaniilSarkisyan opened 3 years ago

DaniilSarkisyan commented 3 years ago

Dear Andrew,

Thank you very much for providing these "bleeding edge" examples of using Azimuth to annotate real-life datasets (like in https://github.com/satijalab/azimuth-meta-analysis).

I need to annotate human PBMC cells from COVID patients. I was hoping to replicate your lung annotation and then modify it to work on PBMCs. (By the way, can you suggest a good example of how to correctly annotate these "NEW COVID-specific" cell types?)

Due to European Life Science GDPR laws, I must do this in a very restrictive off-line HPC environment, where the only way to install anything is to upload a Singularity image. I was able to build one from your docker://satijalab/seurat:latest, in which I additionally install Azimuth as an R package, add snakemake (to run your workflows), and add rstudio-server (to have a decent IDE).

Unfortunately, there are many versions of your docker://satijalab/azimuth-references image, and your example (https://github.com/satijalab/azimuth-meta-analysis) uses "azimuth-references:vitessce" rather than "azimuth-references:latest". Judging by the Docker definition file, docker://satijalab/azimuth-references seems to be just docker://satijalab/seurat plus a few more layers. Is that correct? Is it safe to substitute "satijalab/azimuth-references" where the workflow asks for the "satijalab/seurat" image? If not, is there a single "definitely latest" image with Seurat + Azimuth? One image would be so much easier to containerize to Singularity and use on an off-line cluster...

Thank you in advance, Daniil

andrewwbutler commented 3 years ago

Hi Daniil,

For the "definitely latest" image with Seurat + Azimuth, I would build off of the highest-numbered satijalab/azimuth image. The satijalab/azimuth-references images mostly just have a few extra layers on top of the azimuth images to deal with reference-specific data wrangling.
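
As a rough illustration of what "highest numbered" could mean in practice, here is a minimal Python sketch that picks the largest dotted numeric tag from a list and skips named tags such as "latest" or "vitessce". The tag list is hypothetical and only there for illustration; it is not the real set of tags published for satijalab/azimuth, and the helper name `highest_numbered_tag` is made up for this example.

```python
import re


def highest_numbered_tag(tags):
    """Return the tag with the largest dotted numeric version, or None."""
    numeric = []
    for tag in tags:
        # Accept tags such as "0.4.0" or "v0.4.0"; skip names like "latest".
        match = re.fullmatch(r"v?(\d+(?:\.\d+)*)", tag)
        if match:
            # Compare versions as tuples of ints so that 0.10.1 > 0.4.0.
            version = tuple(int(part) for part in match.group(1).split("."))
            numeric.append((version, tag))
    if not numeric:
        return None
    return max(numeric, key=lambda pair: pair[0])[1]


# Hypothetical tag list, for illustration only (not the real registry contents).
example_tags = ["latest", "vitessce", "0.3.2", "0.4.0", "0.10.1"]
best = highest_numbered_tag(example_tags)
print(f"docker://satijalab/azimuth:{best}")
```

The resulting URI is the kind of argument one would pass to `singularity build <name>.sif docker://...` on a machine with internet access, before uploading the resulting image to the off-line cluster. Comparing versions as integer tuples rather than strings avoids ranking "0.4.0" above "0.10.1".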