Closed EricDeveaud closed 4 years ago
The ete3 package initializes its databases in the ~/.etetoolkit/ directory by default. Just copy all files from that directory during the container build and it should work. This would allow mob_hostrange to work properly and handle taxonomy ID conversions well. More on this issue can be found at https://github.com/etetoolkit/ete/issues/295
Thanks, but IMHO this is not a suitable solution for a container. If I create the container, I will have the ~/.etetoolkit/ directory, but if someone else runs the container, they will experience the same error. It would be more suitable to "embed" taxa.sqlite in the container.
Is there a way to specify a location for taxa.sqlite, via an environment variable or something else?
Would you prefer to see taxa.sqlite inside the databases directory for easier container construction in case there is no Internet access? As far as I understand, mob_init still needs to be run on a machine with Internet access and the files copied to the container during the image build. What would be the ideal solution for you? I am just trying to understand the context.
Yes, the container build is done on a computer with Internet access and mob_init is run at container build time, so there is no problem downloading the files.
On our cluster, compute nodes do not have access to the Internet, so mob_init and other tools requiring network access are prone to failure.
Hosting the taxa DB in the container won't be a solution either, as it needs to be writable and the container is not.
The ideal would be an option on the mob_* tools that allows the user to specify the taxa.sqlite location to use. This way we could (as admins) host and manage the taxa.sqlite installation and updates.
I hope this gives you a better 'aperçu' of the situation.
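To make the request concrete, here is a minimal Python sketch of how a tool could pick the taxa.sqlite location from an environment variable, falling back to the ete3 default under the home directory. The variable name `MOB_SUITE_TAXA_DB` and the function `resolve_taxa_db` are hypothetical and exist in neither mob-suite nor ete3. As far as I know, the resulting path could then be handed to ete3 through the `dbfile` argument of `NCBITaxa`, but that step is not shown here.

```python
import os
from pathlib import Path

# Hypothetical variable name; not an existing mob-suite or ete3 setting.
TAXA_DB_ENV_VAR = "MOB_SUITE_TAXA_DB"


def resolve_taxa_db(environ=None, home=None):
    """Return the taxa.sqlite path to use.

    Prefer the (hypothetical) environment variable; otherwise fall back to
    the default location under the user's home directory.
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    override = environ.get(TAXA_DB_ENV_VAR)
    if override:
        return Path(override)
    return home / ".etetoolkit" / "taxa.sqlite"


if __name__ == "__main__":
    print(resolve_taxa_db())
```

With something like this, an admin could export the variable once in a module file or container environment and every user would read the same shared database.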
As a workaround, I currently use a wrapper that tells users that, before running mob-suite tools, they need to copy
/some/shared/path/taxa.sqlite to ${HOME}/.etetoolkit/taxa.sqlite
if ${HOME}/.etetoolkit/taxa.sqlite does not exist.
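The copy step described in this workaround could also be automated. The following is only a sketch in Python, assuming the shared file is readable by every user; the function name `ensure_taxa_db` is made up for illustration and the shared path is whatever the site admin chooses.

```python
import shutil
from pathlib import Path


def ensure_taxa_db(shared_db, home=None):
    """Copy shared_db to <home>/.etetoolkit/taxa.sqlite unless it already exists.

    Returns the path of the per-user copy.
    """
    home = Path.home() if home is None else Path(home)
    target = home / ".etetoolkit" / "taxa.sqlite"
    if not target.exists():
        # Create ~/.etetoolkit on first use, then copy the shared database.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(shared_db, target)
    return target
```

An existing per-user copy is left untouched, so a user who already has a taxa.sqlite keeps it until they remove it themselves.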
Thank you for the additional details. I think if the mob_suite database directory is mountable for a Singularity container, then the database flag (-d) can also be used to point to that mounted writable location where all database files (mash sketch, FASTA sequences, all ete3 database files including taxa.sqlite) are stored. That would be an ideal, flexible solution: you can initialize the database folder once and make it usable across any container. We can implement this in the next release fairly soon. Thank you for the new use case. Perhaps you would be able to share the new Singularity recipe. Are you using Linux to build the container?
Great idea for the "shared folder", as it will allow for updates instead of rebuilding the container.
Yes, I am building containers on Linux and I will provide the updated recipe.
I have finally moved all ete3 taxonomy database dependencies (taxa.sqlite) to the site-packages/mob_suite/databases/ tool folder so it can be easily mounted and shared outside the container. I like this idea as it allows all databases to be kept in a single location. This makes it possible to run Singularity containers without an Internet connection and to update the databases easily as needed.
This functionality is available from version 2.0.2.
It was discovered that the -d parameter controlling custom specification of the databases directory is not working in the mob_hostrange module, resulting in initialization of the ete3 taxonomy database file taxa.sqlite in the default mob_suite package path (../site-packages/mob_suite/databases/). I would like to keep all database files in a single location and allow the mob_hostrange module to accept a custom location for taxa.sqlite passed by the -d parameter, both from mob_hostrange and from the other modules (mob_typer and mob_recon).
This is especially relevant for read-only containers (e.g. Singularity) that do not allow write operations inside the container, which forces users to mount the default location of the already initialized databases folder (.../lib/python3.6/site-packages/mob_suite/databases).
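For illustration only, the intended behaviour could be sketched in Python as follows: the directory given with -d determines where taxa.sqlite lives, and the directory is checked up front so that a read-only location fails with a clear message. This is not the mob_suite implementation. The long option name, the default value, and the helper names are assumptions made for this sketch.

```python
import argparse
import os
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative only")
    # "-d" comes from the discussion above; the long option name and the
    # default value are assumptions made for this sketch.
    parser.add_argument("-d", "--database_directory", default="databases")
    return parser


def taxa_db_path(database_directory):
    """Derive the taxa.sqlite location from the database directory."""
    return Path(database_directory) / "taxa.sqlite"


def check_database_directory(database_directory):
    """Raise a clear error if the directory is missing or not writable."""
    directory = Path(database_directory)
    if not directory.is_dir():
        raise FileNotFoundError(f"database directory not found: {directory}")
    if not os.access(directory, os.W_OK):
        raise PermissionError(f"database directory is not writable: {directory}")
    return directory
```

Deriving the path in one helper means every module that accepts -d would resolve the same taxa.sqlite file, which is the single-location behaviour described above.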
Hello,
On our cluster, compute nodes do not have network access to the outside. After generating a Singularity container with https://github.com/phac-nml/mob-suite/blob/master/mob_suite/singularity/recipe.singularity
we have the following problem.
I tried to dig into that problem, and I noticed that mob_init does not install taxdump.tar.gz. Here is the content of the database directory:
When building the image, we see that taxdump.tar.gz is downloaded in /, see:
What is the correct location for taxdump.tar.gz?