phac-nml / mob-suite

MOB-suite: Software tools for clustering, reconstruction and typing of plasmids from draft assemblies
Apache License 2.0

Docker Image Size #102

Closed pbelmann closed 2 years ago

pbelmann commented 2 years ago

Hi,

thank you for this tool! Your Docker image is quite large at 4.71 GB. I guess the reason is that the databases are part of the image.

Could you provide a second Docker image that does not include the databases?

kbessonov1984 commented 2 years ago

Hi, the image is indeed large because the taxonomy database is included in it. Another option is to mount the databases folder into the container, which avoids the long initialization each time the container starts. We will create a lite image. What is your use case for a smaller image?

pbelmann commented 2 years ago

Well, in my case the system I'm using to run my pipeline has just 20 GB of disk space on the root partition. If I run a pipeline with multiple Docker images, there is not enough space left. I agree that mounting the databases (which in my case are on a different partition) into the container is the way to go. Thanks! I'm looking forward to your lightweight image.

kbessonov1984 commented 2 years ago

Hello, a lite image of MOB-suite v3.0.3 without initialized databases is now available on Docker Hub under the tag kbessonov/mob_suite:3.0.3_lite. You can pull it (docker pull kbessonov/mob_suite:3.0.3_lite) and run it with docker run --rm -v ${path2databases_dir}:/mnt/databases kbessonov/mob_suite:3.0.3_lite mob_recon -i assembly.fasta -o outdir_name --database_directory /mnt/databases. Note: make sure you mount the directory containing the databases so that the mob_init routine is not triggered, which would install the databases inside the container.
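
For anyone scripting this, here is a minimal sketch of how the command could be assembled. It only prints the docker run line (a dry run) so it can be inspected before anything is executed. The helper name build_mob_recon_cmd, the /data mount point for the working directory, and the example paths are assumptions for illustration and are not part of MOB-suite. Only the image tag, the /mnt/databases mount, and the mob_recon options come from the comment above.

```shell
#!/usr/bin/env bash
# Sketch: assemble a `docker run` command for the lite image with an
# externally stored database directory mounted into the container.

IMAGE="kbessonov/mob_suite:3.0.3_lite"

# Hypothetical helper: prints the command for the given host paths.
#   $1 = host directory holding the already initialized databases
#   $2 = host working directory with the assembly (mounted at /data, an assumption)
#   $3 = assembly file name inside the working directory
#   $4 = output directory name (created under the working directory)
build_mob_recon_cmd() {
  local db_dir="$1" work_dir="$2" assembly="$3" outdir="$4"
  printf 'docker run --rm -v %s:/mnt/databases -v %s:/data %s mob_recon -i /data/%s -o /data/%s --database_directory /mnt/databases\n' \
    "$db_dir" "$work_dir" "$IMAGE" "$assembly" "$outdir"
}

# Dry run: print the command instead of executing it.
build_mob_recon_cmd "/path/to/databases" "$PWD" "assembly.fasta" "outdir_name"
```

Mounting the working directory as well is a design choice in this sketch: without a second volume the container would not see assembly.fasta on the host, and the results would be lost when the container is removed by --rm.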

pbelmann commented 2 years ago

Awesome! Thanks!