Closed Neato-Nick closed 2 years ago
Hi,
For Docker (the images are stored on your boot disk, wherever /var is located, I think):
The images and their sizes can be inspected via docker images,
and you can remove them individually via docker rmi <imagename>.
For a complete cleanup (removes all Docker images, containers, etc.), run:
docker stop $(docker ps -a -q) # stop all running containers
docker rm $(docker ps -a -q) # remove all containers
docker rmi $(docker images -f "dangling=true" -q) # remove all dangling (untagged) images
docker rmi -f $(docker images -a -q) # remove all images
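One caveat with the commands above: when there are no containers or images left, the $(...) substitutions expand to nothing, and docker stop / docker rm / docker rmi then fail because they require at least one argument. A minimal sketch of a guarded version follows. The function name cleanup_docker is made up for illustration and is not part of Docker or WtP, and it is just as destructive as the original commands.

```shell
#!/usr/bin/env bash
# cleanup_docker is a hypothetical helper name, not part of Docker or WtP.
# It runs the same commands as above but skips a step when there is nothing to act on.
cleanup_docker() {
  local containers images
  containers=$(docker ps -a -q)      # IDs of all containers, running or stopped
  if [ -n "$containers" ]; then
    # Left unquoted on purpose so that each ID becomes a separate argument.
    docker stop $containers
    docker rm $containers
  fi
  images=$(docker images -a -q)      # IDs of all images, including intermediate ones
  if [ -n "$images" ]; then
    docker rmi -f $images
  fi
}
# Usage: cleanup_docker
```

The separate "dangling" step is left out because removing all images covers it. Docker also ships docker system prune -a, which removes stopped containers and all unused images in one go, but it does not stop running containers first.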
Regarding disk space:
Every file that WtP produces is stored in one of these directories:
work/ # temporary files; can be removed after a run
databases/ # where the databases are stored; alternatively, provide a path to them
results/ # the output directory set via --output
These locations can be changed with --output, --databases, and --workdir.
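To see how much space each of these locations actually takes, something like the following can be run from the directory where WtP was started. This is only a sketch: report_dir_sizes is a hypothetical helper name, and the directory names in the usage line assume the defaults listed above.

```shell
#!/usr/bin/env bash
# report_dir_sizes is a hypothetical helper, not part of WtP.
# Prints a human-readable size for each directory given, or notes that it is missing.
report_dir_sizes() {
  local dir
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      du -sh "$dir"
    else
      echo "missing: $dir"
    fi
  done
}
# Usage (default directory names from the list above):
#   report_dir_sizes work databases results
```

If custom paths were passed via --workdir, --databases, or --output, pass those paths to the function instead.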
Hope this helps. Best
Great! After the databases are downloaded, can we download the --workdir from the setup command? For me that ended up being 31G.
docker images is really helpful. Can the sum of container space required be added to the wiki?
Totals: 31G workdir, 41G databases dir, plus the sum of these containers. I think they were all added during setup, with the possible exception of dfam/tetools?
$ docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.ID}}\t{{.CreatedAt}}\t{{.Size}}"
REPOSITORY TAG IMAGE ID CREATED AT SIZE
papanikos/marvel 0.2-29b3c73 451b8b7f09d9 2021-04-19 04:47:48 -0700 PDT 6.42GB
papanikos/virsorter-2 2.2.1--fa935f8 52548ff35f49 2021-04-16 01:21:00 -0700 PDT 1.19GB
multifractal/seeker 0.1 efe57801fbb8 2020-10-26 06:13:16 -0700 PDT 1.66GB
multifractal/phigaro 0.5.2 1c86698f8bf2 2020-09-14 04:26:27 -0700 PDT 2.6GB
dfam/tetools 1.2 9aa97b75d2c3 2020-09-09 09:57:34 -0700 PDT 3.63GB
multifractal/virnet-hack 0.1 f67323ac9dcc 2020-07-31 01:04:15 -0700 PDT 1.62GB
nanozoo/emboss 6.6.0--418c521 66fa21650fb8 2020-07-30 05:10:10 -0700 PDT 1.07GB
multifractal/ppr-meta 0.3.1 a7c3728f5bb4 2020-07-30 02:19:38 -0700 PDT 5.3GB
multifractal/virfinder 0.2 383a9764ebda 2020-07-30 00:53:38 -0700 PDT 3.91GB
multifractal/vibrant 0.5 50a55ac7e616 2020-07-30 00:42:23 -0700 PDT 1.42GB
nanozoo/sourmash 3.4.1--16a8db7 5fa1c8f40842 2020-07-25 23:55:37 -0700 PDT 788MB
nanozoo/hmmer 3.3--3db9dd1 b94ab6d4b970 2020-07-17 01:22:38 -0700 PDT 484MB
nanozoo/checkv 0.6.0--e97f45e 992e7f903edf 2020-06-02 03:19:58 -0700 PDT 1.72GB
nanozoo/altair 4.1.0--086b80e 2e4909a308d8 2020-05-12 06:41:31 -0700 PDT 1.02GB
nanozoo/samtools 1.9--76b9270 84525e422138 2020-04-03 06:57:16 -0700 PDT 487MB
multifractal/virsorter 0.1.2 807f233d65a0 2020-04-03 02:23:48 -0700 PDT 2.84GB
nanozoo/r_fungi 0.1--097b1bb 8059f16d6755 2020-02-16 06:13:02 -0800 PST 3.11GB
nanozoo/template 3.8--ccd0653 4c5ca72d30b0 2020-01-25 10:29:49 -0800 PST 681MB
hello-world latest bf756fb1ae65 2020-01-02 17:21:37 -0800 PST 13.3kB
nanozoo/basics 1.0--962b907 e6db71c4b54a 2019-12-13 02:53:58 -0800 PST 79.1MB
nanozoo/upsetr 1.4.0--0ea25b3 903ee61f2d93 2019-11-16 15:12:47 -0800 PST 3.21GB
nanozoo/r_ggplot2 0.1--6405f6d a978b0bd253e 2019-11-14 06:23:01 -0800 PST 3.26GB
multifractal/metaphinder 0.1 f6878e657670 2019-09-05 03:47:54 -0700 PDT 767MB
multifractal/deepvirfinder 0.1 14e271fb6b8e 2019-09-05 02:53:05 -0700 PDT 2.37GB
nanozoo/seqkit 0.10.1--360dd6d 19de1e6b6911 2019-08-06 07:23:11 -0700 PDT 561MB
nanozoo/prodigal 2.6.3--2769024 9a394ae3c748 2019-08-06 00:27:24 -0700 PDT 531MB
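For anyone who wants the total of the SIZE column, here is a rough sketch that adds up the values. It assumes the sizes carry the decimal suffixes seen in the listing (kB, MB, GB); sum_image_sizes is a made-up helper name, not a Docker command.

```shell
#!/usr/bin/env bash
# sum_image_sizes is a hypothetical helper, not part of Docker.
# Reads one size per line (e.g. 6.42GB, 788MB, 13.3kB) and prints the total in GB.
# Assumption: only the kB, MB and GB suffixes occur; other lines are ignored.
sum_image_sizes() {
  awk '
    /GB$/ { sub(/GB$/, ""); total_mb += $1 * 1000; next }
    /MB$/ { sub(/MB$/, ""); total_mb += $1;        next }
    /kB$/ { sub(/kB$/, ""); total_mb += $1 / 1000; next }
    END   { printf "%.2f GB\n", total_mb / 1000 }
  '
}
# Usage:
#   docker images --format '{{.Size}}' | sum_image_sizes
```

Keep in mind that the SIZE column counts layers shared between images once per image, so this sum is an upper bound rather than the space actually occupied on disk.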
docker system df shows the total disk space used by Docker.
--workdir is created every time you run WtP; it contains logs and intermediate files. It can be tossed away after each run, and it can't be "pre-downloaded" or "prepared".
I have a small SSD and a large HDD that I use for storage. I keep all my databases on the large HDD, so I'm attempting to download there. But while downloading databases, disk space in $HOME on my SSD is being eaten up at a much faster rate than in the location I'm downloading the databases to on the HDD (/data/databases). I have added all the parameters I saw on the wiki to specify download locations, even --cachedir, which I think is only applicable to Singularity.
I had to restart/resume the setup a few times after clearing other files, but the setup keeps consuming more and more space. During my most recent attempt, it consumed 7 GB in $HOME while adding just 1 GB to /data. Interestingly, when the setup process finished successfully, 7 GB were added to /data really quickly with no new storage used in $HOME.
Maybe something is being written invisibly to my $HOME and copied to the workdir without removing the source? I'm new to Docker, so I wouldn't be surprised if this is a config issue, but I checked /var and the Docker files look small. I double-checked that there weren't files being written to /tmp/nextflow-*$USER, and it looks like --workdir is correctly preventing that. What else could be using that disk space?