boutroslab / cld_docker

Docker for CRISPR Library Designer | https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0915-2
GNU General Public License v2.0

Docker seems to hang #9

Closed · tonyreina closed this issue 2 years ago

tonyreina commented 2 years ago

I've been trying to run the Docker container to build the database. I think I am calling it correctly, but it seems to hang at the end of the output below. I'm on an instance with 16 GB of RAM, a 4-core CPU, and 300 GB of disk space. Even after 24 hours I can't get past this step, and there don't seem to be any files in the mounted directory. Any thoughts on what I am doing wrong?

AWSReservedSSO_DataAnalysis:~/environment/cld_docker/etc/cld (master) $ date; docker run -v $PWD:/data boutroslab/cld_docker cld --task=make_database --output-dir=/data --organism homo_sapiens ; date
Mon May 16 02:32:26 UTC 2022
Possible precedence issue with control flow operator at /usr/local/share/perl/5.26.1/Bio/DB/IndexedBase.pm line 845.
homo_sapiens
rsync could connect
and your files are downloaded receiving incremental file list
./
CHECKSUMS
             61 100%   59.57kB/s    0:00:00 (xfr#1, to-chk=2/4)
Homo_sapiens.GRCh38.77.gtf.gz
     40,965,683 100%    1.48MB/s    0:00:26 (xfr#2, to-chk=1/4)
README
          8,371 100%  106.17kB/s    0:00:00 (xfr#3, to-chk=0/4)

sent 88 bytes  received 40,984,370 bytes  1,490,343.93 bytes/sec
total size is 40,974,115  speedup is 1.00
receiving incremental file list
./
CHECKSUMS
            118 100%  115.23kB/s    0:00:00 (xfr#1, to-chk=2/4)
Homo_sapiens.GRCh38.cdna.all.fa.gz
     59,786,104 100%    1.52MB/s    0:00:37 (xfr#2, to-chk=1/4)
README
          3,153 100%    5.40kB/s    0:00:00 (xfr#3, to-chk=0/4)

sent 182 bytes  received 59,804,223 bytes  1,553,361.17 bytes/sec
total size is 59,789,375  speedup is 1.00
receiving incremental file list
./
CHECKSUMS
         64,430 100%   61.45MB/s    0:00:00 (xfr#1, to-chk=3/5)
Homo_sapiens.GRCh38.dna.nonchromosomal.fa.gz
      3,008,709 100%    1.49MB/s    0:00:01 (xfr#2, to-chk=2/5)
Homo_sapiens.GRCh38.dna.toplevel.fa.gz
  1,011,463,198 100%    1.48MB/s    0:10:50 (xfr#3, to-chk=1/5)
README
          4,943 100%    7.19kB/s    0:00:00 (xfr#4, to-chk=0/5)

sent 225 bytes  received 1,014,789,285 bytes  1,552,853.11 bytes/sec
total size is 1,014,541,280  speedup is 1.00
All files were dowloaded
All files were unzipped
creating Fasta Database....done
All files were converted to gff

fheigwer commented 2 years ago

I haven't tried to reproduce the issue, but from the messages it seems that things have worked. Is the container still using CPU, and are there any new files to be found? Which operating system are you using on the host?
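
Both questions can be checked from the host. The following is only a sketch: it assumes the Docker CLI is available and that `find` supports `-mmin` (GNU and BSD versions do). The function names are hypothetical helpers, not part of cld_docker.

```shell
# Hypothetical helpers (not part of cld_docker) for checking a container
# that looks hung.

# List running containers started from the image used in this thread.
cld_containers() {
    docker ps --filter "ancestor=boutroslab/cld_docker"
}

# Print a one-off CPU and memory snapshot for the container ID or name in $1.
cld_usage() {
    docker stats --no-stream \
        --format "{{.Name}}: CPU {{.CPUPerc}}, MEM {{.MemUsage}}" "$1"
}

# List files under directory $1 that were modified within the last $2 minutes.
recent_files() {
    find "$1" -type f -mmin "-$2"
}

# Example, run on the host from the directory that is mounted at /data:
#   cld_containers
#   cld_usage <container-id>
#   recent_files "$PWD" 60
```

Steady CPU usage with no recently modified files would be consistent with a long compute step rather than a deadlock, although that reading is a heuristic and not a guarantee.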

tonyreina commented 2 years ago

Yes, the container is still running (I left it on overnight just in case). I was assuming that there would be some output in the local directory (in this case $PWD, which is mapped to /data within the container). However, there's no file output in the current directory.

I'm using an AWS EC2 instance running Amazon Linux (a CentOS variant).

tonyreina commented 2 years ago

It was indeed working, but some steps took a long time and appeared to be hung (they were not). I created a PR to add progress bars, which makes it a little more user-friendly.
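
The PR itself is not reproduced here. As a separate, generic workaround for long silent steps, a wrapper can print a periodic heartbeat while a command runs. This is a sketch of that different technique, not the progress bars from the PR, and `with_heartbeat` is a hypothetical name.

```shell
# Hypothetical wrapper: run a command in the background and report elapsed
# time on stderr every $1 seconds until it exits. Not part of cld_docker.
with_heartbeat() {
    hb_interval=$1
    shift
    "$@" &
    hb_pid=$!
    hb_start=$(date +%s)
    # kill -0 sends no signal; it only tests whether the process still exists.
    while kill -0 "$hb_pid" 2>/dev/null; do
        sleep "$hb_interval"
        if kill -0 "$hb_pid" 2>/dev/null; then
            hb_now=$(date +%s)
            echo "still running: $((hb_now - hb_start))s elapsed" >&2
        fi
    done
    # Return the wrapped command's own exit status.
    wait "$hb_pid"
}

# Example with the command from this thread, reporting once a minute:
#   with_heartbeat 60 docker run -v "$PWD":/data boutroslab/cld_docker \
#       cld --task=make_database --output-dir=/data --organism homo_sapiens
```

The heartbeat only shows that the process is alive and how long it has run; unlike a progress bar, it says nothing about how much work remains.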