Closed by matthpich 3 years ago
Hi @matthpich, are you running it using Docker? Do you expect to download all files within the container filesystem, or did you configure bind mounts from the host?
Dear @tbugfinder, the files are indeed downloaded within the container filesystem, which may be the source of the problem. Could you advise on how to set up Docker so that EBS autoscaling kicks in?
Are you using docker on an EC2 instance or using AWS Batch?
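You'd have to use a bind mount to map e.g. EBS /tmp to Docker /ebstmp, using e.g. -v /tmp:/ebstmp (the host path must be absolute; a bare name such as tmp would create a named volume instead).
This can be set up in the Nextflow Docker options, e.g. docker.runOptions.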
or
docker {
    enabled = true
    temp = 'auto'
}
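To illustrate the bind-mount suggestion above: Docker treats a -v source that starts with / as a host path (a bind mount), while a bare name such as tmp is taken as a named volume. The following is only a minimal sketch; the helper name bind_mount_flag is hypothetical, and the /tmp and /ebstmp paths are simply the ones mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: build a "-v HOST:CONTAINER" flag for docker run,
# refusing host paths that are not absolute. Docker would interpret a
# non-absolute source as a named volume rather than a bind mount.
bind_mount_flag() {
    host_path=$1
    container_path=$2
    case "$host_path" in
        /*) printf '%s\n' "-v $host_path:$container_path" ;;
        *)  echo "host path must be absolute: $host_path" >&2
            return 1 ;;
    esac
}

# Example use (IMAGE is a placeholder, not a real image name):
#   docker run --rm $(bind_mount_flag /tmp /ebstmp) IMAGE df -h /ebstmp
```

Running df inside the container against the mapped path is one way to confirm that it is backed by the host volume rather than by the container's own root filesystem.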
@tbugfinder, thanks for the prompt reply. I am using AWS Batch.
Are you using EBS auto scale with the Forge setup?
Yes, I am. Is there a specific folder in the container that is auto-expandable?
I missed that...
I get the same error with EBS auto-scale on.
main: line 263: 58 Killed /home/ec2-user/miniconda/bin/aws --region eu-west-3 s3 cp --only-show-errors "$source" "$target"
What could I be missing?
It should work. Could you provide the stdout/stderr of the last execution, including the directory listing and df output?
Hi @pditommaso, thanks for your message. I now run:
whoami
pwd
echo "============"
ls -lha
echo "============"
ls -lha /tmp
echo "============"
ls -lha /
echo "============"
df -h
echo "============"
ls -lha /etc/hosts
command.err is empty; command.out contains:
root
/tmp/nxf.ZGvamvfBeA
============
total 2.2G
drwx------ 2 root root 4.0K Oct 8 19:42 .
drwxrwxrwt 3 root root 4.0K Oct 8 19:41 ..
-rw-r--r-- 1 root root 0 Oct 8 19:42 .command.err
-rw-r--r-- 1 root root 38 Oct 8 19:42 .command.out
-rw-r--r-- 1 root root 12K Oct 8 19:30 .command.run
-rw-r--r-- 1 root root 190 Oct 8 19:30 .command.sh
-rw-r--r-- 1 root root 0 Oct 8 19:42 .command.trace
-rw-r--r-- 1 root root 20K Oct 1 17:24 IGC.amb
-rw-r--r-- 1 root root 441M Oct 1 17:24 IGC.ann
-rw-r--r-- 1 root root 1.8G Oct 1 18:58 IGC.pac
============
total 12K
drwxrwxrwt 3 root root 4.0K Oct 8 19:41 .
drwxr-xr-x 22 root root 4.0K Oct 8 19:41 ..
drwx------ 2 root root 4.0K Oct 8 19:42 nxf.ZGvamvfBeA
============
total 80K
drwxr-xr-x 22 root root 4.0K Oct 8 19:41 .
drwxr-xr-x 22 root root 4.0K Oct 8 19:41 ..
-rw-r--r-- 1 root root 1010 Oct 8 19:42 .command.log
-rwxr-xr-x 1 root root 0 Oct 8 19:41 .dockerenv
drwxr-xr-x 2 root root 4.0K Mar 12 2020 .empty
drwxr-xr-x 2 root root 4.0K Sep 20 06:10 bin
drwxr-xr-x 2 root root 4.0K Feb 1 2020 boot
drwxr-xr-x 5 root root 340 Oct 8 19:41 dev
drwxr-xr-x 43 root root 4.0K Oct 8 19:41 etc
drwxr-xr-x 3 root root 4.0K Oct 8 19:41 home
drwxr-xr-x 8 root root 4.0K Sep 20 19:05 lib
drwxr-xr-x 2 root root 4.0K Feb 24 2020 lib64
drwxr-xr-x 2 root root 4.0K Feb 24 2020 media
drwxr-xr-x 2 root root 4.0K Feb 24 2020 mnt
drwxr-xr-x 3 root root 4.0K Mar 12 2020 opt
dr-xr-xr-x 153 root root 0 Oct 8 19:41 proc
drwx------ 3 root root 4.0K Sep 20 19:04 root
drwxr-xr-x 3 root root 4.0K Feb 24 2020 run
drwxr-xr-x 2 root root 4.0K Sep 20 06:10 sbin
drwxr-xr-x 2 root root 4.0K Feb 24 2020 srv
dr-xr-xr-x 13 root root 0 Oct 8 19:39 sys
drwxrwxrwt 3 root root 4.0K Oct 8 19:41 tmp
drwxr-xr-x 10 root root 4.0K Feb 24 2020 usr
drwxr-xr-x 11 root root 4.0K Feb 24 2020 var
============
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/docker-259:2-394429-91c88bc572c8ae08e81b0d7b40c7304180a685444e9f1812b5018e8e7c51bc82 9.8G 4.7G 4.7G 50% /
tmpfs 64M 0 64M 0% /dev
tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/nvme2n1 50G 22M 50G 1% /etc/hosts
shm 64M 0 64M 0% /dev/shm
/dev/nvme0n1p1 7.9G 1.5G 6.3G 20% /home/ec2-user/miniconda
tmpfs 1.9G 0 1.9G 0% /proc/acpi
tmpfs 1.9G 0 1.9G 0% /sys/firmware
============
-rw-r--r-- 1 root root 126 Oct 8 19:41 /etc/hosts
Finally, command.log additionally contains:
nxf-scratch-dir ip-10-0-0-215:/tmp/nxf.ZGvamvfBeA
download failed: s3://danone-nextflow/references/IGC/bwa/IGC.sa to ./IGC.sa [Errno 28] No space left on device
download failed: s3://danone-nextflow/references/IGC/bwa/IGC.bwt to ./IGC.bwt [Errno 28] No space left on device
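Reading the df output above: no separate filesystem is mounted at /tmp, so the scratch directory /tmp/nxf.ZGvamvfBeA appears to sit on the 9.8G container root rather than on the 50G /dev/nvme2n1 volume, which only shows up at /etc/hosts. A minimal shell sketch for checking this from inside a task follows. The helper names mount_point_of, avail_kib and has_room are hypothetical, and the parsing assumes mount points without spaces.

```shell
#!/bin/sh
# Print the mount point of the filesystem that backs directory $1.
# df -P emits one POSIX-format line per filesystem; field 6 is "Mounted on".
mount_point_of() {
    df -P "$1" | awk 'NR==2 {print $6}'
}

# Print the space available (in KiB) on the filesystem that backs $1.
# With -Pk, field 4 is the "Available" column in 1024-byte blocks.
avail_kib() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Succeed if directory $1 has at least $2 KiB available.
has_room() {
    [ "$(avail_kib "$1")" -ge "$2" ]
}

# Report where the current working directory actually lives.
echo "$(pwd) is on $(mount_point_of .) with $(avail_kib .) KiB available"
```

If the reported mount point is / rather than the autoscaled volume, downloads into the working directory are limited by the container root filesystem, regardless of how much space the EBS volume has.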
Weird, I'll try to replicate it
I just created a fresh account with all the prerequisites (roles, Tower Forge AWS config, EBS autoscaling on), ran the pipeline, and ended up with the same result. So I assume the problem does not come from a missing role. Unfortunately, the mystery remains...
I've isolated the problem. A patch will be available on Monday.
There was a problem with the volume mounting. Problem solved. Let us know if it now works on your side.
It does work. You made my day! Thanks a lot for your prompt help.
And, please kindly let me know if this is the way you recommend to work with large files.
cool
please kindly let me know if this is the way you recommend to work with large files.
For very large datasets, it is likely better to configure an FSx shared file system instead of transferring data from S3.
Hi, I am trying to run the following pipeline:
The refbwaindex contains the following files:
The Dockerfile contains:
Yet, when run from a Tower Forge AWS environment, the index files do not seem to be properly staged in, and command.log contains:
Any idea how to properly load all the index files?
Many thanks for your help, and for your fantastic tools.