enasequence / webin-cli

Webin command line submission program.
Apache License 2.0

Recurrent upload issue #86

Closed bavja-DTU closed 1 year ago

bavja-DTU commented 1 year ago

Hello

I am experiencing the same upload issue that many users have raised in the past; as far as I can see, no clear solution has been provided. I run the Webin client with ascp, which seems to improve things a bit (with FTP, the upload is just left hanging for hours). I get the following output and error:

Command: ../../java/jdk-19.0.1/bin/java -jar ../../ena_webin/webin-cli-5.2.0.jar -context reads -manifest=manifest/ERS14365299.txt -submit -userName=XXXXXX -password=XXXXX -ascp

INFO : Your application version is 5.2.0
INFO : A dedicated submission API for COVID-19 genomes is available here: https://www.ebi.ac.uk/ena/submit/webin-cli
INFO : Submission has not been validated previously.
INFO : Creating report file: /home/people/bavja/upload_ena/tanzanian_samples/manifest/./webin-cli.report
INFO : Processing file /home/people/bavja/upload_ena/tanzanian_samples/raw_data/DTU2019_MG_760_3187_R1.fastq.gz
INFO : Collected 100000 reads [file: /home/people/bavja/upload_ena/tanzanian_samples/raw_data/DTU2019_MG_760_3187_R1.fastq.gz]
INFO : Collected 1 read labels: [1] [file: /home/people/bavja/upload_ena/tanzanian_samples/raw_data/DTU2019_MG_760_3187_R1.fastq.gz]
INFO : Has possible duplicate read name(s): false [file: /home/people/bavja/upload_ena/tanzanian_samples/raw_data/DTU2019_MG_760_3187_R1.fastq.gz]
INFO : Processing file /home/people/bavja/upload_ena/tanzanian_samples/raw_data/DTU2019_MG_760_3187_R2.fastq.gz
INFO : Collected 100000 reads [file: /home/people/bavja/upload_ena/tanzanian_samples/raw_data/DTU2019_MG_760_3187_R2.fastq.gz]
INFO : Collected 1 read labels: [2] [file: /home/people/bavja/upload_ena/tanzanian_samples/raw_data/DTU2019_MG_760_3187_R2.fastq.gz]
INFO : Has possible duplicate read name(s): false [file: /home/people/bavja/upload_ena/tanzanian_samples/raw_data/DTU2019_MG_760_3187_R2.fastq.gz]
INFO : The submission has been validated successfully.
INFO : Invoking: ascp --file-checksum=md5 -d --mode=send --overwrite=always -QT -l300M --host=webin.ebi.ac.uk --user="Webin-37120" --src-base="/home/people/bavja/upload_ena/tanzanian_samples" --file-list="/tmp/FILE12937335390055760211LIST" "webin-cli/reads/DTU2019_MG_760_3187"

DTU2019_MG_760_3187_R1.fastq.gz 100% 4678MB 291Mb/s 02:15
DTU2019_MG_760_3187_R2.fastq.gz 100% 4794MB 275Mb/s 04:33
Completed: 9700264K bytes transferred in 273 seconds (290858K bits/sec), in 2 files.
INFO : Files have been uploaded to webin2.ebi.ac.uk.
ERROR: In run, alias: "webin-reads-DTU2019_MG_760_3187". File "webin-cli/reads/DTU2019_MG_760_3187/DTU2019_MG_760_3187_R2.fastq.gz" does not exist in the upload area. The submission has failed because of a system error.
ERROR: In run, alias: "webin-reads-DTU2019_MG_760_3187". File "webin-cli/reads/DTU2019_MG_760_3187/DTU2019_MG_760_3187_R2.fastq.gz" does not exist in the upload area. The submission has failed because of a system error.

I am uploading 500+ NovaSeq sequencing runs. 400 of them are already up, but I have 100+ that I can't get to work. It might have to do with the firewall (we're looking into it), but that is still odd, as the upload worked for other samples in the exact same environment.
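With 100+ failing runs, it can help to triage the console output automatically before resubmitting. The following is a minimal Python sketch, not part of Webin-CLI: it extracts the file paths named in the "does not exist in the upload area" error quoted above and rebuilds the submit command for a given manifest. The helper names (`missing_upload_files`, `build_submit_command`) and the default jar path and placeholder credentials are assumptions for illustration; the flags are the ones used in the command above.

```python
import re

# Pattern copied from the error text quoted in this issue.
MISSING_FILE_RE = re.compile(r'File "([^"]+)" does not exist in the upload area')


def missing_upload_files(log_text):
    """Return the unique file paths reported missing, in order of first appearance."""
    seen = []
    for path in MISSING_FILE_RE.findall(log_text):
        if path not in seen:  # the error is printed twice in the output above
            seen.append(path)
    return seen


def build_submit_command(manifest, jar="webin-cli-5.2.0.jar",
                         user="XXXXXX", password="XXXXX"):
    """Build the argument list for one submission (hypothetical helper).

    The jar path and credentials are placeholders; adjust them to your setup.
    """
    return [
        "java", "-jar", jar,
        "-context", "reads",
        f"-manifest={manifest}",
        "-submit",
        f"-userName={user}",
        f"-password={password}",
        "-ascp",
    ]
```

One possible use is to run each manifest's command with `subprocess.run(..., capture_output=True, text=True)`, pass the captured output to `missing_upload_files`, and queue only the manifests with a non-empty result for another attempt.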

RAWWiberg commented 1 year ago

Hello. I am having a similar issue. I have large gzipped fastq files that I am trying to upload with webin-cli. The validation step works and reports no problems, but when I move on to submit, the process just hangs forever with apparently no data movement. The odd thing is that two .bam files of similar or larger size (PacBio long read data) uploaded just fine last week. I get no error messages at all. The process has been stuck for days now.

raskoleinonen commented 1 year ago

We have deployed Webin-CLI version 6.0.0 with built-in FTP(S)/Aspera/HTTPS call retries and improved logging of these errors: https://github.com/enasequence/webin-cli/releases/tag/6.0.0

This may insulate users from transient errors and will help us to diagnose the exact problem from the logs. For example, each different type of FTP(S) error can be identified from the logs.
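For readers unfamiliar with the idea, the general pattern of retrying a call after a transient error and logging each failure can be sketched as follows. This is an illustrative Python example only, not Webin-CLI's actual implementation; the function name `retry`, its parameters, and the choice of exponential backoff are assumptions.

```python
import time


def retry(operation, attempts=3, base_delay=1.0, sleep=time.sleep,
          transient=(ConnectionError, TimeoutError)):
    """Call operation(), retrying on transient errors with exponential backoff.

    Each failure is logged with its exception type so that different kinds
    of error can be told apart afterwards. The last failure is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except transient as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed "
                  f"({type(exc).__name__}: {exc}); retrying in {delay}s")
            sleep(delay)
```

Errors outside the `transient` tuple are not retried, which keeps permanent failures (for example, bad credentials) from being masked by repeated attempts.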

Please upgrade to Webin-CLI version 6.0.0. If you have further file upload problems, please contact us, preferably via this page (fastest response time): https://www.ebi.ac.uk/ena/browser/support