enasequence / webin-cli

Webin command line submission program.
Apache License 2.0

Failed to upload files to webin.ebi.ac.uk because of a system error #53

Closed msagniez closed 1 year ago

msagniez commented 3 years ago

Hi,

I'm facing some issues while trying to upload my RNA-seq files with the webin-cli tool. All my files (each with its own manifest file) have passed validation (with the -validate option), but when I change the -validate option to -submit, the following error occurs:

INFO : The submission has been validated successfully.
ERROR: Failed to upload files to webin.ebi.ac.uk using FTP. Connection refused (Connection refused)
Failed to upload files to webin.ebi.ac.uk because of a system error.

I saw that this error was already reported for a previous version of the tool (sequencing data submission #38), so I checked both the availability of port 21 and the FTP connection to the server. Port 21 is available and I have successfully connected to the webin2.ebi.ac.uk server using FTP with my Webin username and password.

At this point, do you know what I can do to resolve this error?

bio15anu commented 2 years ago

I am also having this issue. Is this being addressed? In the meantime could we use an older version of the webin cli tool instead?

RAWWiberg commented 2 years ago

I'm getting a similar error:

INFO : Files have been uploaded to webin2.ebi.ac.uk.
ERROR: A server error occurred when attempting to submit. The submission has failed because of a system error.

The files validate just fine, and the first line above seems to indicate that the files are in fact uploaded, but then something goes wrong. The "report" file contains this:

2022-03-14T03:16:53 INFO : Files have been uploaded to webin2.ebi.ac.uk. 
2022-03-14T03:17:56 ERROR: A server error occurred when attempting to submit. The submission has failed because of a system error.
uk.ac.ebi.ena.webin.cli.WebinCliException: A server error occurred when attempting to submit. The submission has failed because of a system error.
        at uk.ac.ebi.ena.webin.cli.WebinCliException.error(WebinCliException.java:88)
        at uk.ac.ebi.ena.webin.cli.WebinCli.submit(WebinCli.java:271)
        at uk.ac.ebi.ena.webin.cli.WebinCli.execute(WebinCli.java:199)
        at uk.ac.ebi.ena.webin.cli.WebinCli.__main(WebinCli.java:96)
        at uk.ac.ebi.ena.webin.cli.WebinCli.main(WebinCli.java:77)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
        at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
        at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
        at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)
Caused by: uk.ac.ebi.ena.webin.cli.WebinCliException: A server error occurred when attempting to submit.
        at uk.ac.ebi.ena.webin.cli.WebinCliException.systemError(WebinCliException.java:80)
        at uk.ac.ebi.ena.webin.cli.service.handler.DefaultErrorHander.handleError(DefaultErrorHander.java:45)
        at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63)
        at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:777)
        at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:735)
        at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:669)
        at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:578)
        at uk.ac.ebi.ena.webin.cli.service.SubmitService.doSubmission(SubmitService.java:113)
        at uk.ac.ebi.ena.webin.cli.WebinCli.submit(WebinCli.java:266)
        ... 11 common frames omitted

raskoleinonen commented 1 year ago

We have deployed Webin-CLI version 6.0.0 with built-in FTP(S)/Aspera/HTTPS call retries and improved logging of these errors: https://github.com/enasequence/webin-cli/releases/tag/6.0.0

This may insulate users from transient errors and will help us to diagnose the exact problem from the logs. For example, each different type of FTP(S) error can be identified from the logs.

Please upgrade to Webin-CLI version 6.0.0 and if you have further file upload problems please contact us preferably using (fastest response time): https://www.ebi.ac.uk/ena/browser/support

bri-risk commented 1 year ago

Hi,

It looks like this issue is still occurring with webin-cli-6.3.0. Is there an update that needs to be made? The system times out when trying to upload fastq files.

INFO : Submission has not been validated previously.
INFO : Creating report file: C:\Users\beezely\Downloads.\webin-cli.report
INFO : Processing file C:\Users\beezely\Downloads\forward.fastq.gz
INFO : Collected 100000 reads [file: C:\Users\beezely\Downloads\forward.fastq.gz]
INFO : Collected 1 read labels: [1] [file: C:\Users\beezely\Downloads\forward.fastq.gz]
INFO : Has possible duplicate read name(s): false [file: C:\Users\beezely\Downloads\forward.fastq.gz]
INFO : Processing file C:\Users\beezely\Downloads\reverse.fastq.gz
INFO : Collected 100000 reads [file: C:\Users\beezely\Downloads\reverse.fastq.gz]
INFO : Collected 1 read labels: [2] [file: C:\Users\beezely\Downloads\reverse.fastq.gz]
INFO : Has possible duplicate read name(s): false [file: C:\Users\beezely\Downloads\reverse.fastq.gz]
INFO : The submission has been validated successfully.
INFO : Connecting to FTP server : webin2.ebi.ac.uk
INFO : Uploading file: C:\Users\beezely\Downloads\forward.fastq.gz
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
ERROR: Failed to upload files to webin.ebi.ac.uk using FTP. Connection or outbound has closed
Failed to upload files to webin.ebi.ac.uk because of a system error.
ERROR: Failed to upload files to webin.ebi.ac.uk using FTP. Connection or outbound has closed
Failed to upload files to webin.ebi.ac.uk because of a system error.

raskoleinonen commented 1 year ago

Thank you for reporting the error. The retries and improved error logs give us more information about the exact problem to help us to investigate further.

KJKwon commented 1 year ago

I'm also facing the same error (webin-cli-6.3.0.)

INFO : Uploading file: /work/kkwon/20230111_Data/230106_arrived/Sample_list/Set1_forward.fastq.gz
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
WARN : Retrying file upload to FTP server.
ERROR: Failed to upload files to webin.ebi.ac.uk using FTP. Connection or outbound has closed
Failed to upload files to webin.ebi.ac.uk because of a system error.
ERROR: Failed to upload files to webin.ebi.ac.uk using FTP. Connection or outbound has closed
Failed to upload files to webin.ebi.ac.uk because of a system error.

raskoleinonen commented 1 year ago

The webin.ebi.ac.uk FTP(S) service is noticeably unreliable. We are revising and improving the Webin-CLI FTP(S) file upload retry logic: we recently understood that if the (socket) connection is lost, retrying will not succeed without re-establishing the FTP(S) connection. We are working to implement this behavior.
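For readers wondering what "re-establishing the connection before retrying" looks like in practice, here is a minimal Python sketch of the idea. It is not the Webin-CLI implementation (which is Java); the function names, the attempt count of 6, and the zero delay are assumptions chosen for illustration. The point is only that every retry opens a new connection instead of reusing the dead socket.

```python
import time
from ftplib import FTP_TLS, all_errors


def connect_ftps(host, user, password, timeout=60):
    """Open a new FTPS control connection and log in (hypothetical helper)."""
    ftp = FTP_TLS(host, timeout=timeout)
    ftp.login(user, password)
    ftp.prot_p()  # encrypt the data channel as well
    return ftp


def upload_with_reconnect(connect, upload, attempts=6, delay=0.0):
    """Call upload(ftp), opening a fresh connection before every attempt.

    connect: zero-argument callable returning a logged-in FTP object.
    upload:  callable that takes that object and performs the transfer.
    Returns the number of attempts that were needed.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        ftp = None
        try:
            ftp = connect()  # never reuse a connection that may be dead
            upload(ftp)
            return attempt
        except all_errors as error:  # socket errors, EOFError, FTP reply errors
            last_error = error
            time.sleep(delay)
        finally:
            if ftp is not None:
                try:
                    ftp.close()  # drop the socket, whatever state it is in
                except all_errors:
                    pass
    raise ConnectionError(f"upload failed after {attempts} attempts") from last_error
```

A caller would pass something like `lambda: connect_ftps(host, user, password)` as `connect` and a function that calls `ftp.storbinary(...)` as `upload`.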

tothuhien commented 1 year ago

I had the same problem when trying to upload a file of nanopore sequences about 60G to ENA. Do you have any news on this? Thanks.

bri-risk commented 1 year ago

They never gave me an answer on if the webin-cli system was fixed. However, I figured out a workaround.

Go to this site https://ena-docs.readthedocs.io/en/latest/submit/fileprep/upload.html#appendix-configuring-your-firewall-for-ena-upload and follow the directions for 'using windows file explorer'. This is essentially creating a connection between your computer and their network. Once the file was created I simply pasted in my files and that gave ENA access to the sequencing. It takes a couple hours to copy/paste large files, but this seemed to work. Just make sure to email ENA and ask for confirmation of receipt to make the files public.

Bri Risk, MS, RD, CSSD Board Certified Sports Dietitian Food Scientist/Executive Chef PhD Candidate Integrative Cardiovascular & Intestinal Health Lab Colorado State University

raskoleinonen commented 1 year ago

Webin-CLI uses FTP(S) or Aspera. FTP(S) is known to have two main issues: (1) reliability problems and (2) firewall problems. Aspera is understood to be more reliable.

We have worked to improve the FTP issue (1) by incrementally improving FTP retries in Webin-CLI. I think that the Webin-CLI 6.4.0 version now does everything it can including re-connecting if the FTP connection was lost for any reason (typically manifested as a socket error) while maintaining its state over the re-connection attempt.

For (2), the firewall errors, we ourselves have a limited understanding of how firewalls affect the FTP protocol or how best to resolve these problems. We acknowledge that an HTTP(S) based file upload protocol would be preferred. Unfortunately, we are limited to using FTP(S) or Aspera at this time in the file upload backend.

saat-mics commented 1 year ago

Hi,

I am using Webin-Cli v6.4.1

and still getting same error:

INFO: Using cached SIF image
INFO : Your application version is 6.4.1
INFO : If you are using INSDC missing value terms (https://www.insdc.org/submitting-standards/missing-value-reporting/) please upgrade to Webin-CLI version 6.4.1 or later to avoid validation errors for country, collection_date, and lat_lon source feature qualifiers. The missing value terms are not yet supported by the flat file feature table. Webin-CLI version 6.4.1 will ignore missing value terms to avoid these errors.
INFO : Submission has not been validated previously.
INFO : Creating report file: /crex/proj/uppstore2017065/delivery02342/INBOX/P13257/P13257_137/02-FASTQ/190628_A00689_0043_BHKF75DSXX/./webin-cli.report
INFO : Processing file /crex/proj/uppstore2017065/delivery02342/INBOX/P13257/P13257_137/02-FASTQ/190628_A00689_0043_BHKF75DSXX/P13257_137_S39_L001_R1_001.fastq.gz
INFO : Collected 100000 reads [file: /crex/proj/uppstore2017065/delivery02342/INBOX/P13257/P13257_137/02-FASTQ/190628_A00689_0043_BHKF75DSXX/P13257_137_S39_L001_R1_001.fastq.gz]
INFO : Collected 1 read labels: [1] [file: /crex/proj/uppstore2017065/delivery02342/INBOX/P13257/P13257_137/02-FASTQ/190628_A00689_0043_BHKF75DSXX/P13257_137_S39_L001_R1_001.fastq.gz]
INFO : Has possible duplicate read name(s): false [file: /crex/proj/uppstore2017065/delivery02342/INBOX/P13257/P13257_137/02-FASTQ/190628_A00689_0043_BHKF75DSXX/P13257_137_S39_L001_R1_001.fastq.gz]
INFO : The submission has been validated successfully.
INFO : Connecting to FTP server : webin2.ebi.ac.uk
WARN : Retrying connecting to FTP server.
WARN : Retrying connecting to FTP server.
WARN : Retrying connecting to FTP server.
WARN : Retrying connecting to FTP server.
WARN : Retrying connecting to FTP server.
WARN : Retrying connecting to FTP server.
ERROR: Failed to connect to webin.ebi.ac.uk using FTP.
Failed to upload files to webin.ebi.ac.uk because of a system error.
ERROR: Failed to connect to webin.ebi.ac.uk using FTP.
Failed to upload files to webin.ebi.ac.uk because of a system error.

Any suggestion on how to make it run?

thanks, Atal

raskoleinonen commented 1 year ago

If your organisation is using a proxy, following these instructions might help: https://ena-docs.readthedocs.io/en/latest/submit/general-guide/webin-cli.html#configuring-your-firewall-for-ena-upload

The problem may also be related to firewall settings. In this case, the following instruction might help: https://ena-docs.readthedocs.io/en/latest/submit/fileprep/upload.html#appendix-configuring-your-firewall-for-ena-upload

You could also try connecting directly using FTP(S) to webin.ebi.ac.uk using your Webin-N account name as the username with the Webin submission account password. If you can connect directly using FTP(S) and not using Webin-CLI we could investigate what is the difference between the two approaches.
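One way to run the direct connection test described above without an FTP client is a short Python script using the standard `ftplib` module. This is only a sketch: the function name `check_ftps_login` and the `ftp_factory` parameter are made up for illustration, and port 21 with explicit TLS is assumed from the discussion in this thread rather than taken from the ENA documentation.

```python
from ftplib import FTP_TLS, all_errors


def check_ftps_login(host, user, password, ftp_factory=FTP_TLS, timeout=30):
    """Try a direct FTPS login and return (ok, message)."""
    ftp = ftp_factory()
    try:
        ftp.connect(host, 21, timeout)  # control connection on port 21
        ftp.login(user, password)       # FTP_TLS negotiates TLS before sending credentials
        ftp.prot_p()                    # request an encrypted data channel
        return True, "login succeeded"
    except all_errors as error:
        # The exception type separates "connection refused" from
        # authentication failures or TLS problems.
        return False, f"{type(error).__name__}: {error}"
    finally:
        try:
            ftp.close()
        except all_errors:
            pass
```

Calling `check_ftps_login("webin.ebi.ac.uk", "Webin-N", "your password")` from the machine that runs Webin-CLI shows whether the failure is specific to Webin-CLI or already happens at the FTP(S) level.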

bri-risk commented 1 year ago

Thank you for the email Mr. Leinonen. I conferred with the university IT department and they do not have firewalls set for outbound data, which I'd imagine is for instances such as this. Were there glitches in the software, or is the program currently functioning properly?

raskoleinonen commented 1 year ago

The Webin-CLI FTP code has been incrementally improved and should now have all possible FTP retries and error-handling routines.

However, the underlying webin.ebi.ac.uk FTPS service may be unavailable at times. E.g. FTPS user authentication did not recently work for a small number of users for an extended period of time. If the Webin-CLI does not work, and especially if direct use of the webin.ebi.ac.uk FTP(S) fails as well, then there is likely to be a problem with the webin.ebi.ac.uk FTPS service that needs to be addressed.

saat-mics commented 1 year ago

Hi,

Thanks for your replies.

I was able to fix the problem following your suggestions and was able to upload data for 6 individuals (out of 60), but I am now getting another error for the rest of the individuals:

ERROR: In run, alias: "webin-reads-P9721_143". Read type information missing in run. The submission has failed because of a system error.
ERROR: In run, alias: "webin-reads-P9721_143". Read type information missing in run. The submission has failed because of a system error.

I can find no reason why my script suddenly stopped working. Could you please have a look at this?

thanks, Atal

bri-risk commented 1 year ago

Hi Atal,

I don't work for ENA so I'm not sure how to address your issue.

raskoleinonen commented 1 year ago

Dear Atal,

The "Read type information missing in run" error indicates that more than two Fastq files are included in one Webin-CLI reads submission or in one Webin REST run XML submission.

When using Webin-CLI, please use a JSON manifest as described here: https://ena-docs.readthedocs.io/en/latest/submit/reads/webin-cli.html#json-manifest-file-format

The JSON read_type attribute supports the following values:

Best wishes, Rasko

saat-mics commented 1 year ago

Thanks a lot, that solved the problem.

Cheers, Atal

saat-mics commented 1 year ago

Dear Rasko,

Sorry to bother you again with this.

I am now trying to use a JSON manifest file for the first time, and the instructions given in the manual are not really helping me. Here is my JSON file (manifest.json):

{
"study": PRJEB62707
"sample": ERS15574205
"name": P9721_143
"instrument": Illumina NovaSeq 6000
"insert_size": 350
"library_source": GENOMIC
"library_selection": RANDOM
"library_strategy": WGS
"fastq": [
{
"value": "P9721_143_S43_L001_R1_001.fastq.gz",
"attributes": {
"read_type": "paired", "sample_barcode"
}
},
{
"value": " P9721_143_S43_L001_R2_001.fastq.gz",
"attributes": {
"read_type": "paired", "sample_barcode"
}
},
{
"value": "P9721_143_S43_L002_R1_001.fastq.gz",
"attributes": {
"read_type": "paired", "sample_barcode"
}
},
{
value: "P9721_143_S43_L002_R2_001.fastq.gz",
"attributes": {
"read_type": "paired", "sample_barcode"
}
}
]
}

The error message I am getting is as below:

INFO : Your application version is 6.5.0
INFO : A new application version is available. Please download the latest version 6.5.1 from https://github.com/enasequence/webin-cli/releases
INFO : If you are using INSDC missing value terms (https://www.insdc.org/submitting-standards/missing-value-reporting/) please upgrade to Webin-CLI version 6.4.1 or later to avoid validation errors for country, collection_date, and lat_lon source feature qualifiers. The missing value terms are not yet supported by the flat file feature table. Webin-CLI version 6.4.1 will ignore missing value terms to avoid these errors.
ERROR: Malformed manifest file content. [manifest file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/manifest.json]
INFO : Creating report file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/./webin-cli.report
ERROR: Missing mandatory field NAME. [manifest file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/manifest.json]
ERROR: Missing mandatory field STUDY. [manifest file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/manifest.json]
ERROR: Missing mandatory field SAMPLE. [manifest file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/manifest.json]
ERROR: Missing mandatory field LIBRARY_SOURCE. [manifest file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/manifest.json]
ERROR: Missing mandatory field LIBRARY_SELECTION. [manifest file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/manifest.json]
ERROR: Missing mandatory field LIBRARY_STRATEGY. [manifest file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/manifest.json]
ERROR: No data files have been specified. Expected data files are: [1-10 FASTQ] or [1 CRAM] or [1 BAM]. [manifest file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/manifest.json]
ERROR: Platform and/or instrument should be defined. Valid platforms: ILLUMINA, PACBIO_SMRT, OXFORD_NANOPORE, BGISEQ, LS454, ION_TORRENT, CAPILLARY, DNBSEQ, ELEMENT, ULTIMA. Valid instruments: HiSeq X Five, HiSeq X Ten, Illumina Genome Analyzer, Illumina Genome Analyzer II, Illumina Genome Analyzer IIx, Illumina HiScanSQ, Illumina HiSeq 1000, Illumina HiSeq 1500, Illumina HiSeq 2000, Illumina HiSeq 2500, Illumina HiSeq 3000, Illumina HiSeq 4000, Illumina HiSeq X, Illumina iSeq 100, Illumina MiSeq, Illumina MiniSeq, Illumina NovaSeq 6000, Illumina NovaSeq X, NextSeq 500, NextSeq 550, NextSeq 1000, NextSeq 2000, MinION, GridION, PromethION, Onso, PacBio RS, PacBio RS II, Revio, Sequel, Sequel II, Sequel IIe, BGISEQ-50, BGISEQ-500, MGISEQ-2000RS, 454 GS, 454 GS 20, 454 GS FLX, 454 GS FLX+, 454 GS FLX Titanium, 454 GS Junior, Ion Torrent Genexus, Ion Torrent PGM, Ion Torrent Proton, Ion Torrent S5, Ion Torrent S5 XL, Ion GeneStudio S5, Ion GeneStudio S5 Plus, Ion GeneStudio S5 Prime, AB 3730xL Genetic Analyzer, AB 3730 Genetic Analyzer, AB 3500xL Genetic Analyzer, AB 3500 Genetic Analyzer, AB 3130xL Genetic Analyzer, AB 3130 Genetic Analyzer, AB 310 Genetic Analyzer, DNBSEQ-T7, DNBSEQ-G400, DNBSEQ-G50, DNBSEQ-G400 FAST, Element AVITI, UG 100, unspecified. [manifest file: /crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/manifest.json]
ERROR: Invalid manifest file. Please see the error report file "/crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/./manifest.json.report".
ERROR: Invalid manifest file. Please see the error report file "/crex/proj/uppstore2017065/raw_data/L.Laikre_17_01_indlcWGS/P9721/P9721_143/02-FASTQ/180306_A00187_0016_BH5WLLDMXX/./manifest.json.report".

There seems to be something wrong with my manifest file. Any help with this, please?

thanks, Atal

raskoleinonen commented 1 year ago

Hi Atal,

I will shortly update the instructions page to show:

{
 "study": "TODO",
 "sample": "TODO",
 "name": "TODO",
 "platform": "TODO",
 "instrument": "TODO",
 "insert_size": "TODO",
 "library_name": "TODO",
 "library_source": "TODO",
 "library_selection": "TODO",
 "library_strategy": "TODO",

Could you please first change the start of your JSON document to:

"study": "PRJEB62707",
"sample": "ERS15574205",
"name": "P9721_143",
"instrument": "Illumina NovaSeq 6000",
"insert_size": "350",
"library_source": "GENOMIC",
"library_selection": "RANDOM",
"library_strategy": "WGS",

You are missing quotation marks around the values and a comma after each field.

saat-mics commented 1 year ago

Thanks again, Rasko.

I have just added the quotation marks and commas, but I am still getting the same error. My manifest now looks like this:

{
"study": "PRJEB62707",
"sample": "ERS15574205",
"name": "P9721_143",
"instrument": "Illumina NovaSeq 6000",
"insert_size": "350",
"library_source": "GENOMIC",
"library_selection": "RANDOM",
"library_strategy": "WGS",
"fastq": [
{
"value": "P9721_143_S43_L001_R1_001.fastq.gz",
"attributes": {
""read_type": "paired", "sample_barcode"
}
},
{
"value": " P9721_143_S43_L001_R2_001.fastq.gz",
"attributes": {
"read_type": "paired", "sample_barcode"
}
},
{
"value": "P9721_143_S43_L002_R1_001.fastq.gz",
"attributes": {
"read_type": "paired", "sample_barcode"
}
},
{
value: "P9721_143_S43_L002_R2_001.fastq.gz",
"attributes": {
"read_type": "paired", "sample_barcode"
}
}
]
}

raskoleinonen commented 1 year ago

Hi Atal,

There are several JSON formatting errors. The following is valid JSON; however, there is another issue that I explain below:

{
"study": "PRJEB62707",
"sample": "ERS15574205",
"name": "P9721_143",
"instrument": "Illumina NovaSeq 6000",
"insert_size": "350",
"library_source": "GENOMIC",
"library_selection": "RANDOM",
"library_strategy": "WGS",
"fastq": [
{
"value": "P9721_143_S43_L001_R1_001.fastq.gz",
"attributes": {
"read_type": ["paired", "sample_barcode"]
}
},
{
"value": "P9721_143_S43_L001_R2_001.fastq.gz",
"attributes": {
"read_type": ["paired", "sample_barcode"]
}
},
{
"value": "P9721_143_S43_L002_R1_001.fastq.gz",
"attributes": {
"read_type": ["paired", "sample_barcode"]
}
},
{
"value": "P9721_143_S43_L002_R2_001.fastq.gz",
"attributes": {
"read_type": ["paired", "sample_barcode"]
}
}
]
}

The "paired" read_type can only be used for exactly two fastq files in one reads context submission. You might wish to create two reads submissions, one for P9721_143_S43_L001 and another for P9721_143_S43_L002.
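For anyone scripting this over many samples, splitting the files by lane can be sketched in Python as follows. The function names and the file-name pattern (`<prefix>_L00N_R1_001.fastq.gz`) are assumptions based on the files in this thread, and the `read_type` list simply mirrors the manifest above; adjust both to your data. Each generated manifest uses the lane prefix as its `name`, since names must be unique within a submission account.

```python
import json
import re
from collections import defaultdict

# Assumed Illumina-style naming: <prefix>_L001_R1_001.fastq.gz
LANE = re.compile(r"^(?P<prefix>.+_L\d{3})_R[12]_\d{3}\.fastq\.gz$")


def manifests_by_lane(common_fields, fastq_files):
    """Return {name: manifest dict} with one two-file manifest per lane."""
    lanes = defaultdict(list)
    for filename in sorted(fastq_files):
        match = LANE.match(filename)
        if match is None:
            raise ValueError(f"unexpected file name: {filename}")
        lanes[match.group("prefix")].append(filename)

    manifests = {}
    for prefix, files in lanes.items():
        if len(files) != 2:
            raise ValueError(f"{prefix}: expected 2 files, found {len(files)}")
        manifest = dict(common_fields)
        manifest["name"] = prefix  # one distinct name per reads submission
        manifest["fastq"] = [
            {"value": f, "attributes": {"read_type": ["paired", "sample_barcode"]}}
            for f in files
        ]
        manifests[prefix] = manifest
    return manifests


def dump_manifest(manifest):
    """Serialise one manifest as JSON text, ready to be written to a file."""
    return json.dumps(manifest, indent=1)
```

Writing each result of `dump_manifest` to its own file gives one manifest per lane, each of which is then submitted with a separate Webin-CLI run.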

raskoleinonen commented 1 year ago

Also, please note that if the JSON is not correctly formatted you see the following error:

ERROR: Malformed manifest file content

This error should be fixed before looking at any other JSON manifest-related errors.
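A quick way to find such formatting errors before running Webin-CLI is to parse the manifest with any JSON parser. The sketch below uses Python's standard `json` module; the function name `find_json_error` is made up for illustration, and it checks JSON syntax only, not whether the fields are valid for ENA.

```python
import json


def find_json_error(text):
    """Return None if text is valid JSON, else 'line N, column M: reason'."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        return f"line {error.lineno}, column {error.colno}: {error.msg}"
    return None
```

Running it on the contents of manifest.json points to the first unquoted value, missing comma, or stray quotation mark.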

saat-mics commented 1 year ago

Rasko, thanks again for your help with this. I am no longer getting the same error.

However, there are still errors. I have data for the same individuals coming from 2 or more lanes. As you suggested, I created two reads submissions, one for P9721_143_S43_L001 and another for P9721_143_S43_L002.

But now that gives me another error:

ERROR: In experiment, alias: "webin-reads-P9721_143". The object being added already exists in the submission account with accession: "ERX11157306". The submission has failed because of a system error.
ERROR: In experiment, alias: "webin-reads-P9721_143". The object being added already exists in the submission account with accession: "ERX11157306". The submission has failed because of a system error.

The reason I am trying to use manifest.json is that I have more than 2 fastq files under same accession. How do I solve this problem, please?

thanks, Atal

raskoleinonen commented 1 year ago

Hi Atal,

This error means that a reads submission has already been made with the same name using the submission account Webin-57425:

{
...
"name": "P9721_143",
...

Names are unique to prevent accidental re-submissions. Could you please make sure that all names are unique?

saat-mics commented 1 year ago

Hi Rasko,

Thanks for your reply.

I will try to explain it here again. For sample "P9721_143", I have 4 fastq files:

P9721_143_S43_L001_R1_001.fastq.gz
P9721_143_S43_L001_R2_001.fastq.gz
P9721_143_S43_L002_R1_001.fastq.gz
P9721_143_S43_L002_R2_001.fastq.gz

If I submit these files in 2 rounds, I still use the same "name", as these data belong to the same individual. Is that not the way to do it?

Atal

raskoleinonen commented 1 year ago

Hi Atal,

The name needs to be unique within a submission account so that each read submission has a different name. The name is not the sample name but the name for the files being submitted. e.g. P9721_143_S43_L001 and P9721_143_S43_L002 could be possible names to use.
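Before submitting a batch, a small check along these lines can catch repeated names locally. It is a sketch only: the function name is hypothetical, it works on manifests already parsed into dictionaries, and it cannot see names that were registered in the submission account by earlier submissions.

```python
from collections import Counter


def duplicate_names(manifests):
    """Return the sorted 'name' values that occur in more than one manifest."""
    counts = Counter(manifest["name"] for manifest in manifests)
    return sorted(name for name, count in counts.items() if count > 1)
```

An empty result means every manifest in the batch has its own name.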

saat-mics commented 1 year ago

thanks again, Rasko.

So, this means that I will need an accession id for both P9721_143_S43_L001 and P9721_143_S43_L002? As the same file name can be used for forward and reverse reads, I thought that would be possible even for reads from different lanes.

I have data from 60 individuals and I intended to use 60 different 'names', and ids for those 60 names have already been generated. But what I understand now is that I will need 120 ids. Is it probably better to regenerate the ids for the files?

thanks, Atal

raskoleinonen commented 1 year ago

Hi Atal,

If you have paired reads for 60 individuals, you would make 60 submissions with 60 unique names. Each submission would contain two Fastq files representing the two reads in the pair.

And yes, P9721_143_S43_L001 and P9721_143_S43_L002 would each have their own unique accession id.

Best wishes, Rasko

saat-mics commented 1 year ago

Thanks very much, Rasko.

Samples were sequenced in 2 lanes (paired-end run). I am doing it like P9721_143_S43_L001 and P9721_143_S43_L002, and that seems to be working.

Thanks again, Atal

marwa38 commented 9 months ago

They never gave me an answer on if the webin-cli system was fixed. However, I figured out a workaround. Go to this site https://ena-docs.readthedocs.io/en/latest/submit/fileprep/upload.html#appendix-configuring-your-firewall-for-ena-upload and follow the directions for 'using windows file explorer'. This is essentially creating a connection between your computer and their network. Once the file was created I simply pasted in my files and that gave ENA access to the sequencing. It takes a couple hours to copy/paste large files, but this seemed to work. Just make sure to email ENA and ask for confirmation of receipt to make the files public.

I wonder if an md5 checksum is required here? And what about the metadata?

jorondo1 commented 3 months ago

Dear @raskoleinonen ,

I am getting an error similar to the one originally posted.

INFO : Your application version is 7.2.1
INFO : Please upgrade to Webin-CLI version 7.0.1 or later if you see the following error: Failed to initialise validator. Could not retrieve BioSample.
INFO : Connecting to FTP server : webin2.ebi.ac.uk
INFO : Creating report file: /home/def-ilafores/analysis/boreal_moss/ENA_submission/run_manifests/./webin-cli.report
INFO : Uploading file: /home/def-ilafores/analysis/boreal_moss/preproc/S-11-POLJUN-G/S-11-POLJUN-G_paired_1.fastq.gz
INFO : Uploading file: /home/def-ilafores/analysis/boreal_moss/preproc/S-11-POLJUN-G/S-11-POLJUN-G_unmatched_2.fastq.gz
INFO : Uploading file: /home/def-ilafores/analysis/boreal_moss/preproc/S-11-POLJUN-G/S-11-POLJUN-G_unmatched_1.fastq.gz
INFO : Uploading file: /home/def-ilafores/analysis/boreal_moss/preproc/S-11-POLJUN-G/S-11-POLJUN-G_paired_2.fastq.gz
INFO : Files have been uploaded to webin2.ebi.ac.uk.
ERROR: In run, alias: "webin-reads-Boreal Moss Microbiome S-11-POLJUN-G". Read type information missing in run. The submission has failed because of a system error.
ERROR: Some or all submissions failed. Please see application logs.

Before asking my sysadmin to address the firewall, I want to validate something regarding your comment earlier:

The "Read type information missing in run" error indicates that more than two Fastq files are included in one Webin-CLI reads submission or in one Webin REST run XML submission.

I have used both the paired and unpaired sections of my reads for assembly, so naturally I want to submit all 4 fastqs for each sample (paired_1, paired_2, unpaired_1 and unpaired_2).

From reading the ENA docs, I added four FASTQ fields to my manifest. Is this a good way to do it, or is there another way to submit four files per sample?

Thanks in advance

mairaihsann commented 3 months ago

Hi @jorondo1,

You can only submit 2 fastq files (forward R1 and reverse R2) for a paired fastq file submission. The manifest file only accepts a maximum of 2 FASTQ fields. Any unpaired reads can be included within the forward and reverse files - so your R1 file can contain the unpaired_1 sequences and the R2 file can contain the unpaired_2 sequences. When these files are processed, the ENA generates a forward file and reverse file (ERRxxx_1.fastq.gz and ERRxxx_2.fastq.gz) for the paired reads and a third file (ERRxxx_3.fastq.gz) containing only the unpaired reads.

The only requirement in this case for paired read submissions is that the pairing percentage should not be less than 20%. Please also make sure your fastq files follow the format described here: https://ena-docs.readthedocs.io/en/latest/submit/fileprep/reads.html#fastq-format
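Merging the unpaired reads into the forward and reverse files, as described above, can be done with a few lines of Python. This is only a sketch under assumptions: the function name is made up, the reads are simply written one input after another, and nothing here checks the read-name conventions from the FASTQ format page linked above.

```python
import gzip
import shutil


def merge_fastq_gz(inputs, output):
    """Write the reads of each gzipped FASTQ in inputs, in order, to one gzipped output."""
    with gzip.open(output, "wb") as merged:
        for path in inputs:
            with gzip.open(path, "rb") as source:
                shutil.copyfileobj(source, merged)  # streams, so large files are fine
```

For the files in the question this would mean one call for the forward reads (paired_1 followed by unmatched_1) and one for the reverse reads (paired_2 followed by unmatched_2), with the two merged files then listed in the manifest.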

Thanks, Maira