mazzalab / fastqwiper

An ensemble method to recover corrupted FASTQ files, drop or fix pesky lines, remove unpaired reads, and settle reads interleaving.
GNU General Public License v3.0

"Aborted!" error message #4

Closed jsalt closed 1 year ago

jsalt commented 1 year ago

Hi there,

I'm attempting to use fastqwiper on a fastq file that I know has some kind of corruption using this command:

fastqwiper \
--fastq_in cs_Annectens_MVZ_191140_ID-READ1.fastq.gz \
--fastq_out cs_Annectens_MVZ_191140_ID-READ1_wiped.fastq.gz \
--log_frequency 100000

Which resulted in this output:

Start wiping cs_Annectens_MVZ_191140_ID-READ1.fastq.gz
2023-05-23 17:04:35,302 - fastq_wiper/wiper.py(wipe_fastq) - INFO - Cleaned 100000 reads
2023-05-23 17:04:35,302 - fastq_wiper/wiper.py(wipe_fastq) - INFO - Cleaned 100000 reads
2023-05-23 17:04:35,302 - fastq_wiper/wiper.py(wipe_fastq) - INFO - Cleaned 100000 reads
2023-05-23 17:04:35,302 - fastq_wiper/wiper.py(wipe_fastq) - INFO - Cleaned 100000 reads

Aborted!

We then tried running fastqwiper (same command syntax) on a file that we know is error-free, just to make sure it's working as expected, and it completed as expected:

Successfully terminated

Wiped lines: 11707544/11707544 (100.0%)
Len(SEQ) neq Len(QUAL): 0/11707544
BAD QUAL lines: 0/11707544
BAD + lines: 0/11707544
BAD SEQ lines: 0/11707544
Not printable header lines: 0/11707544
Not printable qual lines: 0/11707544
Fixed header lines: 0/11707544
Fixed + lines: 0/11707544
Blank lines: 0/11707544
Missplaced lines: 0/11707544

I looked at the actual wiper.py code to find out more about what causes the Aborted! error, but it doesn't appear anywhere in the code. Do you have any insight into what is happening, and how we can fix it?

Thanks!

Jessie

mazzalab commented 1 year ago

Hi Jessie, can you share your data with us in order to properly check the error? (bioinformatics@css-mendel.it)

larroca commented 1 year ago

Hi,

I am having a similar issue: fastqwiper aborted after cleaning 286500000 reads in a truncated fastq.gz file. Did you gain any insight from earlier attempts to solve this issue? Many thanks! Alberto

mazzalab commented 1 year ago

Hi Alberto,

can you share the fastq.gz file with us?

regards


larroca commented 1 year ago

Thanks for looking into it, sure I can send it!

Would sharing it via Google Drive be a good way to do so?

Best,

Alberto



mazzalab commented 1 year ago

Whichever way you prefer is OK with us.

thanks


mazzalab commented 1 year ago

This error occurs for some files when FastqWiper is executed on its own. For these files, FastqWiper needs the gzrt tool to be run first to remove some pesky characters. In these cases, I suggest using the Snakemake pipeline, as described in the tool's description, which should finish without errors.

We are preparing a Docker image to make the use of FastWiper with Snakemake easier.

mazzalab commented 1 year ago

We have made a Docker image available on DockerHub and updated the usage description on the home page. Please use that or the Snakemake pipelines if you are not sure whether your fastq file(s) are readable and uncorrupted. Corruption may cause the FastqWiper program to abort. The pipelines and the Docker image also handle corrupted files.
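
For anyone unsure whether a given fastq.gz file falls into the "corrupted" category, a quick pre-check can help decide between running the tool directly and using the pipeline. The following is only a minimal sketch using Python's standard `gzip` module and is not part of FastqWiper; the function name `gzip_is_readable` and the chunk size are assumptions made for illustration. It relies on the fact that Python raises `EOFError` when a gzip stream ends before its end-of-stream marker, which is what happens with a truncated file.

```python
import gzip
import zlib


def gzip_is_readable(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream at `path` decompresses cleanly.

    Hypothetical helper for illustration; not part of FastqWiper.
    A missing file still raises FileNotFoundError rather than returning False.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data in chunks so memory use stays flat.
            while handle.read(chunk_size):
                pass
    except (EOFError, gzip.BadGzipFile, zlib.error):
        # EOFError: the stream is truncated.
        # gzip.BadGzipFile: bad gzip header or failed CRC/length check.
        # zlib.error: corrupted compressed data.
        return False
    return True
```

If `gzip_is_readable("reads.fastq.gz")` (a placeholder file name) returns `False`, the file cannot be fully decompressed, and the Snakemake pipeline or the Docker image is probably the safer route. Note that this check only covers the compression layer: a file that decompresses cleanly can still contain malformed FASTQ records.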