Closed habibr closed 10 years ago
Hi,
Could you send me the files that are causing the problem, or a subset of the reads? I can't tell what the problem is from the error message.
Thanks, Jared
On Sat, Sep 6, 2014 at 11:32 PM, Habib R notifications@github.com wrote:
I ran sga preqc on several datasets. Several runs finished OK, while others failed with output like this when generating the .preqc files:
Preprocess stats:
Reads parsed: 332727764
Reads kept: 331562480 (0.996498)
Reads failed primer screen: 45 (1.35246e-07)
Bases parsed: 32582949743
Bases kept: 32513738777 (0.997876)
Number of incorrectly paired reads that were discarded: 0
[timer - sga preprocess] wall clock: 16390.13s CPU: 3783.79s
[timer - sga index] wall clock: 12670.03s CPU: 36254.45s
Building index for flexbar_sga.fastq.gz in memory using ropebwt
done bwt construction, generating .sai file
Loading FM-index of flexbar_sga.fastq.gz
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr
sga_preqc.sh: line 8: 29884 Aborted $sga preqc -t 8 flexbar_sga.fastq.gz > flexbar_sga.preqc
Could you tell me what went wrong? The last lines of the .preqc files appear to be truncated at the k-mer depth counting stats.
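Because the command redirects standard output straight into flexbar_sga.preqc, an aborted run leaves a partial file behind, which is consistent with the truncated last lines. A minimal sketch of one way to guard against that: write to a temporary file and rename it only if the command exits successfully. The name run_to_file is a hypothetical helper, not part of SGA.

```shell
# run_to_file is a hypothetical helper, not part of SGA.
# Usage: run_to_file OUTPUT COMMAND [ARGS...]
# Runs COMMAND with stdout captured in a temporary file next to OUTPUT,
# and renames that file to OUTPUT only if COMMAND exits with status 0.
run_to_file() {
    local out="$1"
    shift
    local tmp status
    tmp=$(mktemp "${out}.tmp.XXXXXX") || return 1
    status=0
    "$@" > "$tmp" || status=$?
    if [ "$status" -eq 0 ]; then
        mv "$tmp" "$out"
    else
        # Discard the partial output so it cannot be mistaken for a result.
        rm -f "$tmp"
        # A status of 134 is 128 + 6 (SIGABRT), which the shell reports as "Aborted".
        echo "command failed with status $status; $out not written" >&2
    fi
    return "$status"
}
```

With this in place, the failing step could be run as `run_to_file flexbar_sga.preqc sga preqc -t 8 flexbar_sga.fastq.gz`, so a crash leaves no truncated .preqc file on disk.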
Thanks for your reply. It was a rather huge fastq with 200 million 100bp paired reads. Unfortunately I am away this week.
Habib Rijzaani Laboratorium Genomika & Bioinformatika Bb Biogen - Balitbangtan Jl. Tentara Pelajar 3A Bogor 16111
Hi Jared,
You can find a subset of the reads that has been preprocessed using sga in my git repo: https://github.com/habibr/myrepo
I did the following to preprocess the original 20 Gb gzipped reads:
sga preprocess -v -p 1 --pe-orphans=flexbar_sga_singles.fastq -m 21 -s 0.001 flexbar_1.fastq.gz flexbar_2.fastq.gz |gzip -c > flexbar_sga.fastq.gz
Then I used the following commands for indexing and preqc:
sga index -a ropebwt --no-reverse -t 8 flexbar_sga.fastq.gz
sga preqc -v -t 8 flexbar_sga.fastq.gz > flexbar_sga.preqc
It aborted with the same error messages.
I tried the commands on both old RHEL5 and new Ubuntu 14.04 LTS with the same results.
The repository also contains the std_err file, which shows the error messages at the end.
Habib Rijzaani
BB-Biogen, Badan Litbang Pertanian, Kementerian Pertanian Jl. Tentara Pelajar 3A Bogor 16111 +62 251 8337975
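For anyone who needs to share a similar test case, a subset can be cut from the head of the gzipped FASTQ. This is a minimal sketch, not necessarily how the subset in the repository was made. It assumes standard four-line FASTQ records, and fastq_head is a hypothetical helper name. If the preprocessed file stores pairs interleaved (an assumption about the output layout), an even record count keeps whole pairs together.

```shell
# fastq_head is a hypothetical helper, not part of SGA.
# Usage: fastq_head N INPUT.fastq.gz OUTPUT.fastq.gz
# Copies the first N records (4 lines each) of a gzipped FASTQ file.
fastq_head() {
    local n="$1" src="$2" dst="$3"
    # head stops after 4*N lines; on large inputs the decompressing gzip
    # may then be ended by SIGPIPE, which is expected here.
    gzip -dc "$src" | head -n "$((n * 4))" | gzip -c > "$dst"
}
```

For example, `fastq_head 1000000 flexbar_sga.fastq.gz subset.fastq.gz` would keep the first million records; the output file name is only illustrative.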
Thanks for the test case, but it completes successfully on my machine. I ran these commands:
sga index -a ropebwt --no-reverse -t 8 flexbar_sga.fastq.gz
sga preqc -v -t 8 flexbar_sga.fastq.gz
[timer - sga::preqc] wall clock: 1468.66s CPU: 7762.26s
Hi Habib,
Is this still an issue?
Jared
Hi Jared,
Maybe it was a memory issue, but I have not been able to confirm that yet. I repeated the analysis with the full dataset and it ran OK. On a machine with 6 GB of memory, however, the issue still persists with that particular dataset.
OK, thanks for the update. It is probably a memory issue. I will close this for now; please re-open it if the problem comes up again.
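Since the failure appears to track the amount of RAM on the machine, a wrapper script could check total memory before launching a long run. This is a minimal sketch for Linux that reads MemTotal from /proc/meminfo. The name require_mem_gib is a hypothetical helper, and the 16 GiB threshold in the usage line is an arbitrary placeholder, not a documented requirement of sga preqc.

```shell
# require_mem_gib is a hypothetical helper, not part of SGA.
# Usage: require_mem_gib MIN_GIB [MEMINFO_FILE]
# Succeeds if the machine's total RAM is at least MIN_GIB gibibytes.
require_mem_gib() {
    local min="$1" meminfo="${2:-/proc/meminfo}"
    local total
    # MemTotal is reported in kB (1 GiB = 1048576 kB); the value is rounded down.
    total=$(awk '/^MemTotal:/ { printf "%d", $2 / 1048576 }' "$meminfo")
    if [ "${total:-0}" -lt "$min" ]; then
        echo "only ${total:-0} GiB RAM, need at least $min GiB" >&2
        return 1
    fi
}
```

A run could then be gated with `require_mem_gib 16 && sga preqc -v -t 8 flexbar_sga.fastq.gz > flexbar_sga.preqc`. Because the value is rounded down, a machine sold as 6 GB will usually report 5 GiB here.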