Closed klaas-men closed 3 years ago
Thanks for reporting this. I’ll take a look at it as I make updates to SNAP.
--Bill
From: klaas-men notifications@github.com Sent: Monday, April 16, 2018 6:18 AM To: amplab/snap snap@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [amplab/snap] Snap aligner FASTQ record larger than buffer (#113)
Hi developers of snap,
Introduction
I am a bioinformatician who works at Applied-Maths. I really like snap for mapping reads to bacterial genomes! I congratulate you guys on this most outstanding mapper.
Problem
I run my mapping jobs on a server. When I submit a job to the server, it gzips the FASTQ files for faster transfer. However, when I ran snap on the server, it gave an error on these fastq.gz files: "FASTQ record larger than buffer size at /../test_2.fastq.gz:4885027"
I suppose this error is similar to https://www.biostars.org/p/278787/
Solution
When I tried to replicate the mapping on my own local Linux system, however, I did not get any errors. To pinpoint the problem, I compared the FASTQ files on my own system with those on the server using md5sum and diff, and it turned out that my code had removed the newline at the end of the FASTQ file before compressing it with gzip and sending it to the server - my bad. When I re-added this newline at the end of the FASTQ file before compressing, the problem was gone.
I will change my own code, but I also think it would be good if snap accepted fastq.gz files without a newline at the end of the FASTQ.
Keep up the good work! Klaas Mensaert
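For anyone hitting the same thing, the workaround described above (making sure the FASTQ ends with a newline before it is gzipped) could be sketched roughly as follows. This is only an illustration, not part of SNAP: the function name gzip_fastq_with_trailing_newline and the file paths are hypothetical, and it assumes the uncompressed FASTQ is available on disk.

```python
import gzip


def gzip_fastq_with_trailing_newline(src_path, dst_path):
    """Gzip src_path into dst_path, adding a final newline if it is missing.

    Hypothetical helper for illustration; it does not call SNAP.
    """
    last_byte = b""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(1 << 20)  # copy in 1 MiB chunks
            if not chunk:
                break
            dst.write(chunk)
            last_byte = chunk[-1:]
        # Only append when the file is non-empty and lacks the final newline.
        if last_byte and last_byte != b"\n":
            dst.write(b"\n")


if __name__ == "__main__":
    # Example paths are placeholders.
    gzip_fastq_with_trailing_newline("test_2.fastq", "test_2.fastq.gz")
```

A file that already ends with a newline is compressed unchanged, so running this on every upload should be harmless.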
Any updates to this? I encounter the same error when I try to align nanopore reads. I'm also using gzipped fastq inputs, and I found some older issues online suggesting to decompress the files first, but the SNAP website says it supports gzipped fastq so I don't think this is the problem. For reference, I am using SNAP 1.0beta.18 for Linux (64-bit), and I get the following error:
Loading index from directory... 0s. 17414383 bases, seed size 22
Aligning.
FASTQ record larger than buffer size at /home/kchan/thesis/raw_data/SRR7690687.fastq.gz:8388608
SNAP exited with exit code 1 from line 255 of file SNAPLib/FASTQ.cpp
This is an issue with handling very long reads. If a single read is longer than the IO buffer, then this will happen.
You can fix this in one of two ways. Either define LONG_READS at the beginning of Read.h in SNAPLib (that is, remove the two slashes from the line //#define LONG_READS), or increase the maximum read size by updating MAX_READ_LENGTH just below it.
Either change will increase the amount of memory SNAP uses, because it allocates buffer space for the maximum read length (that’s also why it’s a compile-time option rather than a command-line one). But if you’re doing nanopore reads, you’ll need to, since they’re typically quite long.
--Bill
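Before rebuilding, it may help to know how long the longest read in the input actually is, so the compiled-in limit can be chosen with some headroom. The following is a rough sketch under stated assumptions, not something taken from SNAP: the function name longest_read_length is hypothetical, and it assumes a plain four-line-per-record FASTQ (sequence not wrapped across lines), either gzipped or uncompressed.

```python
import gzip


def longest_read_length(path):
    """Return (length, line_number) of the longest sequence line in a FASTQ.

    Assumes four lines per record, so the sequence is line 2 of each record.
    """
    opener = gzip.open if path.endswith(".gz") else open
    longest, at_line = 0, 0
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle, start=1):
            if line_number % 4 == 2:  # sequence line of the record
                length = len(line.rstrip("\r\n"))
                if length > longest:
                    longest, at_line = length, line_number
    return longest, at_line


if __name__ == "__main__":
    # Placeholder file name; substitute your own input.
    length, line_number = longest_read_length("reads.fastq.gz")
    print(f"longest read: {length} bases (line {line_number})")
```

The reported length can then be compared against whatever maximum read length the SNAP build was compiled with.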
From: Kevin Chan notifications@github.com Sent: Friday, December 21, 2018 5:04 PM To: amplab/snap snap@noreply.github.com Cc: Bill Bolosky bolosky@microsoft.com; Comment comment@noreply.github.com Subject: Re: [amplab/snap] Snap aligner FASTQ record larger than buffer (#113)