amplab / snap

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data
https://www.microsoft.com/en-us/research/project/snap/
Apache License 2.0

The two FASTQ files are not the same size! #134

Closed tylerjkennedy closed 2 years ago

tylerjkennedy commented 2 years ago

Hello,

I am trying to use SNAP on my PE seq data after trimming for adapters and base quality. I'm running into an error about the file sizes:

Welcome to SNAP version 0.15.4.
Loading index from directory... 2s. 100286401 bases, seed size 20
ConfDif MaxHits MaxDist MaxSeed ConfAd %Used %Unique %Multi %!Found %Error Reads/s
Traceback (most recent call last):
  File "/Users/millerlab/opt/miniconda3/bin/mimodd", line 7, in <module>
    main.parse()
  File "/Users/millerlab/opt/miniconda3/lib/python3.7/site-packages/MiModD/main.py", line 1200, in parse
    result = f(result)
  File "/Users/millerlab/opt/miniconda3/lib/python3.7/site-packages/MiModD/tmpfiles.py", line 26, in catch_sigterm_wrapper
    ret = f(*args, **kwargs)
  File "/Users/millerlab/opt/miniconda3/lib/python3.7/site-packages/MiModD/snap.py", line 843, in snap_batch
    snap_call(job_args)
  File "/Users/millerlab/opt/miniconda3/lib/python3.7/site-packages/MiModD/tmpfiles.py", line 26, in catch_sigterm_wrapper
    ret = f(*args, **kwargs)
  File "/Users/millerlab/opt/miniconda3/lib/python3.7/site-packages/MiModD/snap.py", line 340, in snap_call
    raise RuntimeError(msg)
RuntimeError: SNAP failed with:
The two FASTQ files are not the same size! Make sure they contain the same reads in the same order, without comments.
DEVTEAM: Allowing run, but you need to run on only one thread because the file splitter
Doesn't understand how to deal with files that don't match byte-for-byte.
PairedFASTQReader: failed to read mate. The FASTQ files may not match properly.

I've tried this after trimming with fastp and with Trimmomatic. If I take the base quality steps out of the Trimmomatic run, SNAP runs on the output files just fine, so I think the error comes from the quality trimming rather than the adapter trimming.

I checked that the number of reads in my two PE sequencing files is the same after running fastp, using:

echo $(zcat < filename.fastq.gz | wc -l)/4|bc

And the number of lines matches between the two paired files.
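Matching line counts show only that both files hold the same number of records. They do not show that the reads are in the same order, and they say nothing about byte size, which quality trimming changes whenever the two mates of a pair are trimmed to different lengths. A minimal Python sketch of a stricter check is below. It assumes plain four-line FASTQ records, and the function names `read_fastq` and `compare_pairs` are made up for illustration; they are not part of SNAP or MiModD.

```python
import gzip
from itertools import zip_longest


def read_fastq(path):
    """Yield (read_id, sequence) for each four-line FASTQ record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                return  # end of file
            if not header.strip():
                continue  # tolerate stray blank lines
            seq = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line
            handle.readline()  # quality line
            # Keep only the ID: drop '@', any comment after whitespace,
            # and a trailing /1 or /2 mate suffix.
            read_id = header[1:].split()[0]
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id, seq


def compare_pairs(path1, path2):
    """Return (pairs, id_mismatches, length_differences, unpaired)."""
    pairs = id_mismatches = length_differences = unpaired = 0
    for rec1, rec2 in zip_longest(read_fastq(path1), read_fastq(path2)):
        if rec1 is None or rec2 is None:
            unpaired += 1  # one file ran out of records first
            continue
        pairs += 1
        if rec1[0] != rec2[0]:
            id_mismatches += 1  # same position, different read
        if len(rec1[1]) != len(rec2[1]):
            length_differences += 1  # mates trimmed to different lengths
    return pairs, id_mismatches, length_differences, unpaired
```

Calling `compare_pairs("R1.fastq.gz", "R2.fastq.gz")` (hypothetical file names) on the trimmed output would be expected to report zero ID mismatches and zero unpaired records if the trimmer kept the pairs in sync. A nonzero count of length differences would mean the two files cannot match byte-for-byte, which appears to be what SNAP 0.15.4 is complaining about.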

Any idea on how I can correct the trimmed files so that SNAP will accept them?

Thank you,

Tyler

bolosky commented 2 years ago

You're running a really prehistoric version of SNAP. Get the current 1.0.4 version and you shouldn't have a problem.

You can find it on our website: SNAP - Scalable Nucleotide Alignment Program - Microsoft Research, https://www.microsoft.com/en-us/research/project/snap/

--Bill

tylerjkennedy commented 2 years ago

Ah, thank you. I am loading SNAP through another program (MiModD) and I guess that it is set to use the older version. I'll get the latest version of SNAP and run that separately.

Thank you for the help,

Tyler