bluenote-1577 / sylph

ultrafast taxonomic profiling and genome querying for metagenomic samples by abundance-corrected minhash.
MIT License

Sketch error: different number of paired sequences #10

Closed: naturepoker closed this issue 6 months ago

naturepoker commented 6 months ago

Hi,

I've been trying to run the sylph sketch command on a forward and reverse short-read set using the command:

sylph sketch -1 4914_4_cat_R*

The read files are named:

4914_4_cat_R1.fastq.gz 4914_4_cat_R2.fastq.gz

The command fails with output:

2024-05-08T18:01:26.964Z ERROR [sylph::sketch] Different number of paired sequences. Exiting.

I double-checked the short-read sets, and the number of sequences in the R1 and R2 fastq files is the same: 4,078,148, as confirmed by both seqkit stats and manual awk screening of the read files.
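
For reference, a manual count of this kind can be sketched as below. This is only an illustration, not the exact check used here: it assumes standard four-line FASTQ records, and the demo_R1/demo_R2 files and the count_reads helper are hypothetical stand-ins created just for the example.

```shell
# Build two tiny gzipped FASTQ files as hypothetical stand-ins for real reads.
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip -c > "$tmpdir/demo_R1.fastq.gz"
printf '@r1\nACGT\n+\nIIII\n@r2\nTCAA\n+\nIIII\n' | gzip -c > "$tmpdir/demo_R2.fastq.gz"

# Count reads as (number of lines) / 4; assumes four-line FASTQ records.
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

n1=$(count_reads "$tmpdir/demo_R1.fastq.gz")
n2=$(count_reads "$tmpdir/demo_R2.fastq.gz")
echo "R1: $n1 reads, R2: $n2 reads"
```

Matching counts rule out truncated or mismatched read files, which points the search toward how the files are passed on the command line.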

Any clue on what I could do here would be appreciated.

Thank you!

bluenote-1577 commented 6 months ago

hi @naturepoker,

Both the -1 and -2 options are needed, so you should run this command instead:

sylph sketch -1 4914_4_cat_R1.fastq.gz -2 4914_4_cat_R2.fastq.gz

Let me know if it works for you.

Jim
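
A likely reason the original command failed is that the shell expands the R* glob before sylph sees it, so both files end up after -1 and nothing is passed to -2. The sketch below shows the expansion with empty placeholder files and echo instead of a real sylph run. How sylph treats two files after -1 is an assumption drawn from the error message, not something confirmed in this thread.

```shell
# Create empty placeholder files with the same names as the real reads.
tmpdir=$(mktemp -d)
touch "$tmpdir/4914_4_cat_R1.fastq.gz" "$tmpdir/4914_4_cat_R2.fastq.gz"

# echo prints the argument list exactly as sylph would receive it.
# The glob expands to both file names, so both follow -1 and -2 is never given.
expanded=$(cd "$tmpdir" && echo sylph sketch -1 4914_4_cat_R*)
echo "$expanded"

# Explicit form suggested in this thread: one file per flag.
explicit="sylph sketch -1 4914_4_cat_R1.fastq.gz -2 4914_4_cat_R2.fastq.gz"
echo "$explicit"
```

The first echo prints both file names after -1, which would leave the forward and reverse inputs unbalanced from the tool's point of view.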

naturepoker commented 6 months ago

Ah, of course. Now I feel silly. I was reading through the mouse reads example in the taxonomic profiling tutorial page, and thought -1 and -2 implied separate paired reads per number.

I tried the command as you proposed and it works perfectly. Closing the issue.

Thank you!