LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0

readfish validate reference file name check #326

Closed speleonut closed 1 month ago

speleonut commented 5 months ago

Hi Matt & Alex,

There is a potential bug in the reference file check, which causes the following error message:

"readfish.validate Provided index file appears to be of an incorrect type - should be one of ['.fasta', '.fna', '.fsa', '.fa', '.fastq', '.fq', '.fasta.gz', '.fna.gz', '.fsa.gz', '.fa.gz', '.fastq.gz', '.fq.gz', '.mmi']"

If the filename contains an extra "." before the extension, the reference is marked as the incorrect type even when it is correct. For example:

GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.mmi causes the error.
GCA_000001405_15_GRCh38_no_alt_analysis_set.mmi does not.

A simple workaround is to create a soft link with an acceptable file name.

Perhaps the check in src/readfish/plugins/_mappy.py (lines 90 - 96) could test only the suffix at position [-1] first and, if that fails, test only the suffixes at positions [-2:] for any given index file. That would solve this problem. My Python skills are limited or I would just suggest a patch...

Cheers,
Mark
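The suggestion above could be sketched roughly as follows. This is only an illustration of the "last suffix, then last two suffixes" idea, not the code in _mappy.py and not the change made in #330. The names ALLOWED_SUFFIXES and has_valid_reference_suffix are hypothetical, and the suffix list is copied from the error message.

```python
from pathlib import Path

# Hypothetical constant: the extensions listed in the error message above.
ALLOWED_SUFFIXES = {
    ".fasta", ".fna", ".fsa", ".fa", ".fastq", ".fq",
    ".fasta.gz", ".fna.gz", ".fsa.gz", ".fa.gz", ".fastq.gz", ".fq.gz",
    ".mmi",
}


def has_valid_reference_suffix(filename: str) -> bool:
    """Return True if the file name ends in one of the allowed extensions.

    Only the trailing suffixes are inspected, so extra dots earlier in
    the name (e.g. an accession version such as "GCA_000001405.15") do
    not affect the result.
    """
    suffixes = Path(filename).suffixes
    # Try the last suffix on its own (".mmi", ".fa"), then the last two
    # joined together (".fa.gz", ".fastq.gz").
    return any("".join(suffixes[-n:]) in ALLOWED_SUFFIXES for n in (1, 2))


if __name__ == "__main__":
    for name in (
        "GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.mmi",
        "GCA_000001405_15_GRCh38_no_alt_analysis_set.mmi",
        "GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz",
        "reference.txt",
    ):
        print(name, has_valid_reference_suffix(name))
```

With this approach both file names from the report pass the check, and a name with an unsupported final extension such as reference.txt is still rejected.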

github-actions[bot] commented 5 months ago

Thank you for your issue. Give us a little time to review it.

PS. You might want to check the FAQ if you haven't done so already.

This is an automated reply, generated by FAQtory

mattloose commented 5 months ago

Thanks for highlighting this - it is a good point and we will address it in a future release.

Adoni5 commented 1 month ago

Fixed in #330