Open pansapiens opened 4 years ago
https://github.com/MonashBioinformaticsPlatform/RNAsik-pipe/issues/45 should probably take priority here, since it would enable explicitly working around any of these types of pair detection issues.
Still occurs in RNAsik 1.5.4.
E.g., the input files `sampleA_1-WT-V-A_1.fq.gz` and `sampleA_1-WT-V-A_2.fq.gz` with the flags `-paired -pairIds _1,_2 -extn .fq.gz` fail with an error.

This is because when converting the `_1` filename into the `_2` filename to verify that the paired files exist, `string.replace` is used, but the substring `_1` occurs twice in the name of the first file of the pair.

One solution is to enforce that the pairId must be immediately before the extension, like this: https://github.com/pansapiens/RNAsik-pipe/commit/914f0297b578c5a6a20c37820d6a2688833f7117
(The side effect of this patch would be that for typical Illumina instrument output, e.g. `somereads_R1_001.fastq.gz` and `somereads_R2_001.fastq.gz`, you'd probably need to specify `-paired -pairIds _R1_001,_R2_001 -extn .fastq.gz`, or maybe `-paired -pairIds _R1,_R2 -extn _001.fastq.gz`; untested.)
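
To make the failure mode concrete, here is a minimal Python sketch of the two behaviours. It is not RNAsik's code and not the code from the linked commit; the function names `naive_mate` and `anchored_mate` are hypothetical and only illustrate "replace everywhere" versus "replace only immediately before the extension".

```python
def naive_mate(r1_name, r1_id, r2_id):
    """Replace every occurrence of the R1 pairId (the behaviour described above)."""
    return r1_name.replace(r1_id, r2_id)


def anchored_mate(r1_name, r1_id, r2_id, extn):
    """Swap the pairId only where it sits immediately before the extension.

    Hypothetical helper: raises ValueError if the name does not end
    with r1_id + extn.
    """
    suffix = r1_id + extn
    if not r1_name.endswith(suffix):
        raise ValueError(f"{r1_name!r} does not end with {suffix!r}")
    return r1_name[: -len(suffix)] + r2_id + extn


r1 = "sampleA_1-WT-V-A_1.fq.gz"

# Both "_1" substrings are replaced, so the derived R2 name does not exist.
print(naive_mate(r1, "_1", "_2"))               # sampleA_2-WT-V-A_2.fq.gz

# Only the trailing pairId is replaced.
print(anchored_mate(r1, "_1", "_2", ".fq.gz"))  # sampleA_1-WT-V-A_2.fq.gz

# Illumina-style names, assuming the R2 id carries the same leading
# underscore as the filename: both flag combinations resolve to the R2 file.
illumina = "somereads_R1_001.fastq.gz"
print(anchored_mate(illumina, "_R1_001", "_R2_001", ".fastq.gz"))
print(anchored_mate(illumina, "_R1", "_R2", "_001.fastq.gz"))
```

Under this suffix-anchored rule, `-pairIds _R1,_R2 -extn .fastq.gz` would no longer match the Illumina names, because `_R1` is not directly before `.fastq.gz`; in the sketch that case raises `ValueError`.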