cnobles / iGUIDE

Bioinformatic pipeline for identifying dsDNA breaks by marker-based incorporation, such as breaks induced by designer nucleases like Cas9.
https://iguide.readthedocs.io/en/latest/
GNU General Public License v3.0

Processing fails for SRA read IDs #43

Closed ressy closed 5 years ago

ressy commented 5 years ago

The FASTQ read identifiers supplied by the SRA look like @SRR#######.#, but the default readNamePattern used in the demulti and filt rules takes only the portion before the period, which leads to errors from non-unique read IDs. Since the R scripts already allow custom patterns, exposing the pattern argument to the rules should allow this to work.
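
To illustrate the collision, here is a minimal Python sketch. Both regular expressions are assumptions chosen for illustration (they are not necessarily iGUIDE's actual default or a recommended setting), and the headers are made-up examples in the SRA style. It only shows how a pattern that stops at the period collapses distinct reads onto one ID, while a pattern that admits the period keeps them distinct.

```python
import re

# Hypothetical pattern that stops at the first period
# (an assumption, not necessarily iGUIDE's actual default).
PERIOD_STOPPING_PATTERN = r"[\w:\-+]+"

# Hypothetical alternative that also accepts periods inside the ID.
PERIOD_AWARE_PATTERN = r"[\w:\-+.]+"


def extract_ids(headers, pattern):
    """Return the first match of `pattern` in each FASTQ header line.

    The leading '@' of the header is dropped before matching.
    """
    regex = re.compile(pattern)
    return [regex.search(header.lstrip("@")).group(0) for header in headers]


# Made-up SRA-style headers for two different reads from the same run.
headers = [
    "@SRR0000001.1 1 length=150",
    "@SRR0000001.2 2 length=150",
]

short_ids = extract_ids(headers, PERIOD_STOPPING_PATTERN)
full_ids = extract_ids(headers, PERIOD_AWARE_PATTERN)

print(short_ids)  # ['SRR0000001', 'SRR0000001']
print(full_ids)   # ['SRR0000001.1', 'SRR0000001.2']

# Counting distinct IDs makes the collision explicit.
print(len(set(short_ids)), len(set(full_ids)))  # 1 2
```

Running a candidate pattern against a handful of real headers in this way, and checking that the number of distinct IDs equals the number of reads, is a quick sanity check before using the pattern in the pipeline.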

cnobles commented 5 years ago

Fixed in #50! The config parameter readNamePattern now accepts a regex, which changes the read name pattern used in several parts of the pipeline.