W-L / deviaTE

Python tool for the analysis and visualization of mobile genetic elements
GNU General Public License v3.0

Idea for walkthrough: additional information on how to process paired reads #9

Closed AnnaMariaL closed 2 years ago

AnnaMariaL commented 2 years ago

Hi Lukas,

I have a minor comment:

I use DeviaTE to estimate TE copy numbers from already mapped BAM files. These BAM files contain paired reads that were mapped in single-read mode: the input to the mapping algorithm was one concatenated .fq file containing both mates of each pair (read1 and read2). As a result, the same read name occurs twice in these files.
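To check whether a concatenated FASTQ file is affected, one can count how often each read name appears. The following is only a minimal sketch, not part of DeviaTE: the function name `duplicate_read_names` is hypothetical, it assumes strict 4-line FASTQ records, and it takes the read name to be the header text up to the first whitespace.

```python
from collections import Counter
from itertools import islice


def duplicate_read_names(fastq_lines):
    """Return {read_name: count} for read names that occur more than once.

    Assumption: strict 4-line FASTQ records, so every 4th line is a header.
    """
    headers = islice(fastq_lines, 0, None, 4)
    # Drop the leading '@' and keep the name up to the first whitespace.
    counts = Counter(h[1:].split()[0] for h in headers if h.strip())
    return {name: n for name, n in counts.items() if n > 1}


# Tiny in-memory example: read "r1" appears once per mate, "r2" only once.
example = [
    "@r1 1:N:0\n", "ACGT\n", "+\n", "IIII\n",
    "@r2\n", "GGCC\n", "+\n", "IIII\n",
    "@r1 2:N:0\n", "TTAA\n", "+\n", "IIII\n",
]
print(duplicate_read_names(example))  # {'r1': 2}
```

For a real file, an open file handle can be passed in place of the list, since the function only iterates over lines.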

For some of these files DeviaTE works just fine, while for others I get an error message similar to the one already reported in the issues section:

... line 70, in
    fam_strand = seg.reference_name + '+'
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

I found that renaming the reads before mapping (read_1, read_2, ..., read_n) resolves the issue. Maybe this would be worth pointing out in the walkthrough? I can imagine that some users have paired-read data (e.g., for polymorphism scans) that they also use for TE analysis with DeviaTE.

Best, Anna

W-L commented 2 years ago

Hi Anna,

Thanks for getting in touch and for the helpful suggestion! I will add a note to the walkthrough, and I have also uploaded a small script that gives fastq reads unique names, in case that is useful for someone (example/rename_reads.py). At some point I should probably look into the cause of the problem and fix that instead.
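For illustration, renaming reads before mapping could look roughly like the sketch below. This is not the content of example/rename_reads.py; the function name `rename_reads` and the `prefix` parameter are hypothetical, and the code assumes strict 4-line FASTQ records. It follows the read_1, read_2, ..., read_n scheme mentioned above and resets each separator line to a bare "+", so that no old read name is left behind.

```python
def rename_reads(fastq_lines, prefix="read_"):
    """Yield FASTQ lines in which every record gets a unique header.

    Assumption: strict 4-line records (header, sequence, '+', qualities).
    """
    for i, line in enumerate(fastq_lines):
        position = i % 4
        if position == 0:
            # Header line: replace the name with prefix + 1-based record number.
            yield f"@{prefix}{i // 4 + 1}\n"
        elif position == 2:
            # Separator line: drop any repeated read name after the '+'.
            yield "+\n"
        else:
            # Sequence and quality lines are passed through unchanged.
            yield line


# Two mates that share one name, as in a concatenated read1/read2 file.
demo = [
    "@pairA\n", "ACGT\n", "+\n", "IIII\n",
    "@pairA\n", "TTAA\n", "+pairA\n", "IIII\n",
]
renamed = list(rename_reads(demo))
print("".join(renamed), end="")
```

Because the function is a generator over lines, it can be applied to an open input file and written out with `writelines` without loading the whole file into memory. Note that the original names, and with them the pairing information, are discarded.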

Hope everything is going well. Please let me know if there's anything else I can help with!

Lukas

W-L commented 2 years ago

740061d7c93987440cf5965f85465359e87a3ea1 e9e9cc3e575f9ff79e7ccc0ba91946b3dba82ae4