bergmanlab / ngs_te_mapper2

Software for detecting transposable element insertions from next-generation sequencing data
BSD 2-Clause "Simplified" License

Empty results #2

Closed jsitarka closed 3 years ago

jsitarka commented 3 years ago

Hi team,

I am using your program for my master's thesis work on the Arabidopsis genome, and both the .ref.bed and nonref.bed output files are always empty. Do the short reads need to be in a particular format to get valid results?

I ran these commands:

conda activate ngs_te_mapper2
python3 /path/to/source/code/ngs_te_mapper2/sourceCode/ngs_te_mapper2.py -o output -f SRR8397878_1.fastq -r TAIR10_chr_all.fasta -l RepeatMaskerPlants_nospace.fasta

Reads were downloaded from: https://www.ebi.ac.uk/ena/browser/view/SRR8397878

Reference fasta was downloaded from: https://www.arabidopsis.org/download/index-auto.jsp?dir=%2Fdownload_files%2FGenes%2FTAIR10_genome_release%2FTAIR10_chromosome_files

I used the library used by RepeatMasker, but I had to remove spaces and special characters (such as \, ? etc.) from the ID description lines because they caused problems when the intermediate directories were created. For example, for the line ">Gypsy-17 LBS-I LTR Gypsy" the corresponding folder was named only "Gypsy-17", but the program expected the full name.

The library has a structure like this:

>Gypsy-17_LBS-I_LTR_Gypsy
aaggtggacactgtgggaaccaacagcaacctggccggcgtaacagcaga
...
>BEL-154_AA-LTR_LTR_Pao
tgtctacgaccaacaaaacctacttatccctcattactctactggtgcaa
...
>Copia-1_DYa-I_LTR_Copia
ataggttatgggcccaggagtagtaaagactttaataattgtgtgtgatc
...
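The header cleanup described above could be done with a short Python script along these lines. This is only a sketch under assumptions: the function names are hypothetical, the set of allowed characters (letters, digits, dot, underscore, hyphen) is an illustrative choice rather than something ngs_te_mapper2 documents, and the input file name in the usage note is made up.

```python
import re


def sanitize_header(header: str) -> str:
    """Turn a FASTA header line into a single token usable as a directory name."""
    name = header.lstrip(">").strip()
    # Assumption: anything other than letters, digits, '.', '_' and '-' is unsafe.
    # Each run of such characters (spaces, '\', '?', ...) becomes one underscore.
    name = re.sub(r"[^A-Za-z0-9._-]+", "_", name)
    # Drop underscores left at either end of the name.
    return ">" + name.strip("_")


def sanitize_fasta(lines):
    """Yield FASTA lines with headers cleaned; sequence lines pass through unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        yield sanitize_header(line) if line.startswith(">") else line


def sanitize_file(src_path: str, dst_path: str) -> None:
    """Write a cleaned copy of the FASTA file at src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in sanitize_fasta(src):
            dst.write(line + "\n")
```

For example, sanitize_file("RepeatMaskerPlants.fasta", "RepeatMaskerPlants_nospace.fasta") would write the cleaned library, where the first file name is a placeholder for the original RepeatMasker library. Note that two different headers can collapse to the same cleaned name, so it is worth checking the result for duplicate IDs.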

Message from the log file:

04/23/2021 23:31:16: INFO: CMD: ../ngs_te_mapper2/sourceCode/ngs_te_mapper2.py -o output -f SRR8397878_1.fastq -r TAIR10_chr_all.fasta -l RepeatMaskerPlants_nospace.fasta
04/23/2021 23:31:16: INFO: Parsing input files...
04/24/2021 09:52:42: INFO: Start alignment...
04/24/2021 11:49:47: INFO: Alignment finished in 1 hours 57 minutes 4 seconds
04/24/2021 11:49:47: INFO: Detecting insertions...
04/24/2021 14:42:50: INFO: Insertion candidate search finished in 2 hours 53 minutes 2 seconds
04/24/2021 14:43:07: INFO: ngs_te_mapper finished in 15 hours 11 minutes 49 seconds
04/24/2021 14:43:07: INFO: Number of reference TEs: 0
04/24/2021 14:43:07: INFO: Number of non-reference TEs: 0

Am I doing something wrong? Thanks a lot for your advice :)

shunhuahan commented 3 years ago

Best, Shunhua

shunhuahan commented 3 years ago

Hi @jsitarka,

Shunhua

shunhuahan commented 3 years ago