elzbth / jitterbug

Jitterbug is a bioinformatic software that predicts insertion sites of transposable elements in a sample sequenced by short paired-end reads with respect to an assembled reference.

Problem with Jitterbug filtering #20

Open brguez opened 4 years ago

brguez commented 4 years ago

Dear developers,

We are testing Jitterbug on a deeply sequenced sample (300x). The run appears to have completed correctly, since all the output files were generated. However, most of the insertions are lost in the filtering step (only 53 of 90295 pass the filtering).

Do you have any idea why this is happening? We would expect ~1200 germline MEIs in this sample.

Here are the commands we've used:

jitterbug.py --numCPUs 10 --mem --output_prefix "$output_dir"/"$sample_name" "$sample_bam" ./data/hg19.te_annots.gff

jitterbug_filter_results_func.py -g "$output_dir"/"$sample_name".TE_insertions_paired_clusters.gff3 -c "$output_dir"/"$sample_name".filter_config.txt -o "$output_dir"/"$sample_name".TE_insertions_paired_clusters.filtered.gff3

This is the log for the filtering step:

Total rows = 90295
Passed rows =53

Description of the non passed rows:
Problems with softclip support: 87780
Wrong interval size: 25129
Wrong span size: 11744
Inconsistency of TE family FWD and REV: 1242
Wrong cluster size: 0

Thanks! Berni
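The counts in the log above can be put in proportion with a few lines of Python. This is only a rough sketch for reading the numbers, not part of Jitterbug: the helper names `failure_fractions` and `criteria_overlap` are made up for illustration, and the values are copied by hand from the log. It shows what share of all rows each criterion flags, and checks whether the per-criterion counts add up to more than the number of rejected rows, which would mean that some rows fail several criteria at once.

```python
# Counts copied by hand from the filtering log in this thread.
TOTAL_ROWS = 90295
PASSED_ROWS = 53
FAILURES = {
    "Problems with softclip support": 87780,
    "Wrong interval size": 25129,
    "Wrong span size": 11744,
    "Inconsistency of TE family FWD and REV": 1242,
    "Wrong cluster size": 0,
}


def failure_fractions(failures, total):
    """Return the fraction of all rows flagged by each criterion."""
    return {name: count / total for name, count in failures.items()}


def criteria_overlap(failures, total, passed):
    """Return True if the per-criterion counts sum to more than the
    number of rejected rows, i.e. some rows failed several criteria."""
    return sum(failures.values()) > total - passed


if __name__ == "__main__":
    fractions = failure_fractions(FAILURES, TOTAL_ROWS)
    # Print criteria from most to least frequent.
    for name, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {frac:.1%}")
    print("Rows can fail several criteria:",
          criteria_overlap(FAILURES, TOTAL_ROWS, PASSED_ROWS))
```

Read this way, the softclip support criterion alone flags roughly 97% of all rows, so it looks like the first threshold worth checking in the filter config file, although the log by itself does not say why those rows fail.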

CPTPaso commented 3 years ago

Same problem on my end. Were you able to solve this issue?