elzbth / jitterbug

Jitterbug is a bioinformatics tool that predicts insertion sites of transposable elements in a sample sequenced with short paired-end reads, relative to an assembled reference.

Interpreting the results #23

Open PotapenkoEugene opened 2 years ago

PotapenkoEugene commented 2 years ago

Hi!

Thank you for such a useful tool! Could you help me interpret the results?

This is one line of my GFF3 output:

chr5H jitterbug TE_insertion 443642706 443642794 . . . supporting_fwd_reads=7; supporting_rev_reads=6; cluster_pair_ID=7690; lib=None; Inserted_TE_tags_fwd=RLX_unknown_HvMo_chr_4H-37161, RLG_unknown_HvMo_chr_4H-97079, RLX_unknown_HvMo_chr_2H-166486, RLX_unknown_HvMo_chr_7H-165987, RLX_unknown_HvMo_chr_0-7571, RLX_unknown_HvMo_chr_4H-84096; Inserted_TE_tags_rev=RLX_unknown_HvMo_chr_7H-124479, RLX_unknown_HvMo_chr_5H-52963, RLX_unknown_HvMo_chr_6H-54046, RLG_Sukkula_HvMo_chr_2H-199, RLX_unknown_HvMo_chr_1H-54792, RLG_unknown_HvMo_chr_2H-74651, RLX_unknown_HvMo_chr_7H-42435; fwd_cluster_span=33; rev_cluster_span=61; softclipped_pos=(-1, -1); softclipped_support=0; het_core_reads=-1; zygosity=-1.000

1) I couldn't find the meaning of a negative zygosity value in the documentation or the article. Could this be a bug?

2) In my work I need to compare ~300 genomes by TE copy number variation. How can I parse the GFF file for this task? And how should I interpret the multiple comma-separated annotations listed in Inserted_TE_tags_fwd and Inserted_TE_tags_rev?

3) How can I take into account TEs present in the reference but absent in my samples?
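
To make question 2 concrete, here is a minimal Python sketch of the kind of parsing I have in mind. It is only an illustration under my own assumptions, not something taken from Jitterbug: the function names (parse_jitterbug_line, te_tags, superfamily_counts) are hypothetical, the example record is an abridged copy of the line above, and I am assuming that the text before the first underscore of each tag (RLX, RLG) is a usable grouping key. It does not try to decide what the comma-separated tags mean; it only collects them.

```python
from collections import Counter

# Abridged copy of the record shown above (only some attributes and tags kept).
EXAMPLE = (
    "chr5H\tjitterbug\tTE_insertion\t443642706\t443642794\t.\t.\t.\t"
    "supporting_fwd_reads=7; supporting_rev_reads=6; "
    "Inserted_TE_tags_fwd=RLX_unknown_HvMo_chr_4H-37161, RLG_unknown_HvMo_chr_4H-97079; "
    "Inserted_TE_tags_rev=RLG_Sukkula_HvMo_chr_2H-199; "
    "softclipped_pos=(-1, -1); zygosity=-1.000"
)


def parse_jitterbug_line(line):
    """Split one record into coordinates plus a dict of its attributes."""
    # Split on any whitespace at most 8 times, so the ninth field keeps the
    # whole attribute string, including the spaces inside it.
    cols = line.rstrip("\n").split(None, 8)
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    seqid, _source, _ftype, start, end, _score, _strand, _phase, attr_text = cols

    attrs = {}
    # Attributes are separated by ';'. Commas only appear inside values,
    # e.g. in the tag lists and in softclipped_pos=(-1, -1).
    for item in attr_text.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        attrs[key.strip()] = value.strip()

    return {"seqid": seqid, "start": int(start), "end": int(end), "attrs": attrs}


def te_tags(record):
    """Return all forward and reverse TE tags of a record as one list."""
    tags = []
    for key in ("Inserted_TE_tags_fwd", "Inserted_TE_tags_rev"):
        value = record["attrs"].get(key, "")
        tags.extend(t.strip() for t in value.split(",") if t.strip())
    return tags


def superfamily_counts(record):
    """Count tags by the text before the first underscore (my assumption)."""
    return Counter(tag.split("_", 1)[0] for tag in te_tags(record))


if __name__ == "__main__":
    rec = parse_jitterbug_line(EXAMPLE)
    print(rec["seqid"], rec["start"], rec["end"], rec["attrs"]["zygosity"])
    print(superfamily_counts(rec))
```

For ~300 samples I would run something like this over every record of every file and build a per-sample table, but I am not sure whether counting tags this way is the right reading of the output.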

I will be extremely grateful for any help! Thank you again for the tool!