NBISweden / AGAT

Another Gtf/Gff Analysis Toolkit
GNU General Public License v3.0

exon and UTR checks without fasta #361

Closed alexvasilikop closed 1 year ago

alexvasilikop commented 1 year ago

Describe the bug: Hello, I have a question about the checks AGAT performs when no FASTA file is provided. I am using AGAT version 0.9.2.

General (please complete the following information): Installed with Anaconda.

To Reproduce: I am trying to remove some mRNAs with BLAST hits to TEs from a newly annotated genome, so I gave AGAT a list of the IDs to remove:

agat_sp_filter_feature_from_kill_list.pl -f species.gff3 --kl IDS_toremove.txt -o species.filtered_TEs.gff3

It seems that AGAT is also running some checks and fixing some UTRs and exons. How is it able to do this without a FASTA file? What kind of fixes does it perform in this case?

----------------------------- Check9: check exons ------------------------------
No exons created
940 exons locations modified that were wrong
No supernumerary exons removed
No level2 locations modified
------------------------------ done in 5 seconds -------------------------------

----------------------------- Check10: check utrs ------------------------------
154 UTRs created that were missing
805 UTRs locations modified that were wrong
No supernumerary UTRs removed
------------------------------ done in 5 seconds -------------------------------

------------------------ Check11: all level2 locations -------------------------
No problem found
------------------------------ done in 7 seconds -------------------------------

------------------------ Check12: all level1 locations -------------------------
No problem found
------------------------------ done in 0 seconds -------------------------------

---------------------- Check13: remove identical isoforms ----------------------
Lets remove isoform FUN_013069-T2
Lets remove isoform FUN_026356-T2
2 identical isoforms removed

...

Many thanks, Alex

Juke34 commented 1 year ago

Some of these behaviors can be stopped via the config file with AGAT >= v1.0.0.

The checks: if exons overlap or are adjacent (within the same mRNA/isoform), they are merged together. If CDS/UTR/stop_codon... features (feature types that are expected to lie within an exon, see level3 in feature_levels.yaml) are described but are not included in a corresponding exon, existing exons are extended or new exons are created.

Something similar happens for UTRs, but UTR creation is slightly different: UTRs can be deduced only if the file contains both exon and CDS features. Based on that information, AGAT deduces the UTRs. If they are missing, it creates them; if they already exist, it verifies their locations.
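To illustrate why no FASTA is needed for these checks, here is a minimal Python sketch of the coordinate logic described above: merging overlapping or adjacent exons, and deducing UTRs as the exon segments not covered by CDS. This is not AGAT's code (AGAT is written in Perl); the function names `merge_exons` and `derive_utrs` are hypothetical, coordinates are assumed to be 1-based and inclusive as in GFF, and the sketch does not label UTRs as 5' or 3', which would additionally require the strand.

```python
from typing import List, Tuple

# (start, end), 1-based and inclusive, as in GFF coordinates.
Interval = Tuple[int, int]


def merge_exons(exons: List[Interval]) -> List[Interval]:
    """Merge exons of one isoform that overlap or are directly adjacent."""
    merged: List[Interval] = []
    for start, end in sorted(exons):
        # Adjacent means the new exon starts right after the previous one ends.
        if merged and start <= merged[-1][1] + 1:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def derive_utrs(exons: List[Interval], cds: List[Interval]) -> List[Interval]:
    """Return the exon segments that are not covered by any CDS segment."""
    cds_sorted = sorted(cds)
    utrs: List[Interval] = []
    for ex_start, ex_end in merge_exons(exons):
        cursor = ex_start  # first exon position not yet assigned
        for c_start, c_end in cds_sorted:
            if c_end < cursor or c_start > ex_end:
                continue  # this CDS segment does not touch the remaining exon part
            if c_start > cursor:
                utrs.append((cursor, c_start - 1))  # uncovered stretch before the CDS
            cursor = max(cursor, c_end + 1)
        if cursor <= ex_end:
            utrs.append((cursor, ex_end))  # uncovered stretch after the last CDS
    return utrs


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    exons = [(100, 200), (201, 300), (500, 600)]
    cds = [(150, 300), (500, 550)]
    print(merge_exons(exons))
    print(derive_utrs(exons, cds))
```

Everything here works on feature coordinates alone, which is consistent with the point that UTRs can only be deduced when both exon and CDS features are present in the file.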

You could look at the log file to get more information about what has been done by AGAT.

alexvasilikop commented 1 year ago

Great, thanks.

Can I stop these modifications using a config file? How?

thanks

Juke34 commented 1 year ago

agat config --expose

Then change true to false, as here: https://github.com/NBISweden/AGAT/blob/3107f4d2cd60f469bcc9a1b8151219f0aa20491d/share/config.yaml#LL69C1-L69C18
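If you would rather script the edit than change the file by hand, a small Python sketch like the following could flip selected settings from true to false. The key names `check_exons` and `check_utrs` and the file name `agat_config.yaml` are assumptions on my part, not confirmed in this thread; check the exposed config file for the actual names before relying on this. The helper `disable_checks` is hypothetical and uses plain text matching so that comments in the file are preserved.

```python
import re
from typing import Iterable


def disable_checks(config_text: str, keys: Iterable[str]) -> str:
    """Set 'key: true' to 'key: false' for the given keys, leaving other lines untouched."""
    wanted = set(keys)
    pattern = re.compile(r"^(\s*)([A-Za-z0-9_]+)\s*:\s*true\b(.*)$")
    out = []
    for line in config_text.splitlines():
        match = pattern.match(line)
        if match and match.group(2) in wanted:
            indent, key, rest = match.groups()
            line = f"{indent}{key}: false{rest}"  # keep any trailing comment
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Assumed file name for the exposed config; adjust to what you actually have.
    path = "agat_config.yaml"
    with open(path, encoding="utf-8") as handle:
        original = handle.read()
    # Assumed key names for the exon and UTR checks.
    updated = disable_checks(original, ["check_exons", "check_utrs"])
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(updated)
```

Lines for keys that are not listed, or that are already false, are written back unchanged.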

alexvasilikop commented 1 year ago

Great, thanks.