NBISweden / AGAT

Another Gtf/Gff Analysis Toolkit
GNU General Public License v3.0

agat_sp_complement_annotations.pl couldn't identify the same l1 feature #406

Closed ChihYingLu closed 1 year ago

ChihYingLu commented 1 year ago

Hi,

Thank you for developing the tool. I am quite new to this field, so this might be a naive question; please help! I am trying to add gene models from one annotation (from reefgenomics.org, which contains more l1 features) to another (from NCBI). However, the agat_sp_complement_annotations.pl script could not recognize l1 features that share the same CDS. For example:

in --ref ncbi_ref.gff: Picture1

in --add reefgenomics.gff3: Screenshot 2023-10-22 124712

For SpisGene6, the CDS overlap in the same way in the two GFF files (so it is the same l1 feature). However, when I ran agat_sp_complement_annotations.pl, it treated the two as different and added the feature to the reference annotation. (In fact, all the l1 features in the two GFF files are identified as different.)

I noticed the contig names of the two GFF files are different. Could this be the reason why the l1 features with the same CDS are being identified as different? Thank you very much.

Juke34 commented 1 year ago

If you want to merge overlapping level1 features, you must use the agat_sp_merge_annotations.pl script instead. But yes, the sequence name must be the same for the features. The merge is not based on their name but rather on their location.
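
To illustrate why location-based matching fails when sequence names differ, here is a minimal Python sketch. It is not AGAT's code; the helper names (seqids, overlaps) and the sequence names ref_contig_1 and add_scaffold_1 are hypothetical, chosen only for the example. Two features with identical coordinates still do not overlap if column 1 (the sequence name) differs.

```python
def seqids(gff_text):
    """Collect the sequence names (GFF column 1) used by feature lines."""
    names = set()
    for line in gff_text.splitlines():
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t")[0])
    return names


def overlaps(a, b):
    """Return True if two (seqid, start, end) features overlap.

    Coordinates are 1-based and inclusive, as in GFF. Features on
    different sequences never overlap, whatever their coordinates.
    """
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


# Hypothetical one-line annotations: same coordinates, different names.
REF = "ref_contig_1\tsource_a\tgene\t100\t900\t.\t+\t.\tID=gene0\n"
ADD = "add_scaffold_1\tsource_b\tgene\t100\t900\t.\t+\t.\tID=SpisGene6\n"

# An empty intersection means no feature from one file can ever be
# matched by location to a feature in the other.
shared = seqids(REF) & seqids(ADD)
print("shared sequence names:", sorted(shared))
```

Running a check like this on both files before merging shows quickly whether the two annotations use the same naming scheme for contigs or scaffolds.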

ChihYingLu commented 1 year ago

Hi Juke34,

Thank you for the fast response! I ran the "merge" script but still got a similar result. I then fixed the sequence name problem: after making sure that the sequence names in the two files were the same, I ran the "merge" script again. This time it worked! Thank you very much for your help!
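
The thread does not show how the sequence names were fixed. One possible approach, sketched below in Python, is to rewrite column 1 of one file using a mapping from its names to the names used in the other file. The function rename_seqids, the mapping and the sample names are all hypothetical; a real mapping would have to come from knowing which contig corresponds to which.

```python
def rename_seqids(gff_text, mapping):
    """Return GFF text with sequence names replaced according to mapping.

    Names missing from the mapping are left unchanged. Only feature
    lines (column 1) and ##sequence-region directives are rewritten.
    """
    out = []
    for line in gff_text.splitlines():
        if line.startswith("##sequence-region"):
            parts = line.split()
            if len(parts) >= 2:
                parts[1] = mapping.get(parts[1], parts[1])
            out.append(" ".join(parts))
        elif not line.strip() or line.startswith("#"):
            # Other directives, comments and blank lines pass through.
            out.append(line)
        else:
            cols = line.split("\t")
            cols[0] = mapping.get(cols[0], cols[0])
            out.append("\t".join(cols))
    return "\n".join(out) + "\n"


# Hypothetical mapping from the names in the --add file to the names
# used in the reference file.
MAPPING = {"add_scaffold_1": "ref_contig_1"}

SAMPLE = (
    "##gff-version 3\n"
    "##sequence-region add_scaffold_1 1 5000\n"
    "add_scaffold_1\tsource_b\tgene\t100\t900\t.\t+\t.\tID=SpisGene6\n"
)

renamed = rename_seqids(SAMPLE, MAPPING)
print(renamed, end="")
```

The renamed text would be written to a new file and passed to the merge script in place of the original, so the untouched annotation stays available for comparison.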