NBISweden / AGAT

Another Gtf/Gff Analysis Toolkit
GNU General Public License v3.0

The script 'agat_convert_sp_gxf2gxf.pl' has no '--ct' parameter, but the docs show '--ct' #306

Closed z626093820 closed 1 year ago

z626093820 commented 1 year ago

[image] Could you help me? Thank you!

Juke34 commented 1 year ago

Hi. Right, since v1.0.0 AGAT uses a config file.

agat config --expose

Then add the common tag you want at the proper field in the config.yaml file.

z626093820 commented 1 year ago

[image] If I want to solve the above problem about 'feature type missing and no Parent/gene_id', how should I change the config file below? It is not clear to me. Thank you! [image]

z626093820 commented 1 year ago

[image] This is my GFF3 file, thank you!

Juke34 commented 1 year ago

It sounds like this type of file has no clear feature to group several mRNAs under a unique gene. In your case you can use the common_tag/locus_tag parameter (I need to update the help to unify the use of those terms) and add Name:

locus_tag:
   - locus_tag
   - gene_id
   - Name

It should work, but I guess mRNA Ibat.Brg.01A_G000020.1 and Ibat.Brg.01A_G000020.3 are supposed to be isoforms and should have the same parent gene feature Ibat.Brg.01A_G000020, right? In that case it is trickier. You need to tell AGAT how to link those features together:

# First duplicate the Name attribute of the mRNA into a Parent attribute:
 agat_sq_manage_attributes.pl -gff file.gff  -att ID/Parent -p mRNA --cp --overwrite -o output.gff

# Then find a way to clip the .1 .2 .3 suffix of the Parent attribute of the mRNA feature using e.g. an awk command

# Then you are ready to use any AGAT script

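
For the suffix-clipping step, here is a minimal Python sketch as an alternative to awk. It is not part of AGAT. The function name `clip_parent_suffix` is hypothetical, and it assumes transcript identifiers end in a dot followed by digits (as in Ibat.Brg.01A_G000020.1), that each mRNA has a single Parent value, and that the file is tab-separated GFF3 with nine columns.

```python
import re

# Assumption: isoform suffixes look like ".1", ".2", ".3" at the end of the value.
SUFFIX = re.compile(r"\.\d+$")


def clip_parent_suffix(line: str, feature_type: str = "mRNA") -> str:
    """Return the GFF3 line with the trailing .N removed from Parent.

    Only rows whose type column equals feature_type are modified;
    comments, malformed rows and other feature types are returned unchanged.
    """
    if line.startswith("#"):
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9 or cols[2] != feature_type:
        return line
    attrs = []
    for attr in cols[8].split(";"):
        key, sep, value = attr.partition("=")
        if key.strip() == "Parent":
            value = SUFFIX.sub("", value)
        attrs.append(f"{key}{sep}{value}")
    cols[8] = ";".join(attrs)
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(cols) + newline


# Example usage (file names are placeholders):
# with open("output.gff") as src, open("clipped.gff", "w") as dst:
#     for row in src:
#         dst.write(clip_parent_suffix(row))
```

Only the Parent attribute of mRNA rows is touched, so the ID of each transcript keeps its suffix and the isoforms stay distinguishable.
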
z626093820 commented 1 year ago

Yes, Ibat.Brg.01A_G000020 is the parent of Ibat.Brg.01A_G000020.1 and Ibat.Brg.01A_G000020.3, but the GFF file does not contain the parent Ibat.Brg.01A_G000020, so I want to add it. Thank you! I will try!

z626093820 commented 1 year ago

Hello, I have another question. The script 'agat_convert_sp_gxf2gxf.pl' merges some CDS and UTR features located on different strands (positive and negative). Is this correct? Is it not necessary to take the strand into account? It is not clear to me. Thank you! [image]

Juke34 commented 1 year ago

Hmm, when applying this merging step AGAT works only transcript by transcript, which means that within a single transcript there are subfeatures (CDS and UTR) with a mix of strands. This is not supposed to happen (a mix of strands within a transcript), so maybe the previous step didn't operate as expected. Transcripts from different genes (with different strands) have been merged because they share the same Parent (previously Name without suffix).
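
One way to check for this before running AGAT is to list every Parent value that is shared by features on both strands. The following is a rough Python sketch, not an AGAT feature. The function name `parents_with_mixed_strand` is hypothetical, and it assumes a tab-separated GFF3 file with nine columns in which the strand is the seventh column.

```python
from collections import defaultdict


def parents_with_mixed_strand(lines):
    """Map each Parent value to its strands, keeping only parents seen on both + and -."""
    strands = defaultdict(set)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip malformed rows
        strand = cols[6]
        for attr in cols[8].split(";"):
            key, _, value = attr.partition("=")
            if key.strip() == "Parent":
                # GFF3 allows several comma-separated parents
                for parent in value.split(","):
                    strands[parent].add(strand)
    return {p: s for p, s in strands.items() if {"+", "-"} <= s}


# Example usage (file name is a placeholder):
# with open("clipped.gff") as handle:
#     for parent, found in parents_with_mixed_strand(handle).items():
#         print(parent, sorted(found))
```

Any parent reported here groups features from both strands, which would point to the Parent values needing to be fixed before the file is passed to AGAT.
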