mossconfuse closed this issue 1 year ago
You can try agat_sq_list_attributes.pl to list the attributes, then agat_sp_manage_attributes.pl to remove the attributes you do not want. The problem with the second script (sp prefix) is that it must keep the ID and Parent attributes, so you will not be allowed to remove them. Give it a try and see if it works. Otherwise I will implement something like agat_sq_filter_attributes.pl.
Thank you, Juke34, I could remove attributes with your advice.
Hi
I get recurring errors from cellranger-arc regarding my AGAT-generated GTF files. I think most of them stem from the formatting of the attributes column. 10X gives recommendations, though not a detailed solution to fix this:
"In all of the above cases, the reasons range from either duplicate/missing features or poorly formatted entries. To troubleshoot such issues, the following steps can be implemented using custom scripts:
- Recommended to retain only gene_id, transcript_ids, and gene_name attributes.
- Verify for any redundancy and order genes in the annotation file.
- Replace or remove the gene_ids that have empty values.
- Duplicate transcript_ids for multiple gene_id must be converted as unique (eg: unknown_transcript_1 fields)."
Could you please add a feature that would allow users to select which attributes are kept? It is easy enough to use grep to keep or discard the rows that cellranger ignores, but fixing the attributes column gets harder when different GTF files are combined and the order of attributes is not consistent.
A feature like
agat_keep_attr.pl "gene_id" "transcript_id" "gene_name"
would be great. Thanks.
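In the meantime, a stand-alone filter along these lines might do as a stopgap. This is only a minimal sketch in Python, not part of AGAT: the names `KEEP`, `filter_attributes` and `filter_file` are made up for illustration, and it assumes a tab-separated GTF whose ninth column uses the usual `key "value";` pairs. It keeps only the listed attributes, writes them in a fixed order, and drops attributes with empty values.

```python
import re

# Attributes to keep, in the order they are written out
# (taken from the 10X advice quoted above; adjust as needed).
KEEP = ("gene_id", "transcript_id", "gene_name")

# One GTF attribute: key "value"; or key value; (final ';' optional).
ATTR_RE = re.compile(r'\s*([^\s";]+)\s+(?:"([^"]*)"|([^\s";]+))\s*(?:;|$)')


def filter_attributes(line, keep=KEEP):
    """Return the GTF line with only the `keep` attributes, in `keep` order."""
    if line.startswith("#") or not line.strip():
        return line  # comment or blank line: leave untouched
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return line  # not a nine-column feature line: leave untouched
    found = {}
    for match in ATTR_RE.finditer(cols[8]):
        key = match.group(1)
        quoted, bare = match.group(2), match.group(3)
        value = quoted if quoted is not None else bare
        found.setdefault(key, value)  # first occurrence of a key wins
    # Attributes that are missing or have an empty value are skipped.
    cols[8] = " ".join(f'{key} "{found[key]}";' for key in keep if found.get(key))
    return "\t".join(cols) + "\n"


def filter_file(src, dst, keep=KEEP):
    """Copy GTF lines from the open handle `src` to `dst`, filtering attributes."""
    for line in src:
        dst.write(filter_attributes(line, keep))
```

It could be driven with something like `with open("in.gtf") as src, open("out.gtf", "w") as dst: filter_file(src, dst)`, where the file names are placeholders. It does not reorder genes or make duplicate transcript_ids unique, so those two recommendations would still need separate handling.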