frattalab / PAPA

PAPA (Pipeline-Alternative Polyadenylation) - Snakemake pipeline for analysis of APA from short-read RNA-seq data
GNU General Public License v3.0

Make sure gene_name is propagated throughout novel last exons GTFs #30

Closed by SamBryce-Smith 1 year ago

SamBryce-Smith commented 2 years ago

This probably means modifying read_gtf_specific so that it also extracts 'gene_name' by default.
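For illustration only, here is a minimal sketch of what "extracting gene_name by default" could look like when parsing a GTF attribute field. The helper `extract_attributes`, its default key list, and the example identifiers are all hypothetical; this is not the actual read_gtf_specific implementation.

```python
import re

# Matches GTF-style key/value pairs such as: gene_name "GENE1";
ATTR_PATTERN = re.compile(r'(\S+) "([^"]*)"')


def extract_attributes(attribute_field,
                       keys=("gene_id", "transcript_id", "gene_name")):
    """Hypothetical helper: return only the requested attributes.

    'gene_name' is in the default key list, so callers get it
    without having to ask for it. Missing keys map to None.
    """
    found = dict(ATTR_PATTERN.findall(attribute_field))
    return {key: found.get(key) for key in keys}


# Made-up attribute string in the usual GTF column 9 layout
example = 'gene_id "ENSG_EXAMPLE"; transcript_id "ENST_EXAMPLE"; gene_name "GENE1";'
record = extract_attributes(example)
```

With a default like this, a record that lacks gene_name still gets the key (set to None), so downstream code can rely on the column being present.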

SamBryce-Smith commented 1 year ago

read_gtf_specific already extracts gene_name by default. The problem is in get_novel_last_exons.py, which drops the gene_name attribute from the reference PyRanges object when reducing the number of columns. The simple fix is to add 'gene_name' to these column lists.
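A rough sketch of the failure mode, using plain dicts as a stand-in for the PyRanges object so it runs without extra dependencies. The column lists, the `keep_columns` helper, and the example row are assumptions made for illustration; the real lists in get_novel_last_exons.py may differ.

```python
# Hypothetical stand-in for the reference annotation: one dict per exon
reference_exons = [
    {
        "Chromosome": "chr1", "Start": 100, "End": 200, "Strand": "+",
        "Feature": "exon", "gene_id": "G1", "gene_name": "GENE1",
        "transcript_id": "T1", "exon_number": "1",
    },
]


def keep_columns(rows, columns):
    """Keep only the listed columns in every row (hypothetical helper)."""
    return [{col: row[col] for col in columns if col in row} for row in rows]


# A column list that omits gene_name silently discards it...
cols_without_name = ["Chromosome", "Start", "End", "Strand",
                     "gene_id", "transcript_id"]
dropped = keep_columns(reference_exons, cols_without_name)

# ...while adding 'gene_name' to the list carries it through.
cols_with_name = cols_without_name + ["gene_name"]
kept = keep_columns(reference_exons, cols_with_name)
```

Because the subsetting step raises no error when a column is left out, the missing attribute only shows up later, in the novel last exon GTFs.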

SamBryce-Smith commented 1 year ago

Resolved as of 017e851. Was a little more involved than expected...