frattalab / PAPA

PAPA (Pipeline-Alternative Polyadenylation) - Snakemake pipeline for analysis of APA from short-read RNA-seq data
GNU General Public License v3.0

Script to filter GTF file based on attributes #29

Closed SamBryce-Smith closed 2 years ago

SamBryce-Smith commented 2 years ago

This PR adds filter_gtf.py, which filters a reference GTF file based on attribute values (e.g. 'gene type', 'tag values'). The underlying functions are flexible and handle the different comparison types (e.g. membership, equality, string contains), but the command-line interface is fairly restrictive. Currently only '1-level' filtering is possible via the CLI: for example, you can't filter for transcript types within certain gene biotypes, even though the functions are capable of handling this.
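To illustrate what '1-level' attribute filtering means, here is a minimal sketch in plain Python. It is not the code in filter_gtf.py: the function names (`parse_attributes`, `attribute_matches`, `filter_gtf_lines`), the `mode` argument, and the GENCODE-style attribute keys (`gene_type`, `tag`) are all assumptions made for the example.

```python
import re

# Hypothetical helpers for illustration only; not the functions in filter_gtf.py.
# Matches quoted attributes such as: gene_type "protein_coding";
# Unquoted attributes (e.g. level 2;) are ignored by this sketch.
_ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(attr_field):
    """Parse the 9th GTF column into {key: [values]}; keys such as 'tag' can repeat."""
    attrs = {}
    for key, value in _ATTR_RE.findall(attr_field):
        attrs.setdefault(key, []).append(value)
    return attrs


def attribute_matches(attrs, key, target, mode="in"):
    """Return True if any value stored under `key` passes the chosen comparison."""
    found = attrs.get(key, [])
    if mode == "in":        # membership: target is a collection of allowed values
        return any(v in target for v in found)
    if mode == "equals":    # equality: target is a single string
        return any(v == target for v in found)
    if mode == "contains":  # substring: target is a string to search for
        return any(target in v for v in found)
    raise ValueError(f"unknown mode: {mode}")


def filter_gtf_lines(lines, key, target, mode="in"):
    """Yield GTF record lines whose attributes pass a single ('1-level') filter."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip malformed records
        if attribute_matches(parse_attributes(fields[8]), key, target, mode):
            yield line


example = [
    "#comment\n",
    'chr1\tHAVANA\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; '
    'gene_type "protein_coding"; tag "basic"; tag "CCDS";\n',
    'chr1\tHAVANA\ttranscript\t1\t100\t.\t+\t.\tgene_id "G2"; transcript_id "T2"; '
    'gene_type "lncRNA"; tag "basic";\n',
]
coding = list(filter_gtf_lines(example, "gene_type", {"protein_coding"}))
```

In this sketch, a two-level filter (say, a `tag` condition restricted to certain gene types) could be approximated by passing the output of one `filter_gtf_lines` call into a second call, which is the kind of composition the CLI does not currently expose.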

The script could be generalised further, but this is not necessary for it to function in the pipeline.

Default options have been added to config.yaml, but they have not been hooked up to the pipeline yet.