This PR merges in the following enhancements:
assign_tx_to_pas.py - script to assign transcripts to shared poly(A) site or last exon isoforms (closes #24)
tx_to_polya_quant.R - script that takes PAS / last exon assignments and Salmon quantifications and summarises transcript expression to poly(A) site expression. A separate conda environment for the R dependencies is also provided (envs/papa_r.yaml)
(Hackily) closes #16 - rows with an undefined strand (i.e. neither '+' nor '-') are removed from the GTF before it is passed to filter_tx_by_intron_chain.py
All changes from the update_filtering_plumbing branch - separate .smk file for filtering steps, 'condition' is now specified in the sample table, plus other minor changes
gene_names_from_tracking.py - script to extract 'reference gene names' for novel IDs from GffCompare '.tracking' files, i.e. the reference gene contributing to the merged gene locus that contains the novel isoforms. Note this is not currently hooked up to the pipeline.
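
For reviewers, the strand filtering for #16 can be pictured roughly as below. This is only a minimal sketch of the idea, not the code in this PR: the function name `drop_unstranded` is hypothetical, and it assumes a standard tab-separated GTF where column 7 holds the strand.

```python
VALID_STRANDS = {"+", "-"}


def drop_unstranded(lines):
    """Yield GTF lines, skipping feature rows whose strand is not '+' or '-'.

    Hypothetical helper for illustration only; the pipeline's actual
    filtering step may be implemented differently.
    """
    for line in lines:
        # Keep header / comment lines untouched
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF feature row has 9 tab-separated columns; index 6 is the strand
        if len(fields) >= 9 and fields[6] in VALID_STRANDS:
            yield line


if __name__ == "__main__":
    example = [
        "#!example header\n",
        'chr1\tStringTie\texon\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
        'chr1\tStringTie\texon\t200\t300\t.\t.\t.\tgene_id "g2"; transcript_id "t2";\n',
    ]
    for kept in drop_unstranded(example):
        print(kept, end="")
```

Working line by line keeps memory use flat for large GTFs, and rows with fewer than 9 columns are dropped as well because their strand cannot be read reliably.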