frattalab / PAPA

PAPA (Pipeline-Alternative Polyadenylation) - Snakemake pipeline for analysis of APA from short-read RNA-seq data
GNU General Public License v3.0

filter_tx_by_intron_chain.py - Consider groupby.rank() for adding intron_numbers by transcript #9

Closed: SamBryce-Smith closed this issue 3 years ago

SamBryce-Smith commented 3 years ago

https://stackoverflow.com/questions/37997668/pandas-number-rows-within-group-in-increasing-order

https://stackoverflow.com/questions/33899369/ranking-order-per-group-in-pandas

This may be quicker than my custom apply approach. Currently, adding intron numbers takes ~4 mins for reference_introns and ~2 mins for novel introns.
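
For illustration, a minimal sketch of what the groupby.rank() idea could look like. This is not the code from the repository: the helper name `add_intron_number`, the column names `transcript_id`, `Start`, `End` and `Strand`, and the choice to number minus-strand introns from the highest `Start` coordinate are all assumptions made for the example.

```python
import pandas as pd


def add_intron_number(introns: pd.DataFrame,
                      id_col: str = "transcript_id",
                      out_col: str = "intron_number") -> pd.DataFrame:
    """Number introns 1..n within each transcript (hypothetical helper).

    Assumes columns `id_col`, 'Start' and 'Strand' exist and that
    'Start' has no missing values.
    """
    out = introns.copy()
    starts = out.groupby(id_col)["Start"]

    # Vectorised per-group ranks: 1 = leftmost (ascending) or rightmost (descending)
    asc_rank = starts.rank(method="first", ascending=True)
    desc_rank = starts.rank(method="first", ascending=False)

    # Assumption: minus-strand transcripts are numbered from the highest Start
    minus = out["Strand"] == "-"
    out[out_col] = asc_rank.where(~minus, desc_rank).astype(int)
    return out


# Toy data, invented for the example
introns = pd.DataFrame({
    "transcript_id": ["tx1", "tx1", "tx1", "tx2", "tx2"],
    "Strand": ["+", "+", "+", "-", "-"],
    "Start": [500, 100, 300, 900, 200],
    "End": [600, 200, 400, 1000, 300],
})

numbered = add_intron_number(introns)
print(numbered)
```

Because the ranks are computed in one grouped operation rather than by calling a Python function per transcript, this style tends to scale much better than a custom apply on large annotation tables.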

SamBryce-Smith commented 3 years ago

Intron numbers are now added to both the reference and novel introns in about 10 s, a massive speed improvement!

Closed with e740b19c0160454abe01554b3ccd6d2d4225b4f9