Magdoll / cDNA_Cupcake

Miscellaneous collection of Python and R scripts for processing Iso-Seq data
BSD 3-Clause Clear License
257 stars, 104 forks

cluster id cleaned in get_abundance_post_collapse #233

Closed ssutharzan closed 2 years ago

ssutharzan commented 2 years ago

Fixes #229.

Change proposed in this request: Clean the cluster ids read from cluster_report.csv by removing all characters that precede `transcript/`.
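As a rough illustration of the cleaning step described above (not the actual diff in this pull request), a minimal sketch might look like the following. The function name `clean_cluster_id` and the example prefix are hypothetical; the only behavior taken from the description is that everything before `transcript/` is dropped.

```python
def clean_cluster_id(cluster_id, marker="transcript/"):
    """Drop every character that precedes the first occurrence of `marker`.

    Hypothetical helper for illustration only; ids that do not contain
    the marker are returned unchanged.
    """
    idx = cluster_id.find(marker)
    if idx == -1:
        # Marker not present: leave the id as it is.
        return cluster_id
    return cluster_id[idx:]


if __name__ == "__main__":
    # The prefix "some_prefix_" is made up for this example.
    for raw in ["some_prefix_transcript/0", "transcript/12"]:
        print(raw, "->", clean_cluster_id(raw))
```

Returning ids without the marker unchanged is a design choice made for this sketch, so that already-clean reports pass through untouched.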

Testing: Tested using the data in the test_data folder. The test.collapsed.abundance.txt produced with this change was identical to the test.collapsed.abundance.txt produced by the current version of the script when cluster_report.csv was first cleaned with sed. The results in the test_data folder seem to come from an older version and do not exactly match the output of the current version of the script.
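For anyone who wants to repeat the comparison, one possible way to check that two abundance files are byte-for-byte identical is sketched below. The helper name `outputs_identical` and the two paths in the usage example are assumptions for illustration; they are not part of the repository.

```python
import filecmp
import os


def outputs_identical(path_a, path_b):
    """Return True if the two files have exactly the same contents.

    shallow=False forces a content comparison instead of relying on
    file size and modification time alone.
    """
    return filecmp.cmp(path_a, path_b, shallow=False)


if __name__ == "__main__":
    # Hypothetical locations of the two outputs being compared.
    new_output = "with_patch/test.collapsed.abundance.txt"
    sed_output = "with_sed/test.collapsed.abundance.txt"
    if os.path.exists(new_output) and os.path.exists(sed_output):
        print("identical:", outputs_identical(new_output, sed_output))
    else:
        print("example paths not found; adjust them to your own outputs")
```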

Contribution: I am making this pull request as part of a work study at Cincinnati Children's Hospital Medical Center (CCHMC). © CCHMC 2022

@Magdoll Thank you for sharing these wonderful scripts!