odomlab2 / sci-rocket

Snakemake workflow for (pre-)processing sci-RNA-seq3 data
MIT License

Cached GTF or temp files? #30

Closed by gauravvaidya16 7 months ago

gauravvaidya16 commented 7 months ago

Hi Job,

I'm currently using sci-rocket for our non-model planarian species, with a GTF file containing 81k genes of various types, including protein-coding, lncRNA, and rRNA. Approximately 24k of these are protein-coding genes.

Initially, I ran sci-rocket on the unedited GTF file, which produced a CDS object with 81k features. To reduce this, I filtered the GTF file to keep only the protein-coding genes. However, even with the filtered file, the resulting CDS object still contains 81k features instead of the expected 24k.
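A minimal sketch of this kind of filter, useful for confirming that the filtered file really contains only the expected genes before suspecting the pipeline. It assumes the gene type is stored in a `gene_biotype` attribute with the value `protein_coding`; the attribute name, the function name `filter_protein_coding`, and the file paths are hypothetical and not taken from the actual annotation.

```python
import re


def filter_protein_coding(gtf_in, gtf_out, biotype_key="gene_biotype"):
    """Copy protein-coding records from gtf_in to gtf_out.

    Returns the set of gene IDs that were kept.
    biotype_key is an assumption: some annotations use "gene_type" instead.
    """
    # Match the key only at the start of the attribute column or after a ';'
    biotype_re = re.compile(rf'(?:^|;\s*){re.escape(biotype_key)} "([^"]+)"')
    gene_id_re = re.compile(r'(?:^|;\s*)gene_id "([^"]+)"')
    kept = set()
    with open(gtf_in) as src, open(gtf_out, "w") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)  # keep header/comment lines
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9:
                continue  # skip malformed records
            biotype = biotype_re.search(fields[8])
            if biotype is None or biotype.group(1) != "protein_coding":
                continue
            dst.write(line)
            gene_id = gene_id_re.search(fields[8])
            if gene_id:
                kept.add(gene_id.group(1))
    return kept
```

Calling `len(filter_protein_coding("genes.gtf", "genes.protein_coding.gtf"))` (hypothetical paths) should give roughly 24k for the annotation described here if the filter behaves as intended.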

I'm wondering whether the pipeline might be caching the GTF file or utilizing temporary files, potentially causing it to revert to the original unfiltered version. What do you think?
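One way to probe this hypothesis, as a rough sketch: list generated files that are older than the edited GTF, since those cannot have been rebuilt from it. This assumes the workflow decides what to rerun at least partly from file modification times, which is how Snakemake has traditionally behaved. The function name `outputs_older_than` and any paths passed to it are hypothetical; nothing here reflects sci-rocket's actual directory layout.

```python
from pathlib import Path


def outputs_older_than(input_path, output_dir):
    """Return files under output_dir last modified before input_path was.

    Such files predate the current input and so were not regenerated from it.
    """
    input_mtime = Path(input_path).stat().st_mtime
    return sorted(
        str(p)
        for p in Path(output_dir).rglob("*")
        if p.is_file() and p.stat().st_mtime < input_mtime
    )
```

If index or annotation files built from the original GTF show up in the result, deleting them or forcing a rerun of the rules that produce them would be the next thing to try.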

Thanks, Gaurav