snakemake-workflows / rna-seq-kallisto-sleuth

A Snakemake workflow for differential expression analysis of RNA-seq data with Kallisto and Sleuth.
MIT License

fix: add more logging statements to `sleuth-diffexp.R`, add QuantSeq testing data to get QuantSeq tests to pass #86

Closed dlaehnemann closed 9 months ago

dlaehnemann commented 9 months ago

The underlying problem that we identified while working on this debug-vroom branch was a malformed custom file specifying the canonical transcripts to use in sleuth-diffexp.R. The take-away messages were:

  1. The sleuth-diffexp.R script with its large write_results() function was hard to debug. To ease that burden in the future, we should probably keep the extra logging statements we added.
  2. The datavzrd/diffexp-template.yaml does not seem to play nicely with custom canonical transcript files. The canonical column does not make sense in the genes_aggregated results table: that table should only contain gene names / identifiers and no transcript identifiers, and only transcripts can be canonical or not. So we remove the column there. However, how to provide a canonical column in the genes_representative case with a custom canonical transcript file still needs to be solved before merging this PR.
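
As a rough illustration of both points (fail early on a malformed custom file, and log each step so failures are easy to locate), here is a minimal Python sketch. It is not the workflow's code: sleuth-diffexp.R is an R script, and the actual layout of the custom canonical transcript file is not described in this thread. The tab-separated format, the column names `transcript_id` and `canonical`, the accepted flag values, and the transcript IDs in the usage example are all assumptions made up for this example.

```python
import csv
import io
import logging

logger = logging.getLogger("canonical_check")

# Hypothetical column names and flag values; the real file layout
# is not specified in this thread.
REQUIRED_COLUMNS = ("transcript_id", "canonical")
TRUE_VALUES = {"true", "1", "yes"}
FALSE_VALUES = {"false", "0", "no"}


def read_canonical_table(handle):
    """Parse a tab-separated canonical-transcript table, failing early on malformed input."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    logger.info("header columns: %s", header)

    # Check the header before touching any rows, so a wrong layout
    # produces one clear error instead of a failure further downstream.
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required column(s): {', '.join(missing)}")

    canonical = {}
    for row in reader:
        raw = row["canonical"]
        flag = (raw or "").strip().lower()
        if flag in TRUE_VALUES:
            value = True
        elif flag in FALSE_VALUES:
            value = False
        else:
            # reader.line_num is the number of the line just read.
            raise ValueError(
                f"line {reader.line_num}: cannot interpret canonical value {raw!r}"
            )
        canonical[row["transcript_id"]] = value

    logger.info("parsed %d transcript rows", len(canonical))
    return canonical


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Made-up placeholder IDs, for illustration only.
    example = io.StringIO("transcript_id\tcanonical\nENST0001\ttrue\nENST0002\t0\n")
    print(read_canonical_table(example))
```

The same idea would carry over to the R script as an explicit check right after reading the custom file, with a log message before and after it.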

dlaehnemann commented 9 months ago

Maybe we should also refactor the write_results() function quite a bit. It is a big monolith that could probably do with some cleanup. But I'll first add some proper QuantSeq testing data to the fix-canonical... branch to get the test suite passing again.