umccr / RNAsum

Pipeline for generating RNAseq-based cancer patient reports
https://umccr.github.io/RNAsum/

Keep reference data in separate R data package #108

Closed pdiakumis closed 1 year ago

pdiakumis commented 1 year ago

Since the reference data is ~500 MB and changes infrequently, we could keep it in a separate R package, e.g. {RNAsum_data}. We would then access it as usual via system.file(), e.g.:

system.file("rawdata/test_data/dragen/arriba/fusions.tsv", package = "RNAsum_data")

This would also ease the CI/CD process, since each RNAsum code update would no longer need to bundle the entire 500 MB dataset into the package tarball.
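
As a rough sketch of how the lookup could be wrapped, the helper below resolves a file from a separately installed data package and fails with a clear message if the package or the file is missing. The function name `ref_data_path()` is hypothetical, and the default package name is just the placeholder from the proposal above (R package names cannot actually contain underscores, so a real package would need a different name).

```r
# Hypothetical helper (not part of RNAsum): resolve a file shipped in a
# separate data package, with explicit errors instead of an empty string.
# The default `pkg` is the placeholder name from the proposal; a real
# package name would have to avoid the underscore.
ref_data_path <- function(relpath, pkg = "RNAsum_data") {
  # requireNamespace() returns FALSE (without erroring) if pkg is unavailable
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Data package '", pkg, "' is not installed.", call. = FALSE)
  }
  # system.file() returns "" when the file does not exist in the package
  path <- system.file(relpath, package = pkg)
  if (!nzchar(path)) {
    stop("File '", relpath, "' not found in package '", pkg, "'.", call. = FALSE)
  }
  path
}
```

With the data package installed, a call such as `ref_data_path("rawdata/test_data/dragen/arriba/fusions.tsv")` would return the same path as the `system.file()` call above, while the code package itself stays small.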