Mayrlab / scUTRquant

Bioinformatics pipeline for single-cell 3' UTR isoform quantification
https://Mayrlab.github.io/scUTRquant
GNU General Public License v3.0

Add HDF5 support #56

Closed by mfansler 2 years ago

mfansler commented 2 years ago

Large Datasets

We add a new boolean option, use_hdf5, that outputs an HDF5-backed SingleCellExperiment object. This mode is disabled by default and can be enabled by adding use_hdf5: True to the configuration YAML file.

In this mode, the pipeline emits both an .rds and a .h5 file to the data/sce/{target}/ folder. Both files must be kept in the same directory for the object to load. The SingleCellExperiment object must be loaded with the HDF5Array::loadHDF5SummarizedExperiment function. For example, given a {dataset_name} and {target}, one would load the txs output with:

library(HDF5Array)
library(SingleCellExperiment)

# dir is the folder holding both files; prefix is their shared file-name prefix
sce_txs <- loadHDF5SummarizedExperiment(dir = "data/sce/{target}",
                                        prefix = "{dataset_name}.txs.")

Please note the trailing . in the prefix argument.
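
To illustrate why the trailing . matters, here is a minimal base-R sketch. It assumes that HDF5Array prepends the prefix directly to the file names se.rds and assays.h5 inside the directory. The helpers expected_hdf5_files and check_hdf5_files, and the names my_target and my_sample, are hypothetical and not part of scUTRquant or HDF5Array.

```r
# Assumption: the two files are named "<prefix>se.rds" and "<prefix>assays.h5".
# Both helpers are hypothetical, written only for this illustration.
expected_hdf5_files <- function(dir, prefix) {
  # The prefix is pasted on as-is, with no separator added
  file.path(dir, paste0(prefix, c("se.rds", "assays.h5")))
}

check_hdf5_files <- function(dir, prefix) {
  files <- expected_hdf5_files(dir, prefix)
  absent <- files[!file.exists(files)]
  if (length(absent) > 0) {
    stop("Missing file(s): ", paste(absent, collapse = ", "))
  }
  invisible(files)
}

# With the trailing dot (placeholder names):
expected_hdf5_files("data/sce/my_target", "my_sample.txs.")
# "data/sce/my_target/my_sample.txs.se.rds"
# "data/sce/my_target/my_sample.txs.assays.h5"

# Without it, the expected names no longer match the emitted files:
expected_hdf5_files("data/sce/my_target", "my_sample.txs")
# "data/sce/my_target/my_sample.txsse.rds"
# "data/sce/my_target/my_sample.txsassays.h5"
```

Calling check_hdf5_files before loadHDF5SummarizedExperiment may give a clearer error when one of the two files has been moved or the prefix is mistyped.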

closes #55

Optional Reporting

We add a new boolean option, include_reports, which is enabled by default. Add include_reports: False to the configuration YAML file to disable R Markdown reporting.

closes #14

Upgrading to Bioconductor 3.14

The YAML environment definitions have been updated to use R 4.1 and Bioconductor 3.14 packages.

closes #31