Rmd refactor (part 10)

Okay, a few more changes below. I'm gonna merge this one to get the docker/conda deployment working, but please let me know if these changes don't make sense @skanwal!

Devtools

Multiple small changes to satisfy devtools crancheck (e.g. .data$ and var quoting)
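
For anyone unfamiliar with this kind of change, here is a minimal sketch of the pattern. The helper and its column names are hypothetical, not taken from RNAsum; the point is only that `.data$` and quoted column names stop the check from flagging bare column names as undefined global variables.

```r
library(dplyr)

# Hypothetical helper: the function and column names are invented for illustration.
filter_expressed <- function(d, min_tpm = 1) {
  # .data$TPM marks TPM as a column of d rather than an undefined global
  d <- dplyr::filter(d, .data$TPM >= min_tpm)
  # quoted names in select() avoid the same check note
  dplyr::select(d, "gene", "TPM")
}

toy <- data.frame(gene = c("A", "B"), TPM = c(0.5, 2), other = 1:2)
expressed <- filter_expressed(toy)
```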

Fusions

Sort table by *dna_support, reported_fusion, fusion_caller

CLI

I've gone ahead and changed the following options to be flags that you need to specify in order to enable:
- --batch_rm
- --dataset_name_incl
- --drugs
- --filter
- --immunogram
- --log
- --pcgr_splice_vars
- --save_tables

This means you need to be explicit for those! If we want to have some of those enabled by default, we should instead create negating options, e.g. --nofilter, --nolog etc.

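
To make the two styles concrete, here is a rough sketch using the optparse package. Using optparse here is an assumption on my part, and the help strings and the `--nolog` option are illustrative only, not the actual RNAsum CLI definition. It shows one opt-in flag and one default-on behaviour that is switched off by a negating flag.

```r
library(optparse)

# Sketch only: option layout and help text are assumptions, not RNAsum's CLI.
option_list <- list(
  # Opt-in flag: stays FALSE unless --filter is given
  make_option("--filter", action = "store_true", default = FALSE,
              help = "Enable filtering"),
  # Default-on behaviour: 'log' stays TRUE unless --nolog is given
  make_option("--nolog", action = "store_false", dest = "log", default = TRUE,
              help = "Disable logging")
)
parser <- OptionParser(option_list = option_list)

# No flags on the command line
opts_default <- parse_args(parser, args = character(0))
# Both flags on the command line
opts_explicit <- parse_args(parser, args = c("--filter", "--nolog"))
```

The negating style keeps the default behaviour on while still giving users a way to opt out, at the cost of one extra option name per switch.
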
Directory inputs: handle Dragen WTS and Arriba directories. I've tested this locally, but it needs a bit more testing:
- dragen_wts_dir: if this is specified and any of salmon, fusions, or mapmetrics are not specified, run list.files on that directory to match the specific file patterns. If there are no matches, those params become NULL (e.g. dragen_fusions becomes NULL). Note that for the salmon counts we're looking for the quant.genes.sf file, not the quant.sf one.
- arriba_dir: same idea. If you specify arriba_dir and don't specify arriba_tsv or arriba_pdf, it constructs the paths to those as file.path(arriba_dir, "fusions.[tsv/pdf]")
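
The lookup logic described above can be sketched in base R roughly as follows. The helper names, the fusion filename pattern, and the choice to take the first match when several files match are all assumptions for illustration; only the quant.genes.sf pattern and the fusions.tsv/fusions.pdf names come from the description.

```r
# Hypothetical helpers; not the actual RNAsum implementation.
resolve_from_dir <- function(explicit, dir, pattern) {
  # An explicitly supplied path always wins
  if (!is.null(explicit)) return(explicit)
  if (is.null(dir)) return(NULL)
  hits <- list.files(dir, pattern = pattern, full.names = TRUE)
  # No match: the parameter stays NULL
  if (length(hits) == 0) return(NULL)
  hits[1]  # assumption: take the first match if there are several
}

resolve_arriba <- function(arriba_dir, arriba_tsv = NULL, arriba_pdf = NULL) {
  if (!is.null(arriba_dir)) {
    if (is.null(arriba_tsv)) arriba_tsv <- file.path(arriba_dir, "fusions.tsv")
    if (is.null(arriba_pdf)) arriba_pdf <- file.path(arriba_dir, "fusions.pdf")
  }
  list(arriba_tsv = arriba_tsv, arriba_pdf = arriba_pdf)
}

# Toy directory containing both salmon outputs
wts_dir <- tempfile("dragen_wts_")
dir.create(wts_dir)
invisible(file.create(file.path(wts_dir, c("sample.quant.genes.sf", "sample.quant.sf"))))

# The anchored pattern picks quant.genes.sf and never quant.sf
salmon <- resolve_from_dir(NULL, wts_dir, "quant\\.genes\\.sf$")
# The fusion pattern is an assumed example; nothing matches here
dragen_fusions <- resolve_from_dir(NULL, wts_dir, "fusion_candidates\\.final$")
arriba <- resolve_arriba("arriba_out")
```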

Structural variants

The sv_prioritize function in prod has a bug in how it reads the SR column. SR holds the split-read support for ref and alt separated by a comma, but it gets read in as a numeric column with the comma ignored, so an SR of 20,15 is read in as 2015. This has downstream ramifications when the column is split, so SR alt becomes NA for all variants. The same thing happens for PR alt. This happens because no col_types are specified when using readr::read_tsv, so readr does its best to infer what on earth the columns are, and here its guess was wrong.
But, in a fortunate turn of events, the only impact this has on the SV table is that the SR/PR columns are blank, and the BND/SR filter applied at https://github.com/umccr/RNAsum/blob/master/rmd_files/RNAseq_report.Rmd#L1638 does not filter out anything, since SR is NA for all rows there. And I'm actually okay with that, since that filter has been changed in umccrise itself. So all good!
I've gone ahead and modified that function in dev to read in columns with explicit classes, and I've completely removed that filter. So numbers stack up now. I've also handled the tidyr::unnest(annotation) warning we've been getting since forever.
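
As a rough illustration of the explicit-classes approach (not the actual sv_prioritize code), the sketch below reads a toy table with SR and PR forced to character and then splits them into ref/alt columns. The toy data, the chrom and start columns, and the new column names are invented; it also assumes readr 2.0 or later so that literal data can be passed with I().

```r
library(readr)
library(tidyr)

# Toy input: everything except the SR/PR "ref,alt" layout is invented.
tsv <- "chrom\tstart\tSR\tPR\nchr1\t100\t20,15\t30,12\nchr2\t200\t8,0\t11,3\n"

sv <- read_tsv(
  I(tsv),
  col_types = cols(
    chrom = col_character(),
    start = col_integer(),
    SR = col_character(),  # keep "20,15" as text so the comma survives
    PR = col_character()
  )
)

# Split ref and alt support; convert = TRUE turns the pieces into integers
sv <- separate(sv, "SR", into = c("SR_ref", "SR_alt"), sep = ",", convert = TRUE)
sv <- separate(sv, "PR", into = c("PR_ref", "PR_alt"), sep = ",", convert = TRUE)
```

Declaring the types up front means a change in the input can't silently flip a column's type the way guessing can.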

Conda

I've pinned dependencies to match the current ones in prod, main ones are:
- base: 4.1.3
- edgeR: 3.36.0
- limma: 3.50.1
manhattanly didn't have a conda pkg for R v4.1 so I created one at https://anaconda.org/umccr/r-manhattanly using the recipe at https://github.com/umccr/conda_recipes/tree/main/r-manhattanly
I also released an RNAsum.data v0.0.3 for R v4.1 (same as v0.0.2, but that was built for R v4.2). A todo would be to have that built for R v4.1/4.2/4.3 in parallel. It's okay for now.

Rmd other

Added arriba_dir and dragen_wts_dir params.
The scaling param needed to be evaluated earlier.
The hline in the violin plot was made more transparent, and its deprecated size argument was changed to linewidth. I also increased its height and decreased its width. Looks less wonky now.
The CN genomic view plot now shows cancer genes as red points with 0.5 opacity (so that you can see through them).

inst/scripts/icav1_download_and_run.R

I use this to automatically download results from GDS and run the report locally, and it works well (when there are no rounding bugs locally!)

umccr / RNAsum

Rmd refactor (part 10) #124