Extend the vignettes, e.g. by including multi-sample fitting, explaining the required dimensions and contents of the input matrices (currently quite ambiguous), and covering all the post-sampling functions (below). Mention the browseVignettes() function in the GitHub README.
TODO: Improve documentation of all functions, including "Details" and "Value" fields.
--
Outline of the vignettes:
Example 1: fitting to a simulated sample:
TODO: Plot the signatures to use for the simulation, using plot_spectrum() and par(mfrow=...), before fitting.
Add a comment saying that the "emu" method will be shown in the next example (without running it).
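Sketch for the plotting step of Example 1, using only base R: a few spectra are drawn in a grid with par(mfrow=...). The matrix `sigs`, its 4 x 96 shape (96 trinucleotide mutation categories is an assumption) and the barplot() calls are illustrative stand-ins; in the vignette, barplot() would be replaced by plot_spectrum(), whose arguments are not shown here.

```r
# Illustrative stand-in: 4 hypothetical signatures over 96 mutation categories.
set.seed(1)
n_sigs <- 4
n_cat <- 96
sigs <- matrix(rgamma(n_sigs * n_cat, shape = 0.5), nrow = n_sigs)
sigs <- sigs / rowSums(sigs)   # normalise so that each signature sums to 1
rownames(sigs) <- paste("Signature", seq_len(n_sigs))

# Arrange the plots in a 2 x 2 grid, restoring the previous settings afterwards.
old_par <- par(mfrow = c(2, 2), mar = c(2, 4, 2, 1))
for (i in seq_len(n_sigs)) {
  barplot(sigs[i, ], main = rownames(sigs)[i], ylab = "Probability", border = NA)
}
par(old_par)
```

Saving and restoring the par() settings keeps the grid layout from leaking into later plots in the same vignette.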
Example 2:
Do a proof of concept by fitting COSMIC signatures to a simulated sample. Show exposures and reconstructed spectrum. Done.
Load the 21 breast cancers data (mutations_21bc). Use the build_catalogues() function to build the 21 breast cancer catalogues (and suggest ways of obtaining the trinucleotide contexts, e.g. BSgenome or the VCF INFO field). Done.
Plot the 21 genomes in a multiplot grid. Done.
Fit signatures to data using "emu" instead of "nmf".
Plot extracted exposures.
Plot reconstructions (add comment about how to output to PDF).
Extract a range of signatures (2:10) using "emu" with a small number of samples (include a comment showing "nmf"). Show the goodness of fit (GOF).
Extract again for the estimated best number of signatures, with more samples.
Plot extracted signatures and exposures.
Plot reconstructions (add comment about how to output to PDF).
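Sketch for the "how to output to PDF" comments, using only base R graphics devices. The file name and the placeholder plot() call are hypothetical; in the vignette the placeholder would be replaced by the actual reconstruction plotting call.

```r
# Hypothetical output file; any writable path works.
out_file <- file.path(tempdir(), "reconstructions.pdf")

pdf(out_file, width = 12, height = 6)   # open a PDF graphics device
# Stand-in for the real plotting call; every plot drawn before dev.off()
# becomes a page in the PDF.
plot(1:10, main = "Placeholder plot")
dev.off()                               # close the device and write the file
```

This route works for any plotting function that draws on the current device, so it does not depend on the plotting functions offering their own file-output argument.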