GenomicsAotearoa / metagenomics_summer_school

Course materials for the Genomics Aotearoa Metagenomics Summer School, to be hosted at the University of Auckland in September
https://genomicsaotearoa.github.io/metagenomics_summer_school/
GNU General Public License v3.0
51 stars, 30 forks

Update coverage normalisation script(s) #32

Closed: mlhoggard closed this issue 2 weeks ago

mlhoggard commented 1 year ago

MH to replace coverage normalisation steps with updated script(s) (summarise_counts.py; summarise_counts.R)

mlhoggard commented 1 year ago

Note: also check the R dependencies for this. dynamic_require() doesn't seem to be installing some libraries correctly. This might be because the CRAN location (mirror) needs to be selected manually.

DininduSenanayake commented 1 year ago

@mlhoggard What are the R packages we need for this?

mlhoggard commented 1 year ago

@DininduSenanayake Ah, good point.

"dplyr", "tibble", "readr", "tidyr", "fuzzyjoin", "stringr", "matrixStats", "edgeR", "EDAseq"

edgeR and EDAseq are installed via BiocManager, so BiocManager is also required for that step. The script is supposed to install all of these packages if they're not already available, but I think it's buggy for those two.

Worst case, we could include a step in the docs that installs the dependencies. But I'll also do a test run with the workshop data since, from memory, depending on how it's run, this particular process won't require the R part of the script anyway.
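As a rough illustration of what such a docs step could look like, here is a minimal Python sketch that works out which of the packages listed above are missing and prints the R commands to install them. It is not taken from summarise_counts.py or summarise_counts.R: the function names (`r_vector`, `install_commands`) are hypothetical, and the CRAN mirror URL is only an assumed example of setting the repository explicitly rather than relying on an interactive prompt.

```python
"""Sketch: list missing R packages and build the R commands to install them."""

# Package names as listed in this thread.
REQUIRED = ["dplyr", "tibble", "readr", "tidyr", "fuzzyjoin",
            "stringr", "matrixStats", "edgeR", "EDAseq"]

# Per the thread, these two are installed via BiocManager rather than CRAN.
BIOCONDUCTOR = {"edgeR", "EDAseq"}

# Assumption: any CRAN mirror would do; this URL is only an example.
CRAN_MIRROR = "https://cloud.r-project.org"


def r_vector(names):
    """Format a list of names as an R character vector, e.g. c("a", "b")."""
    return "c(" + ", ".join(f'"{n}"' for n in names) + ")"


def install_commands(installed, required=REQUIRED, mirror=CRAN_MIRROR):
    """Return R commands (as strings) that install whatever is missing."""
    missing = [p for p in required if p not in installed]
    cran = [p for p in missing if p not in BIOCONDUCTOR]
    bioc = [p for p in missing if p in BIOCONDUCTOR]

    commands = []
    if cran:
        # Passing repos explicitly avoids the interactive mirror prompt.
        commands.append(
            f'install.packages({r_vector(cran)}, repos = "{mirror}")')
    if bioc:
        # BiocManager is itself a CRAN package and must be present first.
        commands.append(
            'if (!requireNamespace("BiocManager", quietly = TRUE)) '
            f'install.packages("BiocManager", repos = "{mirror}")')
        commands.append(f"BiocManager::install({r_vector(bioc)})")
    return commands


if __name__ == "__main__":
    # Hypothetical case: nothing is installed yet.
    for line in install_commands(installed=set()):
        print(line)
```

The printed lines could be pasted into an R session or a setup step in the docs. Keeping the CRAN and Bioconductor packages separate mirrors the split described above, since that is where the automatic install seemed to go wrong.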

DininduSenanayake commented 1 year ago

@mlhoggard I think you were using the normal R module, correct? If so, use R-bundle-Bioconductor/3.13-gimkl-2020a-R-4.1.0 or R-bundle-Bioconductor/3.15-gimkl-2022a-R-4.2.1. It has all but fuzzyjoin installed. I can add fuzzyjoin to that module later.

mlhoggard commented 1 year ago

@DininduSenanayake Ah, brilliant! That's great to know, thanks.

DininduSenanayake commented 2 weeks ago

@mlhoggard I suppose we can close this issue, as we didn't have any problems during the last two rounds?