Open nathanhaigh opened 8 years ago
@nathanhaigh: we typically bump the versions of all extensions that were listed in the most recent previous version
The list of extensions typically grows as the need arises (since reinstalling only missing extensions is easy, cfr. http://easybuild.readthedocs.org/en/latest/Partial_installations.html#partial-installation-skip).
How large is the list of all Bioconductor extensions? If it's doable, we can include all.
Also, take a look at #1962, @verdurin has already done some work on Bioconductor 3.2, but it never got merged (and the PR is kind of broken now, it seems).
Could be over 1000 in its entirety (https://www.bioconductor.org/packages/3.2/BiocViews.html#___Software). The core packages would be significantly fewer, but I'm not sure how many at this stage.
The following should return all the package names for Bioconductor:
source("https://bioconductor.org/biocLite.R")
all_group()
Well, are you up for maintaining an easyconfig file that lists ~1000 extensions? ;)
OK, so BioC have a file listing all their software packages in Debian Control File (DCF) format.
I notice in the R-bundle-Bioconductor easyconfig files that it states the order of packages is important. Is this because of package dependencies?
I really don't want to have to manually maintain the order of a list containing 1104 software packages! So, I'm wondering how we could use the dependencies specified in this file for automatically choosing the installation order but also identify CRAN dependencies.
As an aside, this DCF doesn't contain the 895 BioC AnnotationData packages, including things like GO.db and KEGG.db, which are currently in the R-bundle-Bioconductor easyconfig file.
DCFs for the AnnotationData and ExperimentData packages are available here:
https://bioconductor.org/packages/3.2/data/annotation/src/contrib/PACKAGES
https://bioconductor.org/packages/3.2/data/experiment/src/contrib/PACKAGES
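A minimal sketch of the idea raised above: read the dependency fields from a DCF index and derive an installation order from them, while collecting the dependencies that are not in the index (presumably CRAN or data packages). The sample records, package names and the BASE_PACKAGES set are made up for illustration, this is not how EasyBuild itself handles extensions, and `graphlib` needs Python 3.9 or later.

```python
import re
from graphlib import TopologicalSorter

# Packages bundled with R itself (illustrative, not necessarily complete).
BASE_PACKAGES = {"R", "base", "methods", "stats", "utils", "graphics",
                 "grDevices", "tools", "parallel", "grid", "stats4",
                 "splines", "datasets", "compiler", "tcltk"}

# Made-up sample in DCF (PACKAGES) format; real indexes have more fields.
SAMPLE_DCF = """\
Package: pkgC
Version: 1.2.0
Depends: R (>= 3.2.0), pkgA
Imports: pkgB (>= 0.5),
        cranX

Package: pkgA
Version: 2.0.1

Package: pkgB
Version: 0.9.3
Imports: pkgA, methods
LinkingTo: cranY
"""


def parse_dcf(text):
    """Parse DCF text into {package name: {field: value}}."""
    records = {}
    for stanza in re.split(r"\n\s*\n", text.strip()):
        fields, key = {}, None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and key:
                # Indented line: continuation of the previous field.
                fields[key] += " " + line.strip()
            elif ":" in line:
                key, value = line.split(":", 1)
                fields[key] = value.strip()
        if "Package" in fields:
            records[fields["Package"]] = fields
    return records


def dependencies(fields):
    """Return hard dependencies, without version constraints or base packages."""
    deps = set()
    for key in ("Depends", "Imports", "LinkingTo"):
        for item in fields.get(key, "").split(","):
            name = re.sub(r"\(.*?\)", "", item).strip()
            if name and name not in BASE_PACKAGES:
                deps.add(name)
    return deps


def install_order(records):
    """Return (packages in dependency order, dependencies missing from the index)."""
    graph, external = {}, set()
    for name, fields in records.items():
        deps = dependencies(fields)
        graph[name] = {d for d in deps if d in records}
        external |= {d for d in deps if d not in records}
    # static_order() yields every package after all of its dependencies.
    return list(TopologicalSorter(graph).static_order()), sorted(external)


records = parse_dcf(SAMPLE_DCF)
order, external = install_order(records)
print(order)     # dependencies first
print(external)  # candidates to look up on CRAN
```

The names reported as external would still need their own entries earlier in the extension list, and a circular dependency makes `static_order()` raise `graphlib.CycleError`.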
I think installing all Bioconductor pkgs is... nuts. :)
I had a need for updating to 3.2, so I did a quick-and-dirty bump of all packages that are included already with:
for name in `grep -i 'bioconductor_options),' WIP/R-bundle-Bioconductor-3.2-foss-2016a-R-3.2.3.eb | sed "s/^[^']*'//g" | sed "s/'.*//g"`;
do
    version=`curl https://bioconductor.org/packages/3.2/bioc/html/${name}.html 2>/dev/null| grep -A1 Version | tail -1 | sed 's/.*<td>//g' | sed 's/<\/td>.*//g'`;
    echo " ('$name', '$version', bioconductor_options),";
done
I'm likely going to have to deal with missing dependencies that were added in the updated versions, but at least this is a good start.
cfr. https://github.com/hpcugent/easybuild-easyconfigs/pull/2697
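As a possible alternative to scraping one HTML page per package, the versions could be read from the PACKAGES index in a single request. The sketch below assumes the software index lives at the URL shown (by analogy with the annotation and experiment URLs quoted earlier in this thread); the helper names and the sample data are hypothetical, and the download function is defined but not called here.

```python
import urllib.request

# Assumed location of the software package index for Bioconductor 3.2.
PACKAGES_URL = "https://bioconductor.org/packages/3.2/bioc/src/contrib/PACKAGES"


def fetch_index(url=PACKAGES_URL):
    """Download the PACKAGES index as text (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def versions_from_index(text):
    """Map each package name to the Version field that follows it."""
    versions, name = {}, None
    for line in text.splitlines():
        if line.startswith("Package:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("Version:") and name:
            versions[name] = line.split(":", 1)[1].strip()
    return versions


def exts_entries(names, versions):
    """Format entries in the given order; report names absent from the index."""
    lines, missing = [], []
    for name in names:
        if name in versions:
            lines.append("    ('%s', '%s', bioconductor_options)," % (name, versions[name]))
        else:
            missing.append(name)
    return lines, missing


# Made-up index excerpt standing in for fetch_index().
sample_index = "Package: pkgA\nVersion: 1.2.0\n\nPackage: pkgB\nVersion: 0.5.0\n"
lines, missing = exts_entries(["pkgB", "pkgA", "cranX"], versions_from_index(sample_index))
print("\n".join(lines))
print("not in index:", missing)
```

Names reported as missing are likely CRAN or data packages, which would need a lookup against a different index.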
Can I ask a simple (stupid?) question: why can't biocLite() be used within R to do the install of BioC packages?
@nathanhaigh As far as I know, biocLite doesn't allow you to control versions (of either the packages themselves, or its dependencies), which goes against the spirit of EB where we (try to) version-fix everything (at least today).
Just a note that one of our groups insists that they need all the BioC packages, primarily because they occasionally receive requests for some of the obscure ones and they'd prefer not to have to install those on the fly.
That group should take ownership of their problem, to appreciate what they are asking for ;)
On Sunday, 17 April 2016, Adam Huffman notifications@github.com wrote:
Just a note that one of our groups insists that they need all the BioC packages, primarily because they occasionally receive requests for some of the obscure ones and they'd prefer not to have to install those on the fly.
I would agree, but in fact they maintain it themselves now in their own private area, and we're trying to move people towards central infrastructure (three institutes are merging into one).
I know this thread is 8 years old, but in case anyone else needs to build a new Bioconductor version with updated package versions, as I did:
# Step 1: Install and load the BiocManager package if not already installed
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
library(BiocManager)
# Step 2: Set the Bioconductor version to 3.19
BiocManager::install(version = "3.19")
# Step 3: Retrieve the package information for Bioconductor 3.19
bioc_version <- "3.19"
bioc_repo <- BiocManager::repositories(version = bioc_version)
available_packages <- available.packages(repos = bioc_repo)
# Step 4: Extract package names and versions
package_versions <- data.frame(
  Package = rownames(available_packages),
  Version = available_packages[, "Version"]
)
# Step 5: Define a function to format each package entry
format_package_entry <- function(package, version) {
  sprintf("('%s', '%s', {\n 'checksums': [''],\n }),", package, version)
}
# Step 6: Apply the formatting function to each package entry
formatted_entries <- apply(package_versions, 1, function(row) {
  format_package_entry(row["Package"], row["Version"])
})
# Step 7: Write the formatted entries to a file
output_file <- "bioconductor_3_19_package_versions.txt"
writeLines(formatted_entries, con = output_file)
# Print the file path for confirmation
cat("Package versions have been saved to:", output_file, "\n")
This thread is still one of the top results, so if there is a more EB way to do this (maybe preserving the order of exts) let me know.
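On preserving the order of extensions: one option, sketched here under assumptions, is to rewrite the versions inside an existing easyconfig in place instead of generating a fresh list. The easyconfig excerpt, package names and versions below are made up, the regular expression only handles the simple `('name', 'version', ...)` entry form, and this is not an official EasyBuild tool.

```python
import re

# Hypothetical excerpt of an existing easyconfig.
OLD_EASYCONFIG = """\
exts_list = [
    ('pkgA', '1.0.0', bioconductor_options),
    ('pkgB', '0.4.2', bioconductor_options),
    ('cranX', '2.1', ext_options),
]
"""

# New versions, e.g. collected from a package index (made-up values).
NEW_VERSIONS = {"pkgA": "1.2.0", "pkgB": "0.5.0", "pkgNew": "3.0.0"}

# Matches the start of an entry: ('name', 'version'
ENTRY = re.compile(r"^(\s*\(')([^']+)(',\s*')([^']+)(')", re.MULTILINE)


def bump_versions(easyconfig_text, new_versions):
    """Swap in new versions without reordering; return (text, unlisted names)."""
    seen = set()

    def replace(match):
        name = match.group(2)
        seen.add(name)
        # Keep the old version when no newer one is known.
        version = new_versions.get(name, match.group(4))
        return match.group(1) + name + match.group(3) + version + match.group(5)

    updated = ENTRY.sub(replace, easyconfig_text)
    unlisted = sorted(set(new_versions) - seen)
    return updated, unlisted


updated, unlisted = bump_versions(OLD_EASYCONFIG, NEW_VERSIONS)
print(updated)
print("not yet in exts_list:", unlisted)
```

Packages reported as not yet listed still have to be placed by hand after their dependencies, and any checksums stored in the entries need refreshing afterwards.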
I'm looking at creating a new easyconfig for R-bundle-Bioconductor-3.2 and was wondering how best to populate/update the packages and versions specified in exts_list. Should this be installing all packages for bioconductor?