nf-core / tools

Python package with helper tools for the nf-core community.
https://nf-co.re
MIT License

Using CVMFS for storing reference genome data #2058

Open amizeranschi opened 1 year ago

amizeranschi commented 1 year ago

Description of feature

A nice alternative to iGenomes and RefGenie could be CVMFS, developed by CERN. Here's a description from Galaxy Project's relevant documentation page:

The CernVM File System provides a scalable, reliable and low-maintenance software distribution service. It was developed to assist High Energy Physics (HEP) collaborations to deploy software on the worldwide-distributed computing infrastructure used to run data processing applications. CernVM-FS is implemented as a POSIX read-only file system in user space (a FUSE module). Files and directories are hosted on standard web servers and mounted in the universal namespace /cvmfs.

The Galaxy Project hosts its own CVMFS instance, which already provides plenty of reference genome data. For more details about how the data is structured there, see this page.
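To illustrate the idea, here is a minimal Python sketch of how a tool could look up a reference file under the `/cvmfs` namespace and fall back to another source when the file is not available. It assumes that the Galaxy reference data repository is mounted as `/cvmfs/data.galaxyproject.org`; the function names, the relative path and the fallback URI are hypothetical and only serve the example. This is not an existing nf-core or Nextflow feature.

```python
from pathlib import Path
from typing import Optional

# Universal CVMFS namespace, as described in the quoted documentation.
CVMFS_ROOT = Path("/cvmfs")
# Assumption: repository name under which the Galaxy reference data is mounted.
GALAXY_DATA_REPO = "data.galaxyproject.org"


def find_on_cvmfs(
    relative_path: str,
    root: Path = CVMFS_ROOT,
    repo: str = GALAXY_DATA_REPO,
) -> Optional[Path]:
    """Return the path of a file inside the mounted repository, or None if absent."""
    candidate = root / repo / relative_path
    return candidate if candidate.exists() else None


def resolve_reference(
    relative_path: str,
    fallback: str,
    root: Path = CVMFS_ROOT,
) -> str:
    """Prefer the CVMFS copy of a reference file; otherwise return the fallback location."""
    found = find_on_cvmfs(relative_path, root=root)
    return str(found) if found is not None else fallback


if __name__ == "__main__":
    # Hypothetical relative path and fallback URI, for illustration only.
    print(resolve_reference("hypothetical/genome.fa", "s3://example-bucket/genome.fa"))
```

Since CVMFS is a read-only POSIX file system, plain path checks like these are enough; no dedicated client library is needed on the pipeline side.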

FriederikeHanssen commented 1 year ago

Hi! Cool idea. I think this concerns all of nf-core. @maxulysse what would be a good repo for it? tools?

pditommaso commented 1 year ago

Yeah, it's a good idea, but it has nothing to do with pipeline implementation either 😄. However, we are considering adding this as a feature to https://tower.nf