Closed ahstram closed 1 year ago
Using a depth of 1 as the default creates a conflict when pulling a specific revision. I've modified this behaviour by adding a `-deep <n>` command line option to specify the depth of the clone explicitly. See https://github.com/nextflow-io/nextflow/commit/b44b64533dcbd5a65a9561f176e2e4ce56a2ed8a.
Hi there,
One Nextflow feature I really appreciate is the ability to run a pipeline from our local Personalis Bitbucket server, using a command such as:
nextflow run nxf/hades -hub PSNL -v 1.3.7 ...
This is much more user-friendly, and it also removes the need to deploy every possible version of our pipeline to a filesystem.
That being said, we generally do set the `NXF_ASSETS` variable on a per-run basis (i.e., each sample we run gets its own clone of the `nxf/hades` repo). This results in thousands of copies of our pipelines being made every week, many of them clones of the same version. Having a run-specific clone does make it easier to have multiple pipeline versions running at once, and also makes for easier "manual hotfixing" if necessary (i.e., someone goes in and manually edits the clone to allow a failed job to succeed, etc.).
Overall, the best compromise we've found is to use JGit's `setDepth()` functionality, which has been available since JGit 6.3: https://javadoc.io/doc/org.eclipse.jgit/org.eclipse.jgit/latest/org.eclipse.jgit/org/eclipse/jgit/api/CloneCommand.html
We use `setDepth(1)`, which means each of our clones has a complete checkout of the appropriate version of our pipeline but no git history, which greatly reduces disk usage and load on our Bitbucket server.
I think it would be great to update Nextflow to JGit 6.3+ and use `setDepth(1)` by default, as I wouldn't expect most users of `nextflow run` or `nextflow pull` to need their full git history.
I will submit a PR for your consideration.
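To illustrate the effect of a depth-1 clone, here is a minimal sketch that uses the plain git CLI rather than JGit, on the assumption that `git clone --depth 1` is a fair stand-in for `setDepth(1)`. The scratch repository, the `main.nf` file contents, and the `demo` identity are all hypothetical and exist only for this demonstration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area; nothing here touches a real pipeline repo.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small source repository with three commits.
git init -q "$work/origin"
for i in 1 2 3; do
    echo "revision $i" > "$work/origin/main.nf"
    git -C "$work/origin" add main.nf
    git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "commit $i"
done

# A file:// URL goes through the regular transport, so --depth is honoured
# (git ignores --depth when cloning from a plain local path).
git clone -q "file://$work/origin" "$work/full"
git clone -q --depth 1 "file://$work/origin" "$work/shallow"

# Count the commits reachable from HEAD in each clone.
full_commits=$(git -C "$work/full" rev-list --count HEAD)
shallow_commits=$(git -C "$work/shallow" rev-list --count HEAD)

echo "full clone commits:    $full_commits"
echo "shallow clone commits: $shallow_commits"
```

The full clone should report 3 commits and the shallow clone 1, while both working trees contain the same latest `main.nf`. That is the trade-off described above: a complete checkout without the history.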
Thank you,
Alexander Stram
Associate Director, Bioinformatics Engineering
Personalis, Inc