Closed by eric-czech 4 years ago
Snakemake has worked fairly well so far; at the least, I was able to convert a couple of bgen chromosomes with it on a Kubernetes cluster. The remote file support is pretty clearly an afterthought in the design and doesn't work particularly well with GS, but it's still usable. It doesn't seem to support remote directories (snakemake#576), which is definitely annoying.
In retrospect, I wish I had started with Nextflow instead, but the current pipeline is still reasonable at this point, so I won't backtrack.
Nextflow or Snakemake seem like the obvious choices. This pipeline will be:
I have no aspirations of making this a cloud-agnostic pipeline, and I think Nextflow and Snakemake are evenly matched in their GCP support. Nextflow appears to take control of deploying individual VMs when not using a cluster, though (see https://www.nextflow.io/docs/latest/google.html#process-definition), and I don't see a similar feature in Snakemake. I'm not sure how much I want to trust that long term, since the project appears to be driven almost entirely by a single contributor.
Overall, Snakemake appears to be simpler, does not require a JVM + Groovy, and adopts a model where users are responsible for creating resources, so I'm leaning towards it at the moment.