naobservatory / mgs-workflow

Is importing data into S3 in-scope for this project? #25

Open jeffkaufman opened 2 weeks ago

jeffkaufman commented 2 weeks ago

One place where we're still using the v1 pipeline is for importing data into S3. This includes the hacky import-accession.sh as well as [mgs-restricted](https://github.com/naobservatory/mgs-restricted/)/import-*.sh for several of our partners. These are just Linux shell scripts and don't use Nextflow.

@willbradshaw are you thinking these are out of scope for this project and should stay in mgs-pipeline / mgs-restricted?

willbradshaw commented 1 week ago

I do think of this as out of scope for the core workflow, and honestly an awkward fit for a pipeline-based approach in general. I think it should be the user's responsibility to get the data into a suitable state to run the pipeline (though I don't think that should necessarily mean it has to be in our S3 in a very specific structure).

That said, I'm not opposed to including some auxiliary scripts in the repo that users could use to do this in common use cases.