harvardinformatics / snpArcher

Snakemake workflow for highly parallel variant calling designed for ease-of-use in non-model organisms.
MIT License

Rerunning a species set with updating samples #52

Closed tsackton closed 3 months ago

tsackton commented 2 years ago

By default, if a species set has finished running successfully and then you update the sample sheet to include more SRA accessions / BioSamples, snakemake will not catch this and will not do anything.

With some preliminary testing, it seems that adding `-R $(snakemake --list-input-changes)` to the command will work in most cases, as this forces snakemake to rerun any jobs with updated inputs as well as the downstream jobs that depend on them. I have some test cases running now. There is also the option of `-R $(snakemake --list-params-changes)`, which seems to report similar but not quite identical changes.
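To see where the two options differ before forcing anything, both lists can be collected and compared. Below is a minimal Python sketch of that idea, not a tested recipe. It assumes the `snakemake` executable is on the PATH, that it accepts the flags mentioned above (flag names may differ between Snakemake versions), and that the changed files are printed one per line on stdout. The helper names `list_changes`, `build_rerun_command`, and `main`, and the `cores` default, are hypothetical.

```python
import subprocess


def list_changes(flag, runner=subprocess.run):
    """Return the set of files snakemake prints for a --list-*-changes flag."""
    result = runner(["snakemake", flag], capture_output=True, text=True, check=True)
    # Assumption: one file path per line on stdout; blank lines are ignored.
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def build_rerun_command(targets, cores=1):
    """Build the snakemake call that forces the given files to be regenerated."""
    if not targets:
        return None  # nothing changed, so there is nothing to force
    return ["snakemake", "--cores", str(cores), "-R", *sorted(targets)]


def main():
    by_input = list_changes("--list-input-changes")
    by_params = list_changes("--list-params-changes")
    # Show where the two options disagree.
    print("flagged by input changes only:", sorted(by_input - by_params))
    print("flagged by params changes only:", sorted(by_params - by_input))
    command = build_rerun_command(by_input | by_params)
    if command is not None:
        subprocess.run(command, check=True)
```

Calling `main()` from the workflow directory would print the differences and then rerun the union of both lists. Passing only `by_input` to `build_rerun_command` would mirror the first option on its own.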

The real question is how we should document / approach this, and whether we should consider any code changes to facilitate this use case (e.g., automatically looking for changed input files somehow).
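One possible shape for such a check, offered only as a starting point for discussion, is to compare the current sample sheet against a copy saved after the last successful run. The Python sketch below assumes the sample sheet is a CSV with a header row and a `BioSample` column. The saved snapshot file and the function names are hypothetical and not part of the current workflow.

```python
import csv


def read_samples(path, column="BioSample"):
    """Return the set of non-empty sample IDs found in one column of a CSV sheet."""
    with open(path, newline="") as handle:
        return {row[column] for row in csv.DictReader(handle) if row.get(column)}


def new_samples(current_sheet, snapshot_sheet, column="BioSample"):
    """Return sample IDs in the current sheet that are absent from the snapshot."""
    return read_samples(current_sheet, column) - read_samples(snapshot_sheet, column)
```

A non-empty result from `new_samples` would indicate that the sheet gained samples since the snapshot was taken, which could be used to warn the user or to trigger a forced rerun.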