cerebis / meta-sweeper

Parametric sweep of simulated microbial communities and metagenomic sequencing.
GNU General Public License v3.0

Non-deterministic cache misses #34

Closed: cerebis closed this issue 8 years ago

cerebis commented 8 years ago

Using -resume, some tasks that completed successfully get repeated. Resuming a second time produces a different subset of repeated tasks.

Clearly something is wrong in task caching.

cerebis commented 8 years ago

I've pinpointed this to the merging of sweep branches in the joinChannels function. There was a significant defect in the join that caused elements to be lost. Since the order of the collections was not guaranteed between invocations, which elements were lost varied randomly from run to run.
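To illustrate the kind of failure described above, here is a minimal Python sketch. It is not the meta-sweeper code: the actual joinChannels defect is not shown in this thread, and `positional_join` and `keyed_join` are hypothetical names invented for the example. It only shows how a join that silently depends on element order can drop elements when that order changes, while a join indexed by key does not.

```python
def positional_join(left, right):
    """Hypothetical fragile join: pairs items by position and keeps a
    pair only when the keys happen to line up."""
    return [(k1, a, b) for (k1, a), (k2, b) in zip(left, right) if k1 == k2]


def keyed_join(left, right):
    """Hypothetical robust join: indexes the right side by key, so the
    arrival order of either side does not matter."""
    index = dict(right)
    return [(k, a, index[k]) for k, a in left if k in index]


# Two sweep branches carrying the same six keys (illustrative data only).
left = [(k, f"left-{k}") for k in range(6)]
right_same = [(k, f"right-{k}") for k in range(6)]
right_reordered = list(reversed(right_same))

print(len(positional_join(left, right_same)))       # 6: order happens to agree
print(len(positional_join(left, right_reordered)))  # 0: every pair is lost
print(len(keyed_join(left, right_reordered)))       # 6: order is irrelevant
```

If the order of the inputs differs between invocations, the set of elements that survives a join like `positional_join` differs too, which would be consistent with a different subset of tasks appearing on each resume.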

cerebis commented 8 years ago

This has been fixed in commit 39db05d0ec027bfc9c50e66e62d03c79e15d8227

koadman commented 8 years ago

just to confirm, this defect arises only when resuming, but not when a sweep is done in a single nextflow run?

cerebis commented 8 years ago

Currently this is a guess. I have not verified it on an actual run. Debugging the system is challenging, even when you think you've whittled it down.

With the current explanation, I expect it would also have affected single runs, in that there would have been fewer final results than intended. That should have stood out, though, and I did not notice it before finding the error.

Do you have results that you're worried about?

On Thursday, 25 August 2016, Aaron Darling notifications@github.com wrote:

just to confirm, this defect arises only when resuming, but not when a sweep is done in a single nextflow run?