pepkit / looper

A job submitter for Portable Encapsulated Projects
http://looper.databio.org
BSD 2-Clause "Simplified" License

strange flag error message #303

Closed nsheff closed 3 years ago

nsheff commented 3 years ago
looper run examples/test_config.yaml --limit 1
Looper version: 1.2.1
Command: run
## [1 of 4] sample: RRBS_human; pipeline: RRBS
> Skipping sample because no failed flag found. Flags found: ['/home/ns5bc/dnameth/results_pipeline/RRBS_human/RRBS_failed.flag'

shouldn't that say "failed flag found" ?

ccrobertson commented 3 years ago

Hi, I am noticing similar behavior to this issue.

I just updated looper to version 1.3.0, and it seems it no longer recognizes when a sample has failed. For example, I ran a bunch of jobs yesterday. Most completed, but a couple timed out, and when I try to rerun them I get the following:

[17 of 20] sample: 103; pipeline: PEPATAC

Skipping sample because no failed flag found. Flags found: ['/scratch/ccr5ju/T1DGC_DP3/processed/ATAC_UF_Run102/output/results_pipeline/103/PEPATAC_failed.flag']

nsheff commented 3 years ago

I think the error message may still be incorrect, as reported above, so those are actually failed flags that it found. Did you try using looper rerun to re-submit just the failed ones?

nsheff commented 3 years ago

the error message has been fixed on dev, but this isn't released yet.

https://github.com/pepkit/looper/blob/0b3b20310d4d65ac23872bfc0db30f08d2a890a9/looper/conductor.py#L335

nsheff commented 3 years ago

https://github.com/pepkit/looper/commit/37518ce548fe7d94f857edf58f8cc3a0888c171a

nsheff commented 3 years ago

@ccrobertson did you ever get these jobs rerun to work?

nsheff commented 3 years ago

This bug keeps biting people -- we need to release a new version of looper that fixes it asap.

nsheff commented 3 years ago

Maybe we should add a note about looper rerun

ccrobertson commented 3 years ago

> @ccrobertson did you ever get these jobs rerun to work?

@nsheff yes! thanks for checking. i used "looper rerun" and they did run. i think you're right that it's just a bug in the error message itself.
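
For anyone landing here later, here is a minimal sketch of the kind of check discussed in this thread. It is not looper's actual code (see the conductor.py link above for that). The function names `flag_status` and `skip_reason` are hypothetical, and the behavior is an assumption based on what was reported here: a plain run skips any sample that already has a status flag, while a rerun submits only samples that have a failed flag. The point is that the "no failed flag found" wording should only be used in the rerun case.

```python
import os


def flag_status(path):
    """Return the status part of a flag file name, e.g. 'failed' for 'RRBS_failed.flag'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem.rsplit("_", 1)[-1]


def skip_reason(flag_paths, rerun=False):
    """Return a message explaining why a sample is skipped, or None if it should be submitted.

    Assumed behavior (hypothetical, inferred from this thread):
    - rerun mode: submit only samples that have a failed flag.
    - run mode: skip any sample that already has a status flag.
    """
    flag_paths = list(flag_paths)
    statuses = {flag_status(p) for p in flag_paths}
    if rerun:
        if "failed" not in statuses:
            return "Skipping sample because no failed flag found. Flags found: {}".format(flag_paths)
        return None
    if flag_paths:
        # In run mode the message should not claim that a failed flag is missing.
        return "Skipping sample because status flags were found: {}".format(flag_paths)
    return None
```

With this split, a sample whose only flag is `RRBS_failed.flag` is skipped by a plain run with an accurate message, and is submitted by a rerun.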