pepkit / looper

A job submitter for Portable Encapsulated Projects
http://looper.databio.org
BSD 2-Clause "Simplified" License

Excluding samples numerically / by index #324

Closed: arpanda closed this issue 1 year ago

arpanda commented 3 years ago

Hi, initially I ran 10 samples using the project setup below.

looper:
  output_dir: /home/my_path
  cli:
    run:
      limit: 10

For the next run, how can I skip those first 10 samples? Can you please help? Thanks.

stolarczyk commented 3 years ago

I don't think there's a straightforward way to do this. You could try to use the selector or toggle features to restrict which samples should be included.

looper run -h

...

sample selection arguments:
  Specify samples to include or exclude based on sample attribute values

  -g K, --toggle-key K               Sample attribute specifying toggle. Default: toggle
  --sel-attr ATTR                    Attribute for sample exclusion OR inclusion
  --sel-excl [E ...]                 Exclude samples with these values
  --sel-incl [I ...]                 Include only samples with these values

That said, this has been proposed before and will be implemented soon: https://github.com/pepkit/looper/issues/321
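As a stopgap, the selection options in the help text above could be used to exclude the samples that have already been run, by name. The following is only a minimal sketch: it assumes the sample table is a CSV with a `sample_name` column, and the helper `build_exclude_command`, the example table, and the file name `project_config.yaml` are hypothetical. It only builds the command line; it does not call looper.

```python
import csv
import io
import shlex


def build_exclude_command(sample_table_csv, n_done, config_path):
    """Return a looper argv list that excludes the first n_done samples by name.

    Hypothetical helper, not part of looper.
    """
    reader = csv.DictReader(io.StringIO(sample_table_csv))
    names = [row["sample_name"] for row in reader]
    already_run = names[:n_done]
    base = ["looper", "run", config_path]
    if not already_run:
        # Nothing to exclude, so leave the selection options out.
        return base
    # --sel-attr and --sel-excl are the options listed in the help text above.
    return base + ["--sel-attr", "sample_name", "--sel-excl", *already_run]


# Hypothetical sample table; a real project would read the CSV from disk.
table = "sample_name,protocol\ns1,ATAC\ns2,ATAC\ns3,RNA\n"
cmd = build_exclude_command(table, 2, "project_config.yaml")
print(shlex.join(cmd))
# looper run project_config.yaml --sel-attr sample_name --sel-excl s1 s2
```

This gets unwieldy for large projects, since every excluded sample name ends up on the command line, which is part of why a numeric option is attractive.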

arpanda commented 3 years ago

Instead of a skip option, letting the limit accept a range of samples may be easier for the user, like 1-10 or 10-15.
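To make the range idea concrete, here is a minimal sketch of how a range such as 11-15 could map onto a skip count and a limit. The semantics are an assumption (1-based, inclusive on both ends), and `parse_sample_range` and `select_samples` are hypothetical names; this is not how looper implements anything.

```python
def parse_sample_range(spec):
    """Parse 'START-END' (assumed 1-based, inclusive) into a (skip, limit) pair."""
    start_text, sep, end_text = spec.partition("-")
    if not sep:
        raise ValueError(f"expected 'START-END', got {spec!r}")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {spec!r}")
    # Skip everything before START, then take END - START + 1 samples.
    return start - 1, end - start + 1


def select_samples(samples, spec):
    """Return the samples covered by the range, in their original order."""
    skip, limit = parse_sample_range(spec)
    return samples[skip:skip + limit]


samples = [f"sample{i}" for i in range(1, 21)]
print(select_samples(samples, "11-15"))
# ['sample11', 'sample12', 'sample13', 'sample14', 'sample15']
```

Under these assumptions a range is just shorthand for a skip plus a limit, so "1-10" corresponds to skipping 0 samples and limiting to 10.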

vreuter commented 1 year ago

https://github.com/pepkit/looper/issues/321#issuecomment-807286641

It's only really helpful for things that don't speak pipestat, once that's finished.

vreuter commented 1 year ago

@nsheff @donaldcampbelljr should this be a "skip-first-N" option as suggested in #321, or should it instead (or also) accept a range as suggested by @arpanda?

vreuter commented 1 year ago

There's also a question of for which subcommands this option should be available. For something like runp it clearly makes no sense, but for something like destroy, it's open for discussion / decision, I think.

nsheff commented 1 year ago

In discussion with @donaldcampbelljr we think doing a range makes sense; this could apply to both --skip and --limit.

So,

nsheff commented 1 year ago

> There's also a question of for which subcommands this option should be available. For something like runp it clearly makes no sense, but for something like destroy, it's open for discussion / decision, I think.

I think having --limit and --skip on destroy would be useful.

vreuter commented 1 year ago

Looking at the output of looper called by itself to reacquaint myself with the full collection of available subcommands, I think the following are the ones for which this idea definitely (or at least probably) makes sense. @nsheff @donaldcampbelljr please feel free to advocate for adding others or removing any of those listed here if you see it differently.