Looper process streamlining

nsheff commented 1 year ago

A collection of problems I ran into when trying to use looper for a project:

[x] #417
[x] #418
[x] #421
[x] #419
[x] #420
[x] #422
[x] there was no clear way to pull from pephub into Python (I guess pephubclient can do this? Possibly just need docs)
[x] We have --lumpn, which is the number of commands to batch into one job. Another possibility would be the inverse: I want to batch my samples into m jobs. This would be equivalent to a --lumpn using the number of samples divided by m, rounded up. Is this worth a new option? I want m jobs, I have s samples; use --lumpn ceil(s/m). 100 samples, 8 jobs, so --lumpn 13 (12 would give 9 jobs). 80 samples, 4 jobs, --lumpn 20. (https://github.com/pepkit/looper/issues/415)
[ ] I would like to update my PEP with information about a sample: https://github.com/pepkit/pipestat/issues/125
[x] #423
[x] #424
[ ] #498 { echo "str1"; echo "str2"; echo "str3"; } & -- this could become a parallel lump mode.
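
The arithmetic for turning a target job count into a lump size can be sketched as follows. This is only an illustration, not looper code: `lump_size` and `jobs_needed` are hypothetical helper names. When the division is not exact, rounding up keeps the job count at or below the target; for 100 samples and 8 jobs, a lump size of 12 would produce 9 jobs.

```python
import math


def lump_size(n_samples: int, n_jobs: int) -> int:
    """Smallest lump size that fits n_samples into at most n_jobs jobs.

    Hypothetical helper, not part of looper.
    """
    if n_samples < 1 or n_jobs < 1:
        raise ValueError("n_samples and n_jobs must be positive")
    return math.ceil(n_samples / n_jobs)


def jobs_needed(n_samples: int, lump: int) -> int:
    """Number of jobs produced when each job holds up to `lump` commands."""
    if n_samples < 1 or lump < 1:
        raise ValueError("n_samples and lump must be positive")
    return math.ceil(n_samples / lump)


for s, m in [(100, 8), (80, 4)]:
    n = lump_size(s, m)
    print(f"{s} samples, {m} jobs -> --lumpn {n} ({jobs_needed(s, n)} jobs)")
```

The result can come in under the target: 10 samples with a target of 8 jobs gives a lump size of 2, which yields 5 jobs.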

donaldcampbelljr commented 11 months ago

Looper variable namespaces need to be updated; in particular, does the looper namespace now get what's in the .looper config file? -> Q: the looper namespace was relative to the PEP; is it now relative to the looper config? I think everything in the looper config is in the looper namespace. See #423

looper.pep_config -> command_template -> this works when passing a local PEP; when using a registry path, it doesn't appear to work.

How to parallel-process files with looper locally -> originally a divvy idea. Issue: with 100 files, divvy submits to a cluster with no problem, but locally they run serially. We could run them as background processes using an ampersand (&) in the command shell script. So can we lump 100 samples into 10 background processes, with a new divvy template to accomplish that?
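
A rough sketch of what such a template might render, assuming the commands are plain shell strings: each lump becomes a `{ ...; } &` group, and a final `wait` blocks until every group finishes. `render_parallel_lumps` is a hypothetical helper for illustration; it does not reflect how divvy templates are actually populated.

```python
from typing import List


def render_parallel_lumps(commands: List[str], n_groups: int) -> str:
    """Render a bash script running `commands` in up to n_groups background groups.

    Hypothetical helper, not part of looper or divvy.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be positive")
    # Round-robin split, so group sizes differ by at most one.
    groups = [commands[i::n_groups] for i in range(n_groups)]
    lines = ["#!/bin/bash"]
    for group in groups:
        if group:  # skip empty groups when there are fewer commands than groups
            # Commands inside a group run serially; the group itself is backgrounded.
            lines.append("{ " + "; ".join(group) + "; } &")
    # Block until all background groups have exited.
    lines.append("wait")
    return "\n".join(lines) + "\n"


script = render_parallel_lumps([f'echo "str{i}"' for i in range(1, 7)], n_groups=2)
print(script)
```

With 100 sample commands and `n_groups=10`, this would give 10 background groups of 10 commands each. Exit codes of the individual groups are not collected here; a real template would likely need to handle failures.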

donaldcampbelljr commented 11 months ago

there was no clear way to pull from pephub into Python (I guess pephubclient can do this? Possibly just need docs)

For using looper with pephub, it looks like there is documentation: https://looper.databio.org/en/dev/hello-world-pephub/

However, it does appear that our pephubclient API docs could be expanded. There is currently only a light README.

donaldcampbelljr commented 3 months ago

Because we've solved the majority of the issues here and the remaining two have child issues tracking them, I will close this parent issue.

pepkit / looper

Looper process streamlining #414