Open poldrack opened 2 years ago
Ashley and I were thinking about essentially deleting workflow_dict and instead just passing the full args dict to each script, which can then take whatever items it needs. For example, args already has .preproc_settings['script'] (where script == 'fictrac_qc', for example), so that script can pull the settings it needs directly from there. workflow_dict would then become just a list of jobs to run in order via run_preprocessing_step.
We do see that for some steps 'basedir' (in the workflow_dict) is set to args.basedir, while others set it to args.dir, and we also see that 'dir' (in the workflow_dict) is sometimes set to args.dir and sometimes to args.process. This is related to issue #15, but looking forward, not having the workflow_dict should not be a problem for sorting out directories, right?
What do you think?
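A minimal sketch of the proposal above, assuming args.preproc_settings is keyed by step name. The signature given to run_preprocessing_step here, the file name fictrac_qc.py, and the example paths are hypothetical stand-ins, not the actual brainsss2 code:

```python
from types import SimpleNamespace

# Hypothetical stand-in for the parsed args. Only preproc_settings and the
# idea of per-script entries come from the discussion; values are made up.
args = SimpleNamespace(
    basedir="/data/fly_001",
    preproc_settings={
        "fictrac_qc": {"script": "fictrac_qc.py", "time_hours": 1},
        "PCA": {"script": "dimensionality_reduction.py", "time_hours": 8},
    },
)

# The workflow shrinks to an ordered list of step names.
workflow = ["fictrac_qc", "PCA"]


def run_preprocessing_step(step, args):
    """Hypothetical signature: each step looks up its own settings in args."""
    settings = args.preproc_settings[step]
    # A real implementation would launch settings["script"] here; this sketch
    # only returns what the step would have been given.
    return {"step": step, "basedir": args.basedir, **settings}


results = [run_preprocessing_step(step, args) for step in workflow]
for result in results:
    print(result["step"], result["script"], result["basedir"])
```

Because every step reads basedir from the same args object, the basedir/dir inconsistency would have only one place to be resolved.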
Interesting! Could this handle cases where the same process is run twice with different settings or inputs, à la PCA or regression? I think the dir naming issues should be handled when we change to a consistent API for all workflow elements, as we discussed today.
-- Russell A. Poldrack Albert Ray Lang Professor of Psychology Associate Director, Stanford Data Science Director, SDS Center for Open and Reproducible Science Building 420 Stanford University Stanford, CA 94305
http://www.poldracklab.org/
I think so! Using PCA as an example, the settings.json would list the PCA dict twice, but each dict would have different values to run the two different PCA styles. For example, the first dict could be "PCA": {"script": "dimensionality_reduction.py", "time_hours": 8, "label": "resid"}, and the second dict could replace the label and whatever other keys/values should differ. (These would all still exist in args.preproc_settings.)
But wouldn't that require the same key to show up multiple times in the dict (which isn't possible)?
In reply to Luke Brezovec's comment of Mon, Jun 13, 2022 at 3:47 PM (https://github.com/ClandininLab/brainsss2/issues/30#issuecomment-1154523616).
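To make the duplicate-key concern concrete: Python's json module accepts a repeated key without complaint and keeps only the last value, so a second "PCA" entry would silently replace the first. The sketch below shows that behavior, a small hook that rejects duplicates, and one possible alternative, a list of step dicts. The "name" key and the "raw" label are hypothetical, chosen only for illustration:

```python
import json

# The same key twice in one JSON object.
DUPLICATE_KEYS = """
{
  "PCA": {"script": "dimensionality_reduction.py", "time_hours": 8, "label": "resid"},
  "PCA": {"script": "dimensionality_reduction.py", "time_hours": 8, "label": "raw"}
}
"""

# json.loads keeps only the last "PCA" entry; the first is lost silently.
parsed = json.loads(DUPLICATE_KEYS)
print(len(parsed), parsed["PCA"]["label"])


def reject_duplicates(pairs):
    """object_pairs_hook that raises instead of silently dropping entries."""
    keys = [key for key, _ in pairs]
    dupes = sorted({key for key in keys if keys.count(key) > 1})
    if dupes:
        raise ValueError(f"duplicate keys in settings: {dupes}")
    return dict(pairs)


# Alternative layout: an ordered list, so the same step can appear twice.
STEP_LIST = """
[
  {"name": "PCA", "script": "dimensionality_reduction.py", "time_hours": 8, "label": "resid"},
  {"name": "PCA", "script": "dimensionality_reduction.py", "time_hours": 8, "label": "raw"}
]
"""

steps = json.loads(STEP_LIST, object_pairs_hook=reject_duplicates)
labels = [step["label"] for step in steps]
print(labels)
```

A list also preserves run order explicitly, which fits the idea of the workflow being just a list of jobs.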
Currently the workflow settings repeat the same operations many times. In addition, there are some settings, like basedir, that one might want to set in a settings file. Come up with a cleaner way to integrate stored settings with args settings.
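One possible way to integrate the two sources is a layered merge, sketched here under an assumed precedence of built-in defaults, then the settings file, then explicit command-line arguments. The option names, default values, and paths are hypothetical, and the precedence order is an assumption rather than something decided in this thread:

```python
import argparse
import json


def parse_cli(argv=None):
    parser = argparse.ArgumentParser()
    # default=argparse.SUPPRESS leaves unset options out of the namespace,
    # so "not given on the command line" differs from an explicit value.
    parser.add_argument("--basedir", default=argparse.SUPPRESS)
    parser.add_argument("--time_hours", type=int, default=argparse.SUPPRESS)
    return vars(parser.parse_args(argv))


def merge_settings(defaults, stored, cli):
    """Later sources override earlier ones: defaults < settings file < CLI."""
    return {**defaults, **stored, **cli}


DEFAULTS = {"basedir": None, "time_hours": 1}

# Stand-in for the contents of a settings file.
stored = json.loads('{"basedir": "/data/from_settings", "time_hours": 8}')

merged = merge_settings(DEFAULTS, stored, parse_cli(["--basedir", "/data/from_cli"]))
print(merged)
```

With this layout, basedir can live in the settings file and still be overridden for a single run without editing the file.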