LLNL / maestrowf

A tool to easily orchestrate general computational workflows both locally and on supercomputers
https://maestrowf.readthedocs.io
MIT License

output directory naming #300

Open crkrenn opened 4 years ago

crkrenn commented 4 years ago

@jsemler & @FrankD412,

I know I've brought this up before, but it may have been by email.

My request is to enable more flexible naming options for output directories.

Currently, output directories are named output_path/prefix_date-time/step/instance.

I would like the ability to rearrange this to output_path/date/prefix_instance/step or output_path/date/run_INDEX/instance/step, where run_INDEX is a unique run_directory name.

crkrenn commented 4 years ago

@FrankD412, (cc: @jsemler)

Would you consider a pull request that enabled the user to define the directory structure?

The default would be "{{outputpath}}/{{prefix}}{{year}}{{month}}{{day}}-{{hour}}{{minute}}{{second}}/{{step}}/{{instance}}", and this would reproduce your current naming system.
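A minimal sketch of how such a template could be expanded, assuming the double-brace fields are filled by plain string substitution. The function name render_directory and the field values in the usage note are hypothetical; none of this is existing maestrowf API.

```python
import re
from datetime import datetime

# Default template copied from the proposal above.
DEFAULT_TEMPLATE = (
    "{{outputpath}}/{{prefix}}{{year}}{{month}}{{day}}-"
    "{{hour}}{{minute}}{{second}}/{{step}}/{{instance}}"
)

# Matches one {{field}} placeholder and captures the field name.
_FIELD = re.compile(r"\{\{(\w+)\}\}")


def render_directory(template, when, **fields):
    """Expand {{field}} placeholders (hypothetical helper, not maestrowf API)."""
    # Date and time fields are derived from a single timestamp so that
    # every path rendered for one run agrees on the run's start time.
    values = {
        "year": f"{when:%Y}",
        "month": f"{when:%m}",
        "day": f"{when:%d}",
        "hour": f"{when:%H}",
        "minute": f"{when:%M}",
        "second": f"{when:%S}",
    }
    values.update(fields)

    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Fail loudly on a misspelled field instead of leaving
            # literal braces in a directory name.
            raise KeyError(f"unknown template field: {name}")
        return str(values[name])

    return _FIELD.sub(substitute, template)
```

For example, with when=datetime(2020, 10, 13, 14, 43, 4), outputpath="sample_output/hello_world", prefix="hello_bye_world_", step="hello_world" and a made-up instance="GREETING.Hello", the default template renders to sample_output/hello_world/hello_bye_world_20201013-144304/hello_world/GREETING.Hello. A rearranged template such as "{{outputpath}}/{{year}}{{month}}{{day}}/{{prefix}}{{instance}}/{{step}}" would go through the same function unchanged.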

The directory template would also support an {{INDEX}} field that would increment a counter in a thread-safe way.
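One possible way to get a unique index without a shared counter file is to let directory creation itself act as the lock. The sketch below assumes the run directory is named run_INDEX and that the filesystem's mkdir is atomic (true for local POSIX filesystems; shared or network filesystems would need to be checked). The name claim_run_directory is hypothetical.

```python
import os


def claim_run_directory(parent, prefix="run_", max_tries=10000):
    """Create and return the first unused <prefix><index> directory.

    Hypothetical helper: os.mkdir either creates the directory or raises
    FileExistsError, so two threads or processes racing for the same
    index cannot both succeed; the loser simply tries the next index.
    """
    os.makedirs(parent, exist_ok=True)
    for index in range(1, max_tries + 1):
        candidate = os.path.join(parent, f"{prefix}{index}")
        try:
            os.mkdir(candidate)
        except FileExistsError:
            continue  # already claimed by an earlier or concurrent run
        return index, candidate
    raise RuntimeError(f"no free index under {parent} after {max_tries} tries")
```

The returned index could then be supplied as the value of the {{INDEX}} field. The scan is linear in the number of existing runs, which seems acceptable for a per-day directory but is a design trade-off rather than a requirement.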

The default behavior would be exactly as it is now.

-Chris

PS. I'm not the only end user who does not like the "{{outputpath}}/{{prefix}}{{year}}{{month}}{{day}}-{{hour}}{{minute}}{{second}}/{{step}}/{{instance}}" naming scheme and who would like something easier to use for real-world work.

FrankD412 commented 4 years ago

@crkrenn -- Sorry, my time has been consumed elsewhere and I didn't mean to give the impression that I was ignoring this issue. I'm open to a PR on this feature, but we need to discuss where this goes and the format that this should take and what options a user might want. We have a meeting next Tuesday, so we can pick it up then.

crkrenn commented 4 years ago

Hello Frank, @dinatale2

Are all study steps stored in a flat directory structure? That's what the code seems to suggest, but I am not completely sure. (I am starting the directory naming work now...)

In the example below, the run directories hello_world and bye_world are in the same parent directory.

Thanks!

-Chris

(venv) ➜  hello_world git:(feature/links) ls -ltr sample_output/hello_world/hello_bye_world_20201013-14*4
total 56
drwxr-xr-x@ 3 crkrenn  staff    96 Oct 13 14:43 logs
-rw-r--r--@ 1 crkrenn  staff   733 Oct 13 14:43 hello_bye_parameterized.yaml
-rw-r--r--@ 1 crkrenn  staff  2364 Oct 13 14:43 hello_bye_world.study.pkl
-rw-r--r--@ 1 crkrenn  staff    12 Oct 13 14:43 batch.info
drwxr-xr-x@ 6 crkrenn  staff   192 Oct 13 14:43 meta
drwxr-xr-x@ 6 crkrenn  staff   192 Oct 13 14:43 hello_world
drwxr-xr-x@ 6 crkrenn  staff   192 Oct 13 14:44 bye_world
-rw-r--r--@ 1 crkrenn  staff  7063 Oct 13 14:44 hello_bye_world.pkl
-rw-r--r--@ 1 crkrenn  staff  1379 Oct 13 14:44 status.csv
-rw-r--r--@ 1 crkrenn  staff   146 Oct 13 14:44 hello_bye_world.tx

FrankD412 commented 4 years ago

@crkrenn -- Yeah, the directory structure in its current state is usually flat. The only nesting is that a step folder can contain subfolders for its parameterized equivalents.
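To make the layout described here concrete, a small sketch that walks a study's output directory and reports each step folder with any subfolders beneath it. It assumes that logs and meta (seen in the listing above) are bookkeeping folders rather than steps; that exclusion list and the name list_step_workspaces are illustrative assumptions, not maestrowf behavior.

```python
from pathlib import Path

# Assumption: these top-level folders from the listing are not steps.
NON_STEP_DIRS = {"logs", "meta"}


def list_step_workspaces(study_dir):
    """Map each step folder name to the sorted names of its subfolders."""
    layout = {}
    for entry in sorted(Path(study_dir).iterdir()):
        if entry.is_dir() and entry.name not in NON_STEP_DIRS:
            # Step folders sit side by side at the top level; any
            # parameterized instances appear one level below the step.
            layout[entry.name] = sorted(
                child.name for child in entry.iterdir() if child.is_dir()
            )
    return layout
```

A step with no parameterized instances maps to an empty list, which is the flat case.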