Alter the default behavior for organizing outputs

mabruzzo commented 5 months ago

Overview

This PR changes the way that Cholla organizes output files.

Previously all outputs were written into a single output directory.

This is fine when you have a small number of outputs.
However, when you have a lot of outputs, it can be hard to read (and identify the name of the most recent output).
Plus its a little tedious to selectively delete files

This PR changes the default output behavior. Now we group all outputs written at a single simulation cycle into a single directory. To use the older behavior you can set the new legacy_flat_outdir runtime parameter to 1 (in the future, I think it would be good to remove the older behavior completely).

More details

Output file paths traditionally followed the following template:

    "{outdir}{nfile}{pre_extension_suffix}{extension}[.{proc_id}]"

where each curly-braced token represents a different variable. In detail:

{outdir} is the parameter from the parameter file. The historical behavior (that we currently maintain), if this is non-empty, then all characters following the last '/' are treated as a prefix to the output file name (if there aren't any '/' characters, then the whole string is effectively a prefix.
{nfile} is the current file-output count.
{pre_extension_suffix} is the pre-hdf5-extension suffix. It's the suffix that precedes the file extension (or {extension})
{extension} is the filename extension. Examples include ".h5" or ".bin" or ".txt".
{proc_id} represents the process-id that held the data that will be written to this file. In non-MPI runs, this will be omitted.

The new default behavior is to create

    "{outdir}/{nfile}/{nfile}{pre_extension_suffix}{extension}[.{proc_id}]"

where the the significance of each curly-braced token is largely unchanged. There are 2 things worth noting:

all files written at a single simulation-cycle are now grouped in a single directory
{outdir} never specifies a file prefix. When {outdir} is empty, it is treated as "./". Otherwise, we effectively append '/' to the end of {outdir}

Implementation Details

While doing all of this, I took the opportunity to consolidate the logic for formatting output filenames. Previously the logic was scattered throughout the codebase. Now, the logic is encapsulated by a class called FnameTemplate.

I acknowledge that it looks a little funny that we construct an FnameTemplate instance everywhere we want to determine an output filename (rather than just defining a function). I mostly did this in anticipation of some future refactoring of the io section of the codebase that I plan to do (basically it involves storing the output-related parameters outside of the global Parameter struct). In the short term, I think it would actually be nice to pass around a single FnameTemplate instance to all of the functions that use it, but I'm trying not to change function interfaces too much in the PR (I need to backport this change to the particle-feedback branch I have, which branched near the start of last summer). If you strongly dislike this design choice, I can change it.

Other Thoughts:

Currently the output filenames differ between a simulation compiled with MPI run with a single process compared to a simulation compiled without MPI. All of the filenames written by the MPI version will end in .h5.0. The non-MPI version writes filenames terminating in .h5. Going forward, I actually think it would be better to terminate the filenames with .h5.0 in the non-MPI case (for the sake of consistency). But I think that's a topic for another day

brantr commented 5 months ago

This PR affects all downstream analysis. I agree it's "better", but it will have substantial downstream impact (either analysis scripts will be changed to reflect the output structure, or scripts will need to be used to move files around).

I think the change to serial file naming is terrific, if that matters.

evaneschneider commented 5 months ago

This PR affects all downstream analysis. I agree it's "better", but it will have substantial downstream impact (either analysis scripts will be changed to reflect the output structure, or scripts will need to be used to move files around).

I think the change to serial file naming is terrific, if that matters.

Yes for sure! This was on the CAAR to-do list for a long time and we never got around to it. But since @mabruzzo added a flag to maintain the legacy behavior for now, I think the impact should be minimal for current users. This is a good reminder though that we should double-check the concatenation and plotting examples in the python scripts directory to make them consistent.

bcaddy commented 5 months ago

I think that this will break the concatenation scripts. I'm happy to help/advise on how to update them for this but I think it should be part of the same PR or at least those changes should be in flight before this is merged.

mabruzzo commented 5 months ago

If it would be less controversial, we could make the default-behavior the default choice. With that said, it would probably hinder efforts gradually phase out the old system.

It's unclear why the continuous integration failed. There doesn't seem to be any useful error messages when I click details

bcaddy commented 5 months ago

If the CI stuff fails randomly try just rerunning it. Sometimes errors like that are just the file system or jenkins having issues.

evaneschneider commented 5 months ago

I think that this will break the concatenation scripts. I'm happy to help/advise on how to update them for this but I think it should be part of the same PR or at least those changes should be in flight before this is merged.

It should just be a matter of changing a single line of code in each of them, right? If so, I'm fine with just doing that as part of this PR. Although it's actually not clear to me from a quick glance that the concatenation scripts themselves would break, since they take the source directory as an input. It seems like it's the dask templates that would need to be edited.

mabruzzo commented 5 months ago

I think that this will break the concatenation scripts. I'm happy to help/advise on how to update them for this but I think it should be part of the same PR or at least those changes should be in flight before this is merged.

It should just be a matter of changing a single line of code in each of them, right? If so, I'm fine with just doing that as part of this PR. Although it's actually not clear to me from a quick glance that the concatenation scripts themselves would break, since they take the source directory as an input. It seems like it's the dask templates that would need to be edited.

Things will break since they try to iterate over all of the outputs. But fixing that is easy enough. I'll make of point of doing that

mabruzzo commented 5 months ago

If the CI stuff fails randomly try just rerunning it. Sometimes errors like that are just the file system or jenkins having issues.

Is there any way to do that without pushing another commit?

evaneschneider commented 5 months ago

I think that this will break the concatenation scripts. I'm happy to help/advise on how to update them for this but I think it should be part of the same PR or at least those changes should be in flight before this is merged.

It should just be a matter of changing a single line of code in each of them, right? If so, I'm fine with just doing that as part of this PR. Although it's actually not clear to me from a quick glance that the concatenation scripts themselves would break, since they take the source directory as an input. It seems like it's the dask templates that would need to be edited.

Things will break since they try to iterate over all of the outputs. But fixing that is easy enough. I'll make of point of doing that

Ah yes good point, I'm not as familiar with the newer behavior. Thanks!

cholla-hydro / cholla