GEOS-ESM / swell

Workflow system for coupled data assimilation applications
https://geos-esm.github.io/swell/
Apache License 2.0
14 stars 4 forks source link

(SE) Use `swell_static_files_user` as an alternate location for all tasks #360

Closed Dooruk closed 1 month ago

Dooruk commented 4 months ago

Currently, swell_static_files_user is only used in generate_b_climatology_by_linking task. For others tasks and CI tests, the generic location for static files is /discover/nobackup/projects/gmao/advda/SwellStaticFiles. However, we could give users ability to use swell_static_files_user for their own static files as an alternative to the generic location.

ashiklom commented 4 months ago

I found references to swell_static_files in the following tasks:

From what I can tell, in GenerateBClimatologyByLinking, the current behavior is that both swell_static_files_main and swell_static_files_user are searched, with preference given to swell_static_files_user and then, any additional files not present in the user directory are linked from swell_static_files_main.

Do we want similar behavior for the other tasks as well? I.e., We effectively pull in every file from swell_static_files_main and then treat any files present in swell_static_files_user effectively as optional overrides?

One key issue with this behavior is that there is no way to force swell to not copy over files from swell_static_files_main that it otherwise would. For example, if your swell_static_files_user contains everything in swell_static_files_main except a few files that you want excluded, those files will be included anyway --- there's no way to specify that certain files need to be skipped. In practice, that may not be a problem, but I wanted to flag it and give you a chance to weigh in.

Note that an insidious version of this is if you have the same files but the naming and/or directory structure of the files changes between swell_static_files and swell_static_files_user. Then, you may end up with multiple files/directories that, in practice, contain duplicates of the same data.

Personally, my vote would be to just define a task-specific static_files parameter with some default value that could be overridden and then force the user to provide a complete set of required static files even if it means copying. That's more work on the user in exchange for clarity of behavior --- you always know exactly where any given swell environment got its files. Whereas, under the current rules, a user may be surprised to see "extra" static files that were pulled in from the global location. I.e., This would mean changing the behavior of GenerateBClimatologyByLinking to something like:

        swell_static_files = self.config.swell_static_files()
        swell_static_files_user = self.config.swell_static_files_user(None)
        if swell_static_files_user is not None:
            swell_static_files = swell_static_files_user
# [...]
        source_paths = [os.path.join(swell_static_files, ...)]
        link_all_files...(self.logger, source_paths, target_path)

Alternatively, maybe the point is that, if you want to provide a complete replacement, you just set the swell_static_files parameter (as an override?) and swell_static_files_user is specifically for this "file merging" behavior.

Let me know what you think.

Dooruk commented 4 months ago

Personally, my vote would be to just define a task-specific static_files parameter with some default value that could be overridden and then force the user to provide a complete set of required static files even if it means copying. That's more work on the user in exchange for clarity of behavior --- you always know exactly where any given swell environment got its files.

I thought about this and read your thoughts. I think users should choose only one of these options (user or global). As you mentioned, combining sources make it very confusing and hard to follow which files/configurations are being used. Also, in a way, users would get familiar with how to use SWELL and where files are coming from (given that we should provide documentation on that 😄). I personally learn the most when the code fails!

Alternatively, maybe the point is that, if you want to provide a complete replacement, you just set the swell_static_files parameter (as an override?) and swell_static_files_user is specifically for this "file merging" behavior.

I think user still should have full control of their local static folder and make their changes there. That's how Ricardo and I have been operating.

This is a separate concern but during the SWELL development meeting there was a brief discussion on version control with the swell_static_files. Sounds like there is no straightforward solution to that since it requires handling many different large file formats. My suggestion for now is that we can name the static files as the SWELL version and keep the previous static folders up to 4-5 most recent versions.

ashiklom commented 3 months ago

I thought about this and read your thoughts. I think users should choose only one of these options (user or global). As you mentioned, combining sources make it very confusing and hard to follow which files/configurations are being used. Also, in a way, users would get familiar with how to use SWELL and where files are coming from (given that we should provide documentation on that 😄). I personally learn the most when the code fails!

Got it. In that case, I think the correct implementation is not to have a separate swell_static_files_user, but rather to make swell_static_files an override-able configuration option (with an associated questionary question) that has the current swell_static_files value as the default. That will be a minor breaking change to GenerateBClimatologyByLinking, but shouldn't affect any of the other tasks.

This is a separate concern but during the SWELL development meeting there was a brief discussion on version control with the swell_static_files. Sounds like there is no straightforward solution to that since it requires handling many different large file formats. My suggestion for now is that we can name the static files as the SWELL version and keep the previous static folders up to 4-5 most recent versions.

Agreed that we should figure this out, and also that it should be a separate issue. I'll open that now.

ashiklom commented 3 months ago

@Dooruk Sorry, one more question: Is there a use case where users would use swell_static_files_user for some tasks but swell_static_files for other tasks? Or should users really be using the same swell_static_files for all tasks within a single swell run?

If the latter, this becomes trivial because it already works --- we just drop swell_static_files_user altogether and just make people use the swell_static_files. If you need to change the static files, you just set the swell_static_files override.

If there are situations where, within a single swell run, you might want to use swell_static_files for some tasks but swell_static_files_user for other tasks, then we really do need a swell_static_files_user. That's also not hard to do.

Dooruk commented 3 months ago

The more I think about it the more indecisive I get, but let me share my thoughts.

Short answer, yes an option to pick which tasks one would like to use their own folder would be useful, but I just worry this complicates usage even further. R2D2 does some of what you are proposing elegantly with the fetch order, but does not specify tasks.

In the case of GenerateBClimatology and StageJedi, one might be using SWELL in coupled mode and might want geos_atmosphere files from the shared static folder while using geos_marine files from their local folder.

In the case of GetGeosRestart and PrepGeosRunDir users should be able to point to their static folder with GEOS experiment settings and restarts. Actually, that should allow for external folders too not just the static folder, perhaps I can rewrite that task at some point.

There would be more tasks using this in the future.

ashiklom commented 3 months ago

Understood, thanks! For now, I'll implement swell_static_files_user for all tasks that currently support swell_static_files, with this logic (pseudocode):

swell_static_files = Config.swell_static_files()
swell_static_files_user = Config.swell_static_files_user()
if swell_static_files_user:
  swell_static_files = swell_static_files_user

do_stuff(swell_static_files)

I'm not thrilled about this design for multiple reasons, but pending large scale overhaul of how we do configs (per #314), this is probably the easiest thing to do.