Open micaeljtoliveira opened 6 months ago
@dougiesquire Thanks again for the comments!
My first thought is that these tools are sufficiently general and useful beyond OM3 to justify moving them into their own package. Something to do now, later or never?
Yes, definitely more general than OM3, both the profiling and the config files. So at least we should consider renaming it to something like access-utils
or access-pyutils
? (This is actually a more general question: we are developing lots of ancillary code and tools for the ACCESS models. How should these be organized? Should this be discussed at a higher level?)
About splitting it up, I'm a bit hesitant. I see the value of having more modular and more focused packages, but on the other hand having lots of small packages increases the maintenance burden and dependency hell for users. Not easy to find the right balance... Currently there are less than 1000 lines of python in this repo (tests excluded), so maybe we can leave as is for now?
Since you asked for it, here's an attempt at a proposed restructure that I think makes things clearer for users and developers.
I like your proposal, but seems to go against recommended best practices? Although 3 levels is probably still okay?
You could even import all the parsers and analysis functions in profiling.init.py so that one does not need to go through the parser and analysis subpackages.
This would reduce the nesting by one level, right? So I would consider it a must have.
Yes, definitely more general than OM3, both the profiling and the config files. So at least we should consider renaming it to something like access-utils or access-pyutils? (This is actually a more general question: we are developing lots of ancillary code and tools for the ACCESS models. How should these be organized? Should this be discussed at a higher level?)
About splitting it up, I'm a bit hesitant. I see the value of having more modular and more focused packages, but on the other hand having lots of small packages increases the maintenance burden and dependency hell for users. Not easy to find the right balance... Currently there are less than 1000 lines of python in this repo (tests excluded), so maybe we can leave as is for now?
Yeah, I think we probably need to discuss this more broadly. I like access-utils
. I'm not sure the py
adds much, especially since there may end up being a CLI.
I like your proposal, but seems to go against recommended best practices? Although 3 levels is probably still okay?
Yeah, I probably tend to overdo the level of nesting. In this project in particular, there's a tendency towards more nesting because the scope is quite broad and it's helpful (to me at least) to have namespaces for the different "functionalities". We could do away with the parser
subpackage? E.g.
om3utils/
├── __init__.py
├── config_file
│ ├── __init__.py # import all read_*.py and write_*.py functions
│ ├── mom6_input.py
│ ├── nuopc_config.py
│ └── payu_config_yaml.py
├── profiling
│ ├── __init__.py # import Parsers and analysis functions
│ ├── base.py # contains ProfilingParser from profiling.py
│ ├── esmf.py # contains all of esmf_profiling.py and esmf_trace.py
│ ├── fms.py # contains all of fms_profiling.py
│ └── analysis.py # contains all of profiling_analyses.py and parse_profiling_data from profiling.py
└── utils.py
Or even put all the ProfilingParsers in their own submodule (though this may not scale very well...)?
om3utils/
├── __init__.py
├── config_file
│ ├── __init__.py # import all read_*.py and write_*.py functions
│ ├── mom6_input.py
│ ├── nuopc_config.py
│ └── payu_config_yaml.py
├── profiling
│ ├── __init__.py # import Parsers and analysis functions
│ ├── parser.py # contains ProfilingParser from profiling.py, all of esmf_profiling.py and esmf_trace.py and all of fms_profiling.py
│ └── analysis.py # contains all of profiling_analyses.py and parse_profiling_data from profiling.py
└── utils.py
Please note I don't feel strongly about any of this.
PS. @aidanheerdegen may have thoughts and opinions about all of this as he has interest in at least the profiling tools.
In this project in particular, there's a tendency towards more nesting because the scope is quite broad and it's helpful (to me at least) to have namespaces for the different "functionalities".
If we broaden the scope beyond OM3, then I think it would made sense for the top-level namespace to be an "organization" namespace, with functionalities as second level:
access/
├── __init__.py
├── config_files
│ ├── __init__.py # import all read_*.py and write_*.py functions
│ ├── mom6_input.py
│ ├── nuopc_config.py
│ └── payu_config_yaml.py
├── profiling
│ ├── __init__.py # import Parsers and analysis functions
│ ├── base.py # contains ProfilingParser from profiling.py
│ ├── esmf.py # contains all of esmf_profiling.py and esmf_trace.py
│ ├── fms.py # contains all of fms_profiling.py
│ └── analysis.py # contains all of profiling_analyses.py and parse_profiling_data from profiling.py
└── utils.py
Then one could do:
from access.profiling import ESMFProfilingParser
from access.profiling import (
parse_profiling_data,
scaling_efficiency
)
stats = parse_profiling_data(
run_dirs,
ESMFProfilingParser,
varname,
getvar
)
eff = scaling_efficiency(stats)
One thing I like with this approach is that it's easy to split the package later on without changing the namespace. It would also make it easy to add existing packages to the access
namespace, like for example esmgrids
, if we want to.
Would welcome any thoughts you might have @jo-basevi or @truth-quark
This looks great @micaeljtoliveira. My first thought is that these tools are sufficiently general and useful beyond OM3 to justify moving them into their own package. Something to do now, later or never?
Since you asked for it, here's an attempt at a proposed restructure that I think makes things clearer for users and developers. Of course, this is all a matter of opinion so feel free to ignore all or parts of it:
Then the profiling tools could be used as follows:
You could even import all the parsers and analysis functions in
profiling.__init__.py
so that one does not need to go through theparser
andanalysis
subpackages.(Note I'm not really sold on the name of the
config_file
subpackage)Originally posted by @dougiesquire in https://github.com/COSIMA/om3-utils/pull/3#pullrequestreview-2051568344