Open zhangshixuan1987 opened 3 weeks ago
@chengzhuzhang @forsyth2 : Hi Jill and Ryan, I closed the https://github.com/E3SM-Project/zppy/pull/624 as some of the fixes are made on the issues pointed our by you. I open this new pull request for the continuing discussion.
@zhangshixuan1987
From the original PR#624, I ran into an env problem that suggests I can't access your build of pcmdi_metrics_master conda environment. All other tasks like ts
ran.
EnvironmentNameNotFound: Could not find conda environment: pcmdi_metrics_master
However In the new PR I ran into an error much ealier, right after issuing the zppy command line, as below:
(zppy_dev_24) [ac.zhang40@chrlogin2 test_zppy]$ zppy -c post.v3.LR.historical_pmp.cfg
Traceback (most recent call last):
File "/home/ac.zhang40/y/envs/zppy_dev_24/bin/zppy", line 5, in <module>
from zppy.__main__ import main
File "/home/ac.zhang40/y/envs/zppy_dev_24/lib/python3.9/site-packages/zppy/__main__.py", line 18, in <module>
from zppy.pcmdi_diags import pcmdi_diags
File "/home/ac.zhang40/y/envs/zppy_dev_24/lib/python3.9/site-packages/zppy/pcmdi_diags.py", line 8, in <module>
from zppy.utils import (
ImportError: cannot import name 'checkStatus' from 'zppy.utils' (/home/ac.zhang40/y/envs/zppy_dev_24/lib/python3.9/site-packages/zppy/utils.py)
I think the error is caused by the conflicting zppy/__main__.py
In any case, I think i now have a better understand of your implementation, after testing and reviewing the results. And perhaps we can have a chat sometime next week about design details.
@chengzhuzhang: Hi Jill, when I submitted this new PR, I did a code rebase to make sure that my code is consistent with the zppy master branch. However, I just found that a certain amount of the changes have been made in zppy-master branch. Due to these changes, I need to make modifications to my pcmdi workflow. Overall, please hold, and I may need to resubmit a PR again. I apologize for the inconvenience.
@zhangshixuan1987 Oh, no worries! I didn't realize it is still a draft. Thanks for the heads-up. I'm happy to test again when it is ready. No hurry.
we can have a chat sometime next week about design details.
Due to these changes, I need to make modifications to my pcmdi workflow. Overall, please hold, and I may need to resubmit a PR again. I apologize for the inconvenience.
I've unfortunately been busy with other priorities, but I've been trying to follow this work at a high level. I would like to again raise my concern about repeating design decision mistakes that have caused issues in the global_time_series
task. It may be a good idea to discuss design further before proceeding with more work.
However, I just found that a certain amount of the changes have been made in zppy-master branch.
Yes, I mentioned in https://github.com/E3SM-Project/zppy/pull/624#discussion_r1800306399 that #628 made some changes to clean up the zppy
code base. That's probably the biggest change.
I'm hesitant to make too many comments without fully reviewing and running the latest code myself, but my specific design decision concerns at the moment are the following. Perhaps I just need more context though, which we can discuss in a meeting.
I agree strongly that it would be much easier to navigate the results with a viewer page!
It appears we are once again adding a de facto analysis package inside zppy
. #616 is really turning the global_time_series
task into that -- a piece of software with scope and environment distinct from zppy
that is however fundamentally integrated into zppy
. This has caused two major issues:
global_time_series
. #616 is adding unit tests for the global_time_series
. I have to run those tests from a different environment than the zppy
dev environment because that code runs in the environment that the global_time_series
task is running in, not the environment zppy
is running in. This convolutes an already complicated testing scheme -- e.g., see #634 for the process required to do weekly integration testing.global_time_series
code has different dependencies than the zppy
code. This complicated @xylar's question of what code exactly had numpy
dependencies (global_time_series
needs numpy
, but zppy
does not). Furthermore, this means there isn't really a dev environment for global_time_series
. And I suspect a similar issue here is responsible for the EnvironmentNameNotFound: Could not find conda environment: pcmdi_metrics_master
error mentioned in https://github.com/E3SM-Project/zppy/pull/640#issuecomment-2452618239.global_time_series
developed these issues over time as it grew from a small task into a complex pseudo-package. We need to discuss the design decisions for implementing the PMP task, before repeating known issues. zppy
is already a complex piece of code requiring python > bash
interaction. These built-in analysis packages add a third layer of complexity: python > bash > python
.
I've unfortunately been busy with other priorities, but I've been trying to follow this work at a high level. I would like to again raise my concern about repeating design decision mistakes that have caused issues in the
global_time_series
task. It may be a good idea to discuss design further before proceeding with more work.I'm still not convinced that we should consider a stand-alone package for the
global_time_series
, mostly because the potential maintenance overhead. Also theglobal_time_series
capability is very specific about its input: the global average files are generated with a relatively recent new NCO capability. It doesn't feel like we need a new package for just visualization global time-series and to aggregate the results.
My view of zppy is that it not only provide interface to connect multiple diagnostics tools, it also closes any gaps that these packages won't cover but E3SM requires. In the case of zppy-PMP workflow, there is missing pieces from PMP, including decoupled plotting capability and no viewer is generated. It feels that it is inevitable that we need to use zppy to close such gaps.
Another thing i would note is that there are several scripts use cdms
or other CDAT utilities in this PR, which we should migrate away .
My view of zppy is that it not only provide interface to connect multiple diagnostics tools, it also closes any gaps that these packages won't cover but E3SM requires.
If that is going to continue to be the purview of zppy. I think there needs to be a yaml file or something similar that defines the dependencies and their constraints for each tool like global_time_series
that has unique dependencies. I need somewhere I can go to know what packages E3SM-Unified needs to have to support not just zppy, the workflow management tool, but its pseudo-sub-packages.
@xylar yes, dependencies such as numpy
, and the upcoming outputviewer
, should be explicitly listed in a yaml file.
I think Ryan's point is that global_time_series is currently poorly maintained because its buried inside zppy. The cost of making it a separate package is offset by the benefit of better testing.
I agree with both @rljacob and @forsyth2, I don't think it's great if zppy becomes a kitchen-sink package. The extra maintenance is already there and breaking our distinct packages should provide transparency for users and developers. I realized we're short on staff. But bundling packages together doesn't really solve that problem, it just obscures it.
In general I think zppy should return to being a workflow manager and not try to also be the tool that adapts all other tools for E3SM's needs. That's really messy to try to bundle into a single package.
hmmm, is it possible to make the global_time_series capability more modular and testable in zppy before considering packaging it as a stand-alone package? I don't know if such a package is useful beyond zppy.
@chengzhuzhang @xylar @forsyth2: Hi all, I may not fully understand all your discussions here. The EnvironmentNameNotFound: Could not find conda environment: pcmdi_metrics_master
is because the pcmdi_metrics_master is a customized environment used for the development of zppy-pcmdi workflow for the following reasons:
To me, The above issue will not be a problem anymore once the PCMDI is updated in the e3sm-unified environment.
One step further, I currently used the ZPPY preferred method to load the conda environment pcmdi_metrics_master by adding following lines in the ".cfg" file:
I think that @chengzhuzhang has a problem calling "conda activate pcmdi_metrics_master" as there is no such coda environment installed in her local machine. However, this is not because there is a module called "pcmdi_metrics_master" within any Python scripts of the zppy-pcmid workflow. My point here is that the above method to activate the customized PCMDI library will not cause any issues within zppy-pcmdi workflow.
@zhangshixuan1987, you are right. We are hijacking your PR to have what has meandered into a very important discussion but one that is not related.
Let's move the broader discussion to https://github.com/E3SM-Project/zppy/discussions/641 and keep this PR focused on its original purpose.
There is still, of course, room to discuss whether this work should be done outside of zppy for similar reasons to what moves over to that discussion.
@zhangshixuan1987 Sorry for this distraction. I'm grateful that Xylar created a separate discussion item.
For the pcmdi_diags_master
the environment. It is possible to source someone's locally built conda env through commands like:
source /home/ac.forsyth2/miniconda3/etc/profile.d/conda.sh; conda activate e3sm_diags_20241015
This is how I source @forsyth2 's locally built e3sm_diags environment. I think we can do something similar for using your built pmp environment: first to source your conda path, and then activate the env.
@chengzhuzhang @forsyth2: I am not intended to interrupt the discussions. I think all discussions here are very useful to me as well. I am also very interested in the further discussions at the https://github.com/E3SM-Project/zppy/discussions/641.
@chengzhuzhang : Hi Jill, I have checked and made modifications to the PCMDI workflow. The test on my side indicates that the workflow can work as expected.
Below are the example results for the E3SM AMIP simulation run with revised workflow:
An updated ".cfg" file is attached here for your test:
Also, I think that my built "pcmdi_metrics_master" environment can not be sourced as like @xylar's likely because I did not install the miniconda environment by myself, rather, I relied on the "e3sm_unified" environment on Chrysalis. Therefore, to make sure that you can test the package successfully, I converted my package to a ".yml" file:
pcmdi_metrics_master.yml I think you can use this file to build an equivalent environment for test purposes (I apologize for the extra work on your side).
Finally, I am trying to work on a viewer function following the one in E3SM diagnostics, I will let you know if I can quickly figure this out. Please let me know if you could be available for a meeting, and I would like to discuss with you for the strategy and the requirements that I should follow.
Issue resolution
Select one: This pull request is...
1. Does this do what we want it to do?
Objectives:
Required:
If applicable:
2. Are the implementation details accurate & efficient?
Required:
If applicable:
zppy/conda
, not just animport
statement.3. Is this well documented?
Required:
4. Is this code clean?
Required:
If applicable: