Closed rmshkv closed 2 months ago
Also, I don't love how the warnings show up currently - you end up with a large block of text mixed in with all the Ploomber stuff and the unhelpful warnings.warn()
... code used to generate the warnings gets printed out too. Please feel free to propose a better way if you know one!
Hi @rmshkv , I'll look at this PR in more detail, but wanted to post a quick note as I am realizing that there will probably be some conflicts between this PR and #78. I think that's fine, and I can update #78 if this comes in first. We may also want to discuss whether the component flags also result in timeseries being run for only the specified components (eg, if you run cupid-run config.yml -ts -atm
, that would generate timeseries and run the notebooks for atm
only). I think this may be ideal and still seems clear to me, but I'm open to other thoughts there.
Hi @rmshkv , I'll look at this PR in more detail, but wanted to post a quick note as I am realizing that there will probably be some conflicts between this PR and #78. I think that's fine, and I can update #78 if this comes in first. We may also want to discuss whether the component flags also result in timeseries being run for only the specified components (eg, if you run
cupid-run config.yml -ts -atm
, that would generate timeseries and run the notebooks foratm
only). I think this may be ideal and still seems clear to me, but I'm open to other thoughts there.
Ah thanks for the heads up, I didn't realize how far along the timeseries generation work was! I think there are several different ways we could integrate these features...potentially even something like including the timeseries generation code for each component under compute_scripts
, which already get turned on or off by the component flags but could then also be affected by the -ts
flag? Maybe we can chat about it at the meeting tomorrow.
Most of my comments on this pass-through are related to reducing the amount of duplicated code. There is still a lot of similarities between compute_notebooks
and compute_scripts
sections, but cleaning that up is probably best left for a separate issue
Hi @rmshkv , I'll look at this PR in more detail, but wanted to post a quick note as I am realizing that there will probably be some conflicts between this PR and #78. I think that's fine, and I can update #78 if this comes in first. We may also want to discuss whether the component flags also result in timeseries being run for only the specified components (eg, if you run
cupid-run config.yml -ts -atm
, that would generate timeseries and run the notebooks foratm
only). I think this may be ideal and still seems clear to me, but I'm open to other thoughts there.
I just implemented this (-ts and component flags will only run those components). There's a tiny bit of extra code because the timeseries block references components as "atm", "ocn", etc and the notebooks/scripts blocks reference them as "atmosphere", "ocean", etc but it works fine and we can address it later, in the interest of getting this PR in.
Hi @rmshkv , I'll look at this PR in more detail, but wanted to post a quick note as I am realizing that there will probably be some conflicts between this PR and #78. I think that's fine, and I can update #78 if this comes in first. We may also want to discuss whether the component flags also result in timeseries being run for only the specified components (eg, if you run
cupid-run config.yml -ts -atm
, that would generate timeseries and run the notebooks foratm
only). I think this may be ideal and still seems clear to me, but I'm open to other thoughts there.I just implemented this (-ts and component flags will only run those components). There's a tiny bit of extra code because the timeseries block references components as "atm", "ocn", etc and the notebooks/scripts blocks reference them as "atmosphere", "ocean", etc but it works fine and we can address it later, in the interest of getting this PR in.
On second thought, maybe we do want them to match...I'll make everything use the short names, and we can change that later if we decide to.
Hi @rmshkv , I'll look at this PR in more detail, but wanted to post a quick note as I am realizing that there will probably be some conflicts between this PR and #78. I think that's fine, and I can update #78 if this comes in first. We may also want to discuss whether the component flags also result in timeseries being run for only the specified components (eg, if you run
cupid-run config.yml -ts -atm
, that would generate timeseries and run the notebooks foratm
only). I think this may be ideal and still seems clear to me, but I'm open to other thoughts there.I just implemented this (-ts and component flags will only run those components). There's a tiny bit of extra code because the timeseries block references components as "atm", "ocn", etc and the notebooks/scripts blocks reference them as "atmosphere", "ocean", etc but it works fine and we can address it later, in the interest of getting this PR in.
On second thought, maybe we do want them to match...I'll make everything use the short names, and we can change that later if we decide to.
If you haven't made the change yet, I'd prefer changing the time series to use the longer names... but if you are already using the short names we can clean that up in a future PR :)
Hi @rmshkv , I'll look at this PR in more detail, but wanted to post a quick note as I am realizing that there will probably be some conflicts between this PR and #78. I think that's fine, and I can update #78 if this comes in first. We may also want to discuss whether the component flags also result in timeseries being run for only the specified components (eg, if you run
cupid-run config.yml -ts -atm
, that would generate timeseries and run the notebooks foratm
only). I think this may be ideal and still seems clear to me, but I'm open to other thoughts there.I just implemented this (-ts and component flags will only run those components). There's a tiny bit of extra code because the timeseries block references components as "atm", "ocn", etc and the notebooks/scripts blocks reference them as "atmosphere", "ocean", etc but it works fine and we can address it later, in the interest of getting this PR in.
On second thought, maybe we do want them to match...I'll make everything use the short names, and we can change that later if we decide to.
If you haven't made the change yet, I'd prefer changing the time series to use the longer names... but if you are already using the short names we can clean that up in a future PR :)
I made the change just a minute ago, but it's not too hard to change back later. I also prefer the long names for clarity, but I don't know much about what the timeseries code is doing and it looks like some of the directories it's creating might depend on the component shortname, so I didn't want to touch that myself.
All right, I think that addresses everything requested! Feel free to run some tests again and let me know what you think.
Hi @rmshkv , I'll look at this PR in more detail, but wanted to post a quick note as I am realizing that there will probably be some conflicts between this PR and #78. I think that's fine, and I can update #78 if this comes in first. We may also want to discuss whether the component flags also result in timeseries being run for only the specified components (eg, if you run
cupid-run config.yml -ts -atm
, that would generate timeseries and run the notebooks foratm
only). I think this may be ideal and still seems clear to me, but I'm open to other thoughts there.I just implemented this (-ts and component flags will only run those components). There's a tiny bit of extra code because the timeseries block references components as "atm", "ocn", etc and the notebooks/scripts blocks reference them as "atmosphere", "ocean", etc but it works fine and we can address it later, in the interest of getting this PR in.
On second thought, maybe we do want them to match...I'll make everything use the short names, and we can change that later if we decide to.
If you haven't made the change yet, I'd prefer changing the time series to use the longer names... but if you are already using the short names we can clean that up in a future PR :)
I made the change just a minute ago, but it's not too hard to change back later. I also prefer the long names for clarity, but I don't know much about what the timeseries code is doing and it looks like some of the directories it's creating might depend on the component shortname, so I didn't want to touch that myself.
Just as an FYI, the time series code isn't super dependent on how the components are named; There's a comment in line 285 that would need to be updated, as well as just the names in lines 329, 52, and 232. I'm fine with keeping the short names for now, though.
All the tests I've run are looking good! I do also agree that updated the logic regarding 'all' is improved when the assumption is that 'all' is the default unless one component is specified, and a flag is not needed to specify 'all'. I think that the lack of a flag clarifies that 'all' is default instead of potential confusion with a default that can also be specified with a flag.
The code in run.py
looks great and cupid-run config.yml
generates output as expected, but cupid-build
isn't getting the sidebar right anymore:
I was hoping the sidebar would have the name of all the components, and then links to each of the component's notebooks underneath.
I really like how this turned out:
(cupid-dev) examples/coupled_model$ cupid-run -atm -glc config.yml
cupid/run.py:152: UserWarning: No notebooks for glc component specified in config file.
(And adf_quick_run.ipynb
was still executed).
The code in
run.py
looks great andcupid-run config.yml
generates output as expected, butcupid-build
isn't getting the sidebar right anymore:
I was hoping the sidebar would have the name of all the components, and then links to each of the component's notebooks underneath.
I really like how this turned out:
(cupid-dev) examples/coupled_model$ cupid-run -atm -glc config.yml cupid/run.py:152: UserWarning: No notebooks for glc component specified in config file.
(And
adf_quick_run.ipynb
was still executed).
My bad, I just forgot to change the explicit paths in the jupyter book config with the new short folder names. Should be fixed now!
My bad, I just forgot to change the explicit paths in the jupyter book config with the new short folder names. Should be fixed now!
After updating to f114b83 I needed to rerun cupid-run
and cupid-build
, but the page looks great now!
Addresses https://github.com/NCAR/CUPiD/issues/63.
This adds a layer of keys in
config.yml
that groups notebooks (and scripts) by ESM component, e.g.atmosphere
,ocean
, etc. These keys are referenced by flags that can be passed in tocupid-run
, e.g.-atm
,-ocn
, etc., as specified in the README. If no component flags are passed in, all components are run. This can also be done explicitly by using--all
or-a
.This update also prompted changes to how we're managing the environment (kernel) checking step that was being done by
util.get_control_dict()
called bycupid-run
. We still check the existence of all environments specified inconfig.yml
(regardless of whether that component is turned on by flags), but just raise a warning if the environment does not exist. If that notebook is specified to be run, another warning is raised and that notebook is not run, but the others still are. Also, if neither adefault_kernel_name
nor a notebook-specifickernel_name
is provided,cupid-analysis
is assumed and another warning is raised.