Open mathomp4 opened 2 years ago
CC @weiyuan-jiang, @tclune, @aoloso in re the regrid refactoring effort
Adding the ability for interp_restarts.x to handle these new cases (an ungridded dimension + level) is straightforward from a scientific perspective: just treat each index of the ungridded dimension as a 3-D variable in its own right and do the usual horizontal and vertical regridding. However, the code has gotten to the point that adding new capabilities like this calls for a refactoring, at least splitting the binary and NetCDF paths into separate codes so I can focus on adding these capabilities to the NetCDF restarts without breaking the binary path. Splitting this apart and adding new capabilities is a non-trivial exercise, so lead time and dedicated time to do this will be necessary.
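The "treat each index of the ungridded dimension as a 3-D variable" idea can be sketched roughly as follows. This is only an illustration in Python with NumPy, not the Fortran in interp_restarts.x: the function names, the (ungridded, lev, lat, lon) dimension order, and the nearest-neighbour stand-in for the real regridding are all assumptions.

```python
import numpy as np

def regrid_3d(field, nlev_out, ny_out, nx_out):
    """Stand-in for the real horizontal + vertical regridding.

    ASSUMPTION: simple nearest-neighbour index sampling, used here only so
    the sketch runs; the actual regridding is far more involved.
    """
    nlev, ny, nx = field.shape
    k = np.arange(nlev_out) * nlev // nlev_out
    j = np.arange(ny_out) * ny // ny_out
    i = np.arange(nx_out) * nx // nx_out
    return field[np.ix_(k, j, i)]

def regrid_4d(field, nlev_out, ny_out, nx_out):
    """Regrid a (n_ungridded, lev, lat, lon) array one ungridded index at a time."""
    slices = [regrid_3d(field[n], nlev_out, ny_out, nx_out)
              for n in range(field.shape[0])]
    return np.stack(slices)

# Hypothetical 4-D variable: 5 ungridded indices, 72 levels, 8 x 16 horizontal grid.
du = np.arange(5 * 72 * 8 * 16, dtype=float).reshape(5, 72, 8, 16)
out = regrid_4d(du, 72, 16, 32)
print(out.shape)  # (5, 72, 16, 32)
```

The point of the design is that the 3-D routine does not need to know about the ungridded dimension at all; the 4-D case is just a loop around it.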
@mmanyin @pcolarco It had been a long time since I looked at the code; I split interp_restarts.x into a binary and NetCDF code. That was fairly straightforward. Adding support for these 4-D variables in the NetCDF version of the program is actually more straightforward than I thought. Although this will mean a multiple repo PR for regrid.pl and the underlying regridding ...
@mmanyin @pcolarco @gmao-jstassi @mathomp4 I've updated interp_restarts.x to handle these new restarts (as well as refactoring and splitting interp_restarts.x into separate binary and NetCDF versions to make my life easier going forward). I've confirmed it works with the new GOCART2G restarts that are either 2D only or 4D with the unknown_dim + level. I've made contingent PRs for these in the FV3 and fvdycore repos.
The issue now is regrid.pl itself. It needs to know about the restart names, but I think we are getting to the point where it needs some more flexibility. Since each species in GOCART and each instance can have its own restart, like cabc_internal_rst, cabr_internal_rst, caoc_internal_rst, adding these explicitly in regrid.pl seems problematic. What if someone else has other instances of ca, for example? It seems like regrid.pl needs to be able to do some sort of wildcarding. Instead of adding these explicitly, you would have a wildcard like ca*_internal_rst and it finds any restarts of that form. Pinging Joe on this, as my Perl is too shaky to do this.
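To make the wildcard idea concrete, here is a minimal sketch in Python (regrid.pl itself is Perl, so this is not a proposed patch). The helper name `find_restarts` and the file list are made up for illustration; only the pattern ca*_internal_rst comes from the comment above.

```python
from fnmatch import fnmatchcase

def find_restarts(filenames, patterns):
    """Return the sorted names that match at least one shell-style pattern."""
    return sorted(
        name for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    )

# Hypothetical directory listing, for illustration only.
files = [
    "cabc_internal_rst",
    "cabr_internal_rst",
    "caoc_internal_rst",
    "catch_internal_rst",
    "fvcore_internal_rst",
]

matches = find_restarts(files, ["ca*_internal_rst"])
print(matches)
# ['cabc_internal_rst', 'cabr_internal_rst', 'caoc_internal_rst', 'catch_internal_rst']
```

Note that a pattern this broad also picks up any other name that starts with "ca", such as the catch_internal_rst entry in the made-up listing, so a real implementation would probably need tighter patterns or an exclusion list.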
Does this only affect the input name? Is there any special option? For the future, the Python regridder only cares about the names that are listed in the YAML file.
Sounds like good progress. Will it eventually be possible to regrid from the old GOCART 1G format to the GOCART 2G?
I would second @mmanyin. I think we would want to be able to regrid to GOCART2G from MERRA-2 for instance.
@mmanyin Can you elaborate what that would entail or what that even means? Is it just a matter of splitting an old gocart 1g restart into separate restarts?
That's beyond the scope of what the underlying regridding code (interp_restarts.x) would handle, it just regrids what is there already.
It would have to be some other script but someone who understands this would have to write it or give me a precise recipe for what that operation would entail.
> I would second @mmanyin. I think we would want to be able to regrid to GOCART2G from MERRA-2 for instance.

@pcolarco What does this mean? Would every field in a GOCART1G restart (MERRA2 or otherwise) go to a specific field in a specific GOCART2G split restart? If not, I'm not sure what this operation of going from GOCART1G to GOCART2G means.
As I said, I will not support this in interp_restarts.x; that's outside the scope of that program. If it is as simple as splitting the fields into separate restarts, that is another program's job; it would be a trivial Python script.
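If the operation really is just "send each field to the right split restart", the planning step of such a script might look like the sketch below. Everything specific here is hypothetical: the variable names, the prefix-to-restart table, and the helper `plan_split` are placeholders, since this thread does not establish the actual correspondence between GOCART1G fields and GOCART2G restarts. It also does no file I/O and does not address how separate fields would map onto the 4D (ungridded dimension) variables in the new restarts.

```python
from collections import defaultdict

# HYPOTHETICAL mapping from a variable-name prefix to an output restart name.
PREFIX_TO_RESTART = {
    "du": "du_internal_rst",
    "ss": "ss_internal_rst",
}

def plan_split(variable_names, prefix_to_restart):
    """Group variable names by the restart file they would be written to.

    Names that match no prefix are collected under "unassigned" so that
    nothing is silently dropped.
    """
    plan = defaultdict(list)
    for name in variable_names:
        for prefix, restart in prefix_to_restart.items():
            if name.lower().startswith(prefix):
                plan[restart].append(name)
                break
        else:
            plan["unassigned"].append(name)
    return dict(plan)

# Hypothetical variable names from a single combined restart.
plan = plan_split(["du001", "du002", "ss001", "CO"], PREFIX_TO_RESTART)
print(plan)
# {'du_internal_rst': ['du001', 'du002'], 'ss_internal_rst': ['ss001'], 'unassigned': ['CO']}
```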
As far as MERRA2 goes, this is a can of worms. Regridding directly from MERRA2 to GOCART2G is complicated by the fact that MERRA2 was binary, so there is no metadata in the file. I painstakingly figured out the order of the fields in the MERRA2 binary restarts a long time ago and wrote a converter here to convert them from binary to NetCDF using descriptor files that document the variable order in each file:
https://github.com/bena-nasa/GEOS5_restart_converter
If you want to do something with MERRA2 in this hypothetical GOCART1G to GOCART2G operation, you would first need to use my tool to convert the restarts to NetCDF, and then apply whatever solution exists for going from GOCART1G to GOCART2G to the NetCDF file.
I understand. A boy can dream. Splitting a netcdf legacy GOCART to GOCART2G should be straightforward with NCO or some other tool.
@bena-nasa I have been impressed in the past when needing to convert legacy restarts to a more recent version, that regrid.pl could identify what was missing and provide at least a bare bones set of restarts. Without really knowing what the limitations are, I was posing the general question -- can the program generate a set of G2G restarts that approximates an older G1G set? Sounds like this is not the tool for doing it.
No, regrid.pl just calls other programs that regrid the restarts that are there, using the boundary conditions it thinks are appropriate based on your answers to the questions; no more, no less. If you have a gocart_internal_rst in, you get one out. After I fix up regrid.pl, if you have a set of restarts from GOCART2G in, you get a set out.
@mmanyin @pcolarco This does bring up a point: what are all the "base" component names it needs to be aware of? I'm going to try to implement a wildcard feature in regrid.pl. Looking in the restarts Pete provided, we have ca, ss, du, ni, su, and these could potentially have multiple instances (in the restarts I have, only ca actually does) that I will use the wildcard feature to find. What am I missing, if anything? I'll code it to the list above, so please let me know if I need to include others.
@bena-nasa Your list looks complete for GOCART2G, although note it is "cabr", "cabc", and "caoc". You might anticipate an eventual refactoring of the remainder of the legacy GOCART, which would split what is still left in gocart_internal_rst into separate things like co, co2, ch4, ... _internal_rst.
@pcolarco Oh, did I misread that? I thought cabr, cabc, caoc were separate instances of one species, but now that I'm reading it again, that is just brown carbon, black carbon, and organic carbon. So per species there is only one restart no matter how many instances? If so, then I can just hard-code the names and I was making a problem out of nothing.
@bena-nasa Hmm... Each instance has its own restart. By default we are running three carbonaceous instances: brown, black, and organic carbon. We need to be able to regrid each such instance. Some guidance on how to handle multiple instances in later (i.e., so far not tried out) cases would be helpful. Does that make sense?
@pcolarco @jstassi I was only asking about the instances because the way regrid.pl currently works is that it has hard-coded restart names it looks for. If the name is not in the list, it won't regrid it. So this could be a problem if, for example, someone runs a new GOCART case with multiple instances and wants to regrid those, but the script is unaware of them. I see two possible solutions for this in regrid.pl
It sounds like for now the PR I've made handles the current uses but still needs another extension. Any thoughts on which method sounds better as an end user?
@bena-nasa Hey, Ben, is this functionality available now in some more recent model version for me to try out? Thanks
@pcolarco I think it should be in GEOSgcm v10.21.0 for sure.
Thanks, Matt. I did go check and see it in the CHANGELOG, so I'll give it a try.
Per @pcolarco and @mmanyin in email, there is a desire to regrid the new GOCART2G restarts. Looking at a set of them, the (possible) new files seem to be:
achem_internal
cabc_internal
cabr_internal
caoc_internal
du_internal
hemco_internal
ni_internal
ss_internal
su_internal
Now, normally we could just add these to The List™ that's in regrid.pl, but it's not that simple. I looked at these restarts with @bena-nasa and we noticed that:
hemco_internal is a 2D restart (no levels)
du_internal and ss_internal are 4D restarts (ungridded dims)
The main issue is the underlying regridding code was not set up for files like these.
So, I suppose the first question for @pcolarco or @amdasilva or @christophkeller is: Do we need to worry about hemco_internal? If not, can we just always bootstrap it and focus on the 4D restarts?