Open guoqing-noaa opened 2 months ago
@guoqing-noaa If there is a test domain available, it would help us find potential MPAS-JEDI issues at an early stage. But please don't slow down the development of cycled MPAS-JEDI DA.
I have a 3km domain available. If you think it is big enough to test, we can use it.
For these new domains, do we want to use the new MPAS code (8.2.1) and use mpasout as the model output and DA background?
Good question. Let's keep this for our next step. For now, let's repeat the previous steps using the previous MPAS code. Thanks!
@Junjun-NOAA That domain seems good to me for a CONUS domain run. What are your opinions, @ShunLiu-NOAA @guoqing-noaa?
@TingLei-NOAA As a sanity check of the performance of the high-resolution JEDI analysis, it is a good thing to try. For cycled DA tests, we may start with a low-resolution test.
@TingLei-NOAA Good with me as well. @Junjun-NOAA 's case uses init.nc for ensembles, so it will not be able to test letkf.
Jake told me that he had successfully tested mpasjedi for 10+ million cells recently, so it looks like we may run NA_3km mpasjedi cases now. @chunhuazhou is working on generating an NA_3km case.
@ShunLiu-NOAA @guoqing-noaa Thanks. So this 3km CONUS case would be good for my one-cycle variational test of mgbf. @Junjun-NOAA would you please point me to your case so I can run it on Hera? Thanks.
@TingLei-NOAA Here is my run directory on Hera : /scratch1/BMC/wrfruc/jjhu/rundir/RDASApp/expr/mpas_2024052700_3km Please let me know if you have any questions or comments. Thanks
@Junjun-NOAA Thank you so much. I will keep you updated on how things are going.
@chunhuazhou Could you update your progress here? General information and any error messages will help. Thanks!
As discussed at the RRFS developers' meeting, I am providing some updates here on my attempt to create an NA 3km domain.
Step 1: Use create_region to generate na3km.grid.nc by clipping from the global 3km grid. This can be done on the bigmem partition in a Slurm job and can take a few hours.
Step 2: Run init_atmosphere_model to generate na3km.static.nc. This requires a lot of memory for the NA 3km domain. I tried 60 nodes on kjet and still got OOM failures: "slurmstepd: error: Detected 1 oom_kill event in StepId=9418345.0. Some of the step tasks have been OOM Killed."
@chunhuazhou Thanks for sharing! It seems the problem is first with the MPAS preprocessing tool, right? Which machine are you using? Will simply increasing the number of nodes work?
@TingLei-NOAA The problem is that with so many cells in the NA3km domain, running MPAS requires a lot of memory. My step 2 only runs init_atmosphere_model, which requires far fewer resources than the MPAS model forecast itself. I am testing it on Jet. I will try increasing the number of nodes to see how many are needed for NA3km.
@chunhuazhou Could you try it on Gaea? It has much more memory.
@guoqing-noaa I haven't tried MPAS on Gaea yet. Have you tried it? Do we have all the required modules there?
@chunhuazhou Let me create one module file for you.
@chunhuazhou
Use the following command to load the modules needed by the MPAS model:
source /gpfs/f5/ufs-ard/world-shared/gge/c5_intel.sh
I successfully compiled MPAS.
@guoqing-noaa Thanks so much for the modules! Do you happen to have the WPS_GEOG files on Gaea? I am about to download them from the NCAR website, but if you already have them or know where I can find them, I can skip the download. Thanks!
Does it take a very long time? If yes, I can transfer a copy for you.
@guoqing-noaa MPAS compiled successfully. Thanks for the modules! Downloading WPS_GEOG files should not take too long, I think. Thanks!
Adding some updates here: after trying both Jet and Gaea, I found that the issue couldn't be fixed by increasing resources. Instead, adding one namelist entry, config_gwd_cell_scaling=1.1, to &preproc_stages fixed the failure of init_atmosphere_model when generating na3km.static.nc. The namelist option config_gwd_cell_scaling is the scaling factor for the effective grid cell diameter used in the computation of GWD static fields (default value is 1.0).
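For anyone who wants to script this change, here is a minimal sketch in Python, assuming namelist.init_atmosphere uses the usual one-option-per-line layout with each group closed by a lone "/". The helper name `set_namelist_option` and the abbreviated example namelist are hypothetical and only for illustration; the group name and the value 1.1 are the ones reported above.

```python
import re


def set_namelist_option(text: str, group: str, key: str, value: str) -> str:
    """Set `key = value` inside `&group` of a Fortran namelist given as text.

    Replaces the entry if it already exists in the group; otherwise inserts
    it just before the group's closing "/".
    """
    out, in_group, done = [], False, False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.lower() == f"&{group}".lower():
            in_group = True
        elif in_group and stripped == "/":
            if not done:
                out.append(f"    {key} = {value}")
                done = True
            in_group = False
        elif in_group and re.match(rf"{re.escape(key)}\s*=", stripped):
            line = f"    {key} = {value}"
            done = True
        out.append(line)
    if not done:
        raise ValueError(f"group &{group} not found")
    return "\n".join(out) + "\n"


# Hypothetical, abbreviated namelist.init_atmosphere, for illustration only.
example = """&preproc_stages
    config_static_interp = true
    config_native_gwd_static = true
/
"""

patched = set_namelist_option(
    example, "preproc_stages", "config_gwd_cell_scaling", "1.1"
)
print(patched)
```

Running the helper a second time with a different value replaces the existing entry instead of adding a duplicate.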
Here is the NA3km domain that is working now:
Associated numbers for this domain are as follows:
dimensions:
nCells = 10591561 ;
nVertices = 21195314 ;
nEdges = 31786874 ;
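As a quick consistency check on these numbers, here is a minimal sketch in Python, assuming the usual planar-graph (Euler) relation applies to a regional mesh: for a simply connected region without holes, nCells - nEdges + nVertices should equal 1. This only checks the counts against each other and says nothing about mesh quality.

```python
# Mesh dimensions as reported above for the NA3km domain.
n_cells = 10591561
n_vertices = 21195314
n_edges = 31786874

# Euler characteristic, treating the cells as faces of a planar graph.
# A value of 1 is what a simply connected region (no holes) would give.
euler = n_cells - n_edges + n_vertices
print(euler)
```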
na3km.custom.pts
Name: na3km
Type: custom
Point: 54.0, -106.0
75.0, 150.0
75.0, -30.0
5.0, -60
5.0, 180.0
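To inspect or sanity-check a points file like this before running create_region, here is a minimal sketch of a parser in Python. It assumes the layout shown above: `Name:`, `Type:` and `Point:` header lines, followed by one `lat, lon` pair per line for the boundary vertices. The function name `parse_custom_pts` is hypothetical and not part of any MPAS tool.

```python
def _to_latlon(text):
    """Convert a 'lat, lon' string into a (lat, lon) tuple of floats."""
    lat, lon = (float(part) for part in text.split(","))
    return lat, lon


def parse_custom_pts(text):
    """Parse a custom-region points file into a dictionary."""
    region = {"name": None, "type": None, "point": None, "polygon": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if sep:  # header line such as "Name: na3km"
            key, value = key.strip().lower(), value.strip()
            region[key] = _to_latlon(value) if key == "point" else value
        else:  # boundary vertex given as "lat, lon"
            region["polygon"].append(_to_latlon(line))
    return region


pts_text = """Name: na3km
Type: custom
Point: 54.0, -106.0
75.0, 150.0
75.0, -30.0
5.0, -60
5.0, 180.0
"""

region = parse_custom_pts(pts_text)
print(region["name"], region["point"], len(region["polygon"]))
```

With the file above, this gives one interior point and four boundary vertices, which makes it easy to check latitude and longitude ranges before a long create_region job.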
I have the mpasout file for the deterministic run but the generation of the ensemble mpasout files will take more time - I will update once they are ready. Please let me know what you think - do we need to increase the domain size? I do have another task of running create_region for a larger domain and can run mpas if needed. Thanks!
Great progress! Thanks, @chunhuazhou!
When you generate ensembles, could you use mem001 instead of mem01? My PR #161 has used mem001.
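For reference, a minimal sketch in Python of the three-digit naming, assuming member directories are simply named by member index. The helper `member_dir` and the three-member example are hypothetical and for illustration only.

```python
def member_dir(index: int, width: int = 3) -> str:
    """Return a zero-padded member directory name, e.g. 1 -> 'mem001'."""
    return f"mem{index:0{width}d}"


# Illustrative only: the actual ensemble size is not specified here.
members = [member_dir(i) for i in range(1, 4)]
print(members)
```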
@guoqing-noaa I will! Thanks!
@chunhuazhou Great to know this moves forward. Thanks!
From what I see in this issue, I think the CONUS 3km grid is enough for our current development and testing. I would say we should currently focus on the CONUS 3km grid only. Please let me know if I missed anything here on the NA3km tests.
@hu5970 Thanks Ming for your input! I want to add here that for the NA3km tests, I have been having issues moving the model forecast forward: the model produces unreasonably strong winds near the SE lateral boundaries and then blows up very quickly.
@chunhuazhou Thanks for your effort on this.
My suggestions:
@guoqing-noaa Regarding your suggestions: I already posted a thread on the MPAS forum at https://forum.mmm.ucar.edu/threads/mpas-model-forecast-stopped-right-after-hour-0-segmentation-fault.19174/#post-46491 I do have one mpasout file at 23Z 05/26/2024 at /lfs5/BMC/wrfruc/Chunhua.Zhou/nco/stmp/na3km/1.0.1/rrfs.20240526/23/fcst/mpasout.2024-05-26_23.00.00.nc if anybody wants to try it out.
Thanks, @chunhuazhou !
It is good to try the big NA 3km domain. But let's focus on the more realistic CONUS 3km domain for initial tests and evaluations.
@ShunLiu-NOAA @TingLei-NOAA @SamuelDegelia-NOAA
Do we plan to run LETKF/GLETKF on these two domains?