GEOS-ESM / GEOSana_GridComp

Repository containing code for data analysis for the GEOS Earth System Model data assimilation
Apache License 2.0

Procedure to remove duplicates from analysis breaks code #149

Closed. rtodling closed this issue 2 hours ago

rtodling commented 1 year ago

Unfortunately, I spent the entire day today fighting a problem when trying to get nc diag files out of the latest 5.30.3 GSI. I pinned the problem down to the handling of duplicate removal introduced recently. I am able to get the code to behave as before by resetting the identical_obs array to false right before it gets used to bypass duplicate obs.

I believe the problem is that, the way the code is written, it leaves entries in the diag arrays filled w/ garbage, and the counting done by the linked list ends up corrupt.

I am not sure how the binary diags don't seem to get affected - I believe they are actually affected but somehow things get clobbered in a way that the code still more or less works - but I think the obs sequence might be messed up.

Anyway, as it is in 5.30.3, the code simply crashes when netcdf diag files are requested, so the code cannot stand as is.
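
To make the suspected failure mode concrete, here is a minimal Python sketch. It is not GSI code: every name except identical_obs is hypothetical, and the real diag arrays and linked-list counting may be organized quite differently. It only illustrates how skipping flagged duplicates while still writing at the loop index can leave unwritten slots, so that a count of kept obs no longer matches the filled entries, and how resetting the flags to false sidesteps that.

```python
# Illustrative only, NOT GSI code. All names except identical_obs are hypothetical.
GARBAGE = object()  # stands in for an uninitialized array slot


def fill_diag_sparse(obs, identical_obs):
    """Skip duplicates but keep writing at the loop index: leaves holes."""
    diag = [GARBAGE] * len(obs)
    count = 0
    for i, ob in enumerate(obs):
        if identical_obs[i]:
            continue  # slot i is never written
        diag[i] = ob
        count += 1
    return diag, count


def fill_diag_compact(obs, identical_obs):
    """Skip duplicates and append only kept obs: no holes, count matches."""
    diag = [ob for ob, dup in zip(obs, identical_obs) if not dup]
    return diag, len(diag)


obs = [10.0, 10.0, 12.5, 13.0]
identical_obs = [False, True, False, False]  # second ob flagged as a duplicate

sparse, n_sparse = fill_diag_sparse(obs, identical_obs)
compact, n_compact = fill_diag_compact(obs, identical_obs)

# Analogue of the workaround described above: reset every flag to False
# so nothing is skipped and every slot is written.
all_kept, n_all = fill_diag_sparse(obs, [False] * len(obs))
```

In the sparse variant, reading the first n_sparse entries picks up the unwritten slot, which is the kind of mismatch between count and contents described above; the compact variant keeps the two consistent.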

gmao-jjin3 commented 1 year ago

I am sorry to hear it took you so much time. I am going to take a look and see if I can spot anything. On the other hand, it doesn't hurt to bypass this minor feature.

rtodling commented 1 year ago

I did put code to bypass this ... all worked for me last night. I created IODA files for 4 sync times; I ran jedi.x successfully w/ the first sync time; then jedi crashes w/ a profile check error in the second sync time ... the bug seems to be moving for me. I don't know if it's an env thing ... I am still looking into it.

rtodling commented 1 year ago

JJ: don't spend time on this ... I am not quite sure it is an actual bug ... I'm now able to run fine it seems!

gmao-jjin3 commented 1 year ago

Hi Ricardo, thanks for letting me know. I did try it myself and my test crashed too. But it was not because of that. I was able to create nc_diag files. It was probably because FVSETUP messed up. It set up lat-long grids for AGCM even though I typed "C360" for the input of AGCM horizontal resolution. Here are my original /archive/u/jjin3/x49atms/run/AGCM.rc.tmpl and my saved FVSETUP input /discover/nobackup/projects/gmao/obsdev/jjin3/SavedInputs/x49atms.input . It needs a fix if you can verify the issue.

rtodling commented 2 hours ago

FVSETUP doesn't do that ... something might have been wrong in some other settings when your experiment was prepared - and only you would know how that was done!

rtodling commented 2 hours ago

I am closing this issue since JJ is no longer responding on the GitHub account.