hist_avg on multiple streams writes the same filenames when .false.

dabail10 commented 1 year ago

This was introduced when the hist_avg variable was modified to set the behaviour on each stream.

phil-blain commented 1 year ago

As far as I understand, this is not a behaviour that was introduced or changed when hist_avg was made per-stream. It was always this way, because the filename for instantaneous outputs does not depend on the frequency.

dabail10 commented 1 year ago

Yes and no. Originally when I introduced the multiple stream thing, I had forgotten about hist_avg. So, the behaviour at that time was to make all of the streams instantaneous or all averaged. When I added the multiple stream option for hist_avg, I forgot that the filename was fixed for hist_avg = .false.

apcraig commented 1 year ago

We have tests for this that are part of our test suite. The histall test has

histfreq       = 'm','d','1','h','x'
histfreq_n     =  1,2,6,4,1
f_CMIP         = 'm'
f_aice         = 'md1h' 
f_hi           = 'h1dm'
f_hs           = 'd1m' 
f_Tsfc         = 'mdh' 
f_sice         = 'md' 
f_uvel         = 'md' 
f_vvel         = 'dm' 
f_uatm         = 'dm' 
...

when running with hist_avg(:) = true (cheyenne_intel_restart_gx3_32x1_debug_histall_ionetcdf), we get files like

every 6 timesteps:
-rw-r--r-- 1 tcraig ncar 2560852 Nov 25 04:18 iceh.2005-01-03-00000.nc
-rw-r--r-- 1 tcraig ncar 2560852 Nov 25 04:18 iceh.2005-01-03-21600.nc
-rw-r--r-- 1 tcraig ncar 2560852 Nov 25 04:18 iceh.2005-01-03-43200.nc

every 2 days:
-rw-r--r-- 1 tcraig ncar 16763992 Nov 25 04:18 iceh.2005-01-02.nc
-rw-r--r-- 1 tcraig ncar 16763992 Nov 25 04:18 iceh.2005-01-04.nc

every 4 hours:
-rw-r--r-- 1 tcraig ncar 2560856 Nov 25 04:18 iceh_04h.2005-01-02-57600.nc
-rw-r--r-- 1 tcraig ncar 2560856 Nov 25 04:18 iceh_04h.2005-01-02-72000.nc
-rw-r--r-- 1 tcraig ncar 2560856 Nov 25 04:18 iceh_04h.2005-01-03-00000.nc
-rw-r--r-- 1 tcraig ncar 2560856 Nov 25 04:18 iceh_04h.2005-01-03-14400.nc
-rw-r--r-- 1 tcraig ncar 2560856 Nov 25 04:18 iceh_04h.2005-01-03-28800.nc

no monthly files (run too short).

And I have confirmed that the fields in these files are consistent with the settings in ice_in. The "d" files have lots and lots of fields, while the "h" and "1" files have just 3 fields each (slightly different ones, though).

With hist_avg(:)=.false. (cheyenne_intel_restart_gx3_32x1_debug_histall_histinst_ionetcdf), we get

-rw-r--r-- 1 tcraig ncar 2560504 Nov 25 04:18 iceh_inst.2005-01-03-43200.nc
-rw-r--r-- 1 tcraig ncar 2560504 Nov 25 04:18 iceh_inst.2005-01-03-57600.nc
-rw-r--r-- 1 tcraig ncar 2560500 Nov 25 04:18 iceh_inst.2005-01-03-64800.nc
-rw-r--r-- 1 tcraig ncar 2560504 Nov 25 04:18 iceh_inst.2005-01-03-72000.nc
-rw-r--r-- 1 tcraig ncar 2560504 Nov 25 04:18 iceh_inst.2005-01-04-00000.nc
-rw-r--r-- 1 tcraig ncar 2560504 Nov 25 04:18 iceh_inst.2005-01-04-14400.nc
-rw-r--r-- 1 tcraig ncar 2560500 Nov 25 04:18 iceh_inst.2005-01-04-21600.nc
-rw-r--r-- 1 tcraig ncar 2560504 Nov 25 04:18 iceh_inst.2005-01-04-28800.nc
-rw-r--r-- 1 tcraig ncar 2560504 Nov 25 04:18 iceh_inst.2005-01-04-43200.nc

which I assume is the union of all files that are "every 6 timesteps", "every 4 hours", and "every 2 days". The above output filename strategy is kind of reasonable, when hist_avg=false with multiple streams, we are just creating instantaneous files at different frequencies. The question is what's on each of the files. It should also be the union of all the fields specified for that frequency, but it's not. It looks like it's just the output for one of the frequencies at that timestep.
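As a quick sanity check of the "union" reading, here is a small Python sketch (not CICE code). It assumes a 3600 s timestep, which is inferred from the "every 6 timesteps" files being 21600 s apart; the helper name `output_seconds` is made up for illustration.

```python
# Illustration only. Assumption: dt = 3600 s, inferred from the
# "every 6 timesteps" files being 21600 s apart in the listing above.
DT = 3600
SECONDS_PER_DAY = 86400

def output_seconds(interval_s):
    """Seconds-of-day at which a stream with this interval would write."""
    return set(range(0, SECONDS_PER_DAY, interval_s))

every_6_steps = output_seconds(6 * DT)    # {0, 21600, 43200, 64800}
every_4_hours = output_seconds(4 * 3600)  # {0, 14400, 28800, 43200, 57600, 72000}
every_2_days = {0}                        # midnight, on the days this stream writes

# Times at which at least one stream writes an instantaneous file
union = sorted(every_6_steps | every_4_hours | every_2_days)

# Times at which the '1' and 'h' streams both write (same filename)
overlap = sorted(every_6_steps & every_4_hours)

print(union)    # [0, 14400, 21600, 28800, 43200, 57600, 64800, 72000]
print(overlap)  # [0, 43200]
```

The seconds-of-day values in the iceh_inst listing are consistent with this union, and the overlap shows the times at which more than one stream targets the same instantaneous filename.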

So, I'm not sure the problem is with filenames, I kind of like that they are all instantaneous filenames as they should be. But there is an issue with content. It should be the union of fields defined on the different streams at that timestep. A solution could be to have different filenames for each stream, but would it be better for the different streams to just write to the same filename when hist_avg=false and the frequencies overlap?

apcraig commented 1 year ago

And just to follow up, it looks like there are a number of issues that could (should) be addressed.

Should daily and 3-day averages both work? That's not possible if using "d"
What happens if you use h+24 and d+1 (both daily averages)?
Should the f_aice settings be by stream rather than frequency, ie f_aice = 'md' -> f_aice = '1','2' to overcome some current shortcomings.
Should we review and update the default filenames. Why does h+4 have iceh_04h but d+2 and '1'+6 not have a special "04h" string appended?

phil-blain commented 1 year ago

Thanks for this analysis Tony, this is indeed what I wanted to point out in https://github.com/CICE-Consortium/CICE/issues/854#issuecomment-1777191767 and https://github.com/CICE-Consortium/CICE/pull/912#pullrequestreview-1742868438.

With hist_avg(:)=.false. [...] The above output filename strategy is kind of reasonable, when hist_avg=false with multiple streams, we are just creating instantaneous files at different frequencies. The question is what's on each of the files. It should also be the union of all the fields specified for that frequency, but it's not. It looks like it's just the output for one of the frequencies at that timestep.

Exactly, it's the output for the frequency defined last. When several frequencies match the current time step, the fields corresponding to the first frequency are written to disk, then the fields for the second frequency are written to disk under the same filename, and since we use the clobber flag (or that's the default behaviour, I forget), the file is simply overwritten.
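The overwrite can be mimicked with a toy Python model (not CICE code). The field lists are a simplified reading of the histall namelist quoted earlier in the thread, a dict stands in for the disk, and `inst_filename` and `write_all` are hypothetical helpers.

```python
# Toy model of the clobbering; a dict plays the role of the filesystem.
# Field lists are a simplified subset of the histall settings (assumption).
streams = [
    ("d", ["aice", "hi", "hs", "Tsfc"]),
    ("1", ["aice", "hi", "hs"]),
    ("h", ["aice", "hi", "Tsfc"]),
]

def inst_filename(date, sec):
    # With hist_avg = .false. the name carries no frequency information.
    return f"iceh_inst.{date}-{sec:05d}.nc"

def write_all(streams, date, sec):
    """Write every stream that matches this time step, in definition order."""
    disk = {}
    for freq, fields in streams:
        # Same key for every stream: each write replaces the previous one.
        disk[inst_filename(date, sec)] = list(fields)
    return disk

# 2005-01-04 00:00 is a time at which all three streams write.
disk = write_all(streams, "2005-01-04", 0)
print(disk)
```

Only one file survives, and it holds the fields of the stream defined last ("h" here), not the union of the three field lists.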

A solution could be to have different filenames for each stream, but would it be better for the different streams to just write to the same filename when hist_avg=false and the frequencies overlap?

With the new option hist_suffix that we just merged, it's easy to make such a setup work correctly, just by setting that variable to something different for each stream. That's why I suggested mentioning this limitation in the doc.

Having all streams write to the same filename would involve a lot of refactoring from what I quickly evaluated, since we would have to change a lot of how the code works. Since we can't write data to the NetCDF file before calling nf90_enddef, and we can't add new variables after calling nf90_enddef, much of the ice_history and ice_history_shared code would have to be reworked. I do not think it is worth it.

As for your other questions, here is what I think:

Should daily and 3-day averages both work? That's not possible if using "d"

This is a known caveat (it's noted in the doc), using the same frequency twice does not work currently.

What happens if you use h+24 and d+1 (both daily averages)?

I think this works correctly, I don't see why it should not (since with averages, the filenames are per-stream).

Should the f_aice settings be by stream rather than frequency, ie f_aice = 'md' -> f_aice = '1','2' to overcome some current shortcomings.

If we change the code so that we can have several streams with d (or h, etc.), then I guess we will need to do that. But if we don't I don't see why we should change the syntax. I think the current syntax is pretty clear.

Should we review and update the default filenames. Why does h+4 have iceh_04h but d+2 and '1'+6 not have a special "04h" string appended?

That's a good question. We could break people's post-processing infrastructure if we change this, so it's something to think about. I think the current state evolved organically and that might be why we have the lack of consistency you point out. I personally do not think it's a big deal.

apcraig commented 12 months ago

@phil-blain, I largely agree. I'm not sure we need to change things now. There are some shortcomings in the current implementation, but the community isn't asking for something more. Just throwing those ideas out there.

Is the solution to just document that hist_suffix should be used when multiple history output streams have hist_avg=.false.? Or should the code check that hist_suffix is set when needed and abort if not? Or should the code set hist_suffix to something unique when the user hasn't but it needs to be?

Are there any cases with hist_avg=true where the history filenames will be identical for multiple streams? We would want to avoid that situation as well.

phil-blain commented 12 months ago

Is the solution to just document that hist_suffix should be used when multiple history output streams have hist_avg=.false.? Or should the code check that hist_suffix is set when needed and abort if not? Or should the code set hist_suffix to something unique when the user hasn't but it needs to be?

In my opinion, just documenting it would be the minimum. Aborting if hist_suffix is not set with several instantaneous streams might be useful. I think we are moving away from the code changing the config under the user's feet.

Are there any cases with hist_avg=true where the history filenames will be identical for multiple streams?

I don't think so, from looking at construct_filenames, (unless one uses the same frequency several times, which is already noted in the docs as not working).

dabail10 commented 12 months ago

I think it is a relatively easy fix. We just do the same thing as we do for the average files, but add '_inst' to them. There are currently cases with the average files where, if you had two streams with the same or similar histfreq_n and histfreq combination, the files would have the same name. As @phil-blain said, this can be handled with hist_suffix. However, I guess we should be careful not to break others' post-processing.

Maybe something like:

! trim() keeps the suffix from being lost if cstream is a fixed-length string
if (.not. hist_avg(ns)) cstream = trim(cstream)//'_inst'

We would have to be careful when histfreq = '1'.

Dave

apcraig commented 12 months ago

Thanks @dabail10, I'm not sure it makes sense to follow the same naming convention for hist_avg = false and true. With true, we are truly generating something like a daily, multi-day, or monthly average file, so a filename like case.2001-01.nc makes some sense. For hist_avg=false, it doesn't though (IMO). A monthly stream with hist_avg=false just means instantaneous output at a particular timestep of the month. case_inst.2001-01-01-00000.nc is a much more appropriate name than case_inst.2001-01.nc for that output stream (IMO). Yes, it solves some of our filename issues, but not necessarily in a good way.

Rather than change any filenames at this point, I'd add a check to make sure we're getting unique filenames for each stream. The code should abort if not and tell the user to use hist_suffix.

This could either be done during initialization (looking at the history stream namelist) or as the history files are written (track names during runtime) or a bit of both. It also worries me that average streams could have the same filenames in special cases. I'll try to have a closer look at the current implementation and try to propose some checks we could add.
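One possible shape for the initialization-time check, sketched in Python rather than in the model's Fortran. This is an assumption-heavy illustration: it treats 'x' as meaning "stream unused" for histfreq and "no suffix" for hist_suffix, and it uses a simplified naming rule in which averaged filenames are distinguished by frequency while instantaneous filenames are distinguished only by suffix. The function name `find_conflicts` is hypothetical, and the real rules in construct_filename may differ.

```python
def find_conflicts(histfreq, hist_avg, hist_suffix):
    """Return pairs of stream indices whose filenames could collide.

    Simplified rule (an assumption, not the actual CICE logic):
    averaged streams are told apart by frequency, instantaneous
    streams only by their suffix.
    """
    seen = {}
    conflicts = []
    for ns, (freq, avg, suffix) in enumerate(zip(histfreq, hist_avg, hist_suffix)):
        if freq == "x":  # assumed sentinel for an unused stream
            continue
        key = (suffix, freq if avg else "inst")
        if key in seen:
            conflicts.append((seen[key], ns))
        else:
            seen[key] = ns
    return conflicts

histfreq = ["m", "d", "1", "h", "x"]

# All streams instantaneous, no suffix set: every stream collides with the first.
no_suffix = find_conflicts(histfreq, [False] * 5, ["x"] * 5)

# Distinct (made-up) suffixes remove the collisions.
with_suffix = find_conflicts(histfreq, [False] * 5, ["_m", "_d", "_1", "_h", "x"])

# All averaged with distinct frequencies: no collisions under this rule.
averaged = find_conflicts(histfreq, [True] * 5, ["x"] * 5)

print(no_suffix, with_suffix, averaged)
```

Under the same simplified rule, using one frequency twice with hist_avg = true would also be reported, which matches the known caveat mentioned earlier. A runtime variant could instead record each filename as it is written and abort on a repeat within the same time step.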

CICE-Consortium / CICE

hist_avg on multiple streams writes the same filenames when .false. #915