NCAR / DART

Data Assimilation Research Testbed
https://dart.ucar.edu/
Apache License 2.0

bug: GOES converter does not handle multiple input files #579

Open nancycollins opened 9 months ago

nancycollins commented 9 months ago

Describe the bug

I don't have any GOES test files, but based on a user question I looked at the code. This section in observations/obs_converters/GOES/convert_goes_API_L1b.f90 concerned me:

    ! for each input file
    do ifile=1, filecount
       ! read from netCDF file into a derived type that holds all the information
       call goes_load_ABI_map(l1_files(ifile), map)

       ! convert derived type information to DART sequence
       call make_obs_sequence(seq, map, lon1, lon2, lat1, lat2, &
                              x_thin, y_thin, goes_num, reject_dqf_1, obs_err, &
                              vloc_pres_hPa)

       ! write the sequence to a disk file
       call write_obs_seq(seq, outputfile)

       if (verbose) call print_obs_seq_summary(seq,l1_files(ifile))

       ! release the sequence memory
       call destroy_obs_sequence(seq)
    enddo

make_obs_sequence() initializes a new sequence each time through the loop, and the sequence is destroyed after the write. There is only a single outputfile string, but the input can be a list. It appears that specifying more than one input file will overwrite the same output file on every iteration, so at the end you'll have a sequence with only the obs from the last GOES file in the list.

Version of DART

current version

Have you modified the DART code?

no

What I expect

Almost all obs converters that take multiple input filenames concatenate them into a single output file that contains obs from all the input files. This one should make a new sequence for the first file in the list, append the obs from each subsequent file onto that same sequence, and then destroy the sequence outside of the filecount loop.
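The expected flow can be illustrated with a toy model. This is only a sketch of the pattern and is not DART code: toy_seq_type, append_file_obs, and the per-file obs counts are hypothetical stand-ins. It also assumes the output is written once after the loop, which is one possible arrangement.

```fortran
! Toy model of the "accumulate across files, then finish once" pattern.
! Nothing here is DART code: toy_seq_type, append_file_obs and the
! per-file counts are hypothetical stand-ins for illustration only.
program accumulate_demo
   implicit none

   type toy_seq_type
      integer :: num_obs = 0
      real, allocatable :: vals(:)
   end type toy_seq_type

   integer, parameter :: filecount = 3
   integer, parameter :: obs_per_file(filecount) = (/ 4, 2, 5 /)
   type(toy_seq_type) :: seq
   integer :: ifile

   ! create the sequence once, before the loop, sized for all files
   allocate(seq%vals(sum(obs_per_file)))

   do ifile = 1, filecount
      ! append this file's obs onto the same sequence
      call append_file_obs(seq, ifile, obs_per_file(ifile))
   enddo

   ! report (stand-in for writing) and release once, after the loop
   print '(a,i0,a,i0,a)', 'writing ', seq%num_obs, ' obs from ', filecount, ' files'
   deallocate(seq%vals)

contains

   ! add n placeholder obs for one input file to the shared sequence
   subroutine append_file_obs(s, file_index, n)
      type(toy_seq_type), intent(inout) :: s
      integer, intent(in) :: file_index, n
      integer :: i

      do i = 1, n
         s%num_obs = s%num_obs + 1
         s%vals(s%num_obs) = real(file_index)
      enddo
   end subroutine append_file_obs

end program accumulate_demo
```

The point of the sketch is only the placement of the three steps: creation before the loop, appending inside it, and release after it. In the current converter all three happen inside the loop.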

nancycollins commented 9 months ago

If I'm not completely confused about how this works, the missing code is some variation on this:

    inquire(file=outputfile, exist=file_exist)

    if ( file_exist ) then

       ! existing file found, append to it adding space for num_new_obs
       call read_obs_seq(outputfile, 0, 0, num_new_obs, seq)

    else

       ! create a new one
       call init_obs_sequence(seq, num_copies, num_qc, num_new_obs)
       call set_copy_meta_data(seq, 1, 'observation')
       call set_qc_meta_data(seq, 1, 'Data QC')

    endif

num_new_obs should be the max possible based on the number of files and the max obs per file. when the seq is written it will only write out what's actually used.
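As a rough illustration of that sizing rule, the upper bound is just the product of the two quantities. The values below are made up for the example and are not taken from the GOES converter; max_obs_per_file and num_used are hypothetical names.

```fortran
! Toy sizing calculation. filecount, max_obs_per_file and num_used are
! made-up values for illustration, not taken from the GOES converter.
program size_demo
   implicit none
   integer, parameter :: filecount = 3
   integer, parameter :: max_obs_per_file = 1000
   integer :: num_new_obs, num_used

   ! upper bound on how many obs could be added across all input files
   num_new_obs = filecount * max_obs_per_file

   ! hypothetical number of obs that actually end up in the sequence
   num_used = 2417

   print '(a,i0)', 'space reserved for obs: ', num_new_obs
   print '(a,i0)', 'obs actually used:      ', num_used
end program size_demo
```

Over-reserving is harmless for the output file, since only the obs actually used are written.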