cardat / air-health-bushfire-smoke-netcdf

Software to share australian bushfire smoke data funded by CAR and ARDC. Supported by CurtinIC and ASDAF
MIT License
1 stars 0 forks source link

merge_clean.sh does not fail safely when missing time slices #6

Closed ivanhanigan closed 2 years ago

ivanhanigan commented 2 years ago

gives a message: Warning: Input stream 1 has 7121 timesteps. Stream 5 has more timesteps, skipped! and the incorrect results

I discovered that some dates were missing from pm25_pred. I presume from the code that generated the geotifs and uploaded to cloudstor. I was getting incorrect results and the warning above. Now I have regenerated all the dates for pm25_pred and run it again it is solved. Need to check if any missing data and fail safely in that case.

calvin-curtin commented 2 years ago

Could potentially add a check in batch_translate.sh to check that each yearly folder has the correct number of files (e.g. 365 or 364 if leap year) and give the user a warning and fail early so that we do not waste time converting and batching the GeoTIFFs into yearly netCDF if there are missing files.

Possibly much harder to handle this in merge_clean as it would have to open each of the batched 2001-2020 netCDF files and check record lengths.

MilesSowden commented 2 years ago

Bigger issue is the assumption that the files are in date order and a complete year. Could potentially wish to use this for one month. Created a new issue, this probably can be closed.

calvin-curtin commented 2 years ago

There will need to be different validation checks required for different time resolutions, which will become difficult to know ahead of time. I'm not sure whether there is an easy way around this for it to fail safely in the event of missing data across all time resolutions.

Concatenating netCDF files using ncecat is performed by the order of filenames specified as an input argument. ${sOutFolder}/tmp_${layer_name}_${start_year}*.nc specifies all the temporary netCDF files, sorted by name. So it would work for daily/monthly/yearly. But the time array for the time variable would be different for each resolution and you'll also have to update time attributes for the time variable accordingly (Lines 57 - 61).

calvin-curtin commented 2 years ago

@MilesSowden Given that the conversion now reads the correct date and populates the time array from that, I don't think we should encounter the above issue. Can this be closed?

MilesSowden commented 2 years ago

@ivanhanigan see Calvin's comment above. Should this issue be closed?

ivanhanigan commented 2 years ago

@MilesSowden thanks for this. I won't be able to test this yet but happy to close this now and test later. Cheers