kirenbahm / ENP_TOOLS

Scripts used to pre- and post-process data

Small dataset errors #64

Closed LAGO-Support closed 3 years ago

LAGO-Support commented 3 years ago

During A2_generate_extracted_stat, if a station has only one or two observation data points, then no data, not even the model data you do have, will be added to the station and placed in the map for use in later scripts. It's caused by how the length() function calculates the length of a multidimensional array: it returns max(size()).
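To illustrate the difference, here is a minimal sketch using a made-up array with two rows and four columns. The variable name DD and its column layout are assumptions for illustration only, not the actual contents of DD_ARRAY.

```matlab
% Hypothetical stand-in for a small daily array: 2 rows (days), 4 columns.
% The column layout (year, month, day, value) is an assumption.
DD = [2013 1 1 0.5; ...
      2013 1 2 0.7];

nLen  = length(DD);    % same as max(size(DD)): the longest dimension
nRows = size(DD, 1);   % number of rows only

fprintf('length(DD) = %d, size(DD, 1) = %d\n', nLen, nRows);
```

With more rows than columns the two calls agree, which would explain why the problem only shows up on very small data sets.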

If you have only one or two data points, then when A2_generate_extracted_stat calls get_daily_data2 at line 24, the for loop errors. The loop is meant to iterate through the rows of daily time steps in DD_ARRAY; however, it uses the number of columns instead of the number of rows as the terminating condition. The array index then goes out of bounds and throws an error. This error is caught by A2_generate_extracted_stat at line 90, which skips the whole station, not just the particular data set that errored.
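The failure can be reproduced in isolation with something like the sketch below. It is not the code from get_daily_data2; the array DD, the loop bodies, and the try/catch are all hypothetical and only mimic the behavior described above.

```matlab
% Hypothetical two-row array standing in for DD_ARRAY (layout is assumed).
DD = [2013 1 1 0.5; ...
      2013 1 2 0.7];

% Loop bounded by length(): runs i = 1..4 although only 2 rows exist.
errored = false;
try
    for i = 1:length(DD)
        row = DD(i, :);        % out of bounds once i reaches 3
    end
catch ME
    errored = true;            % the caller would skip the station here
    disp(ME.message);
end

% Loop bounded by size(DD, 1): visits exactly the existing rows.
rowsVisited = 0;
for i = 1:size(DD, 1)
    row = DD(i, :);
    rowsVisited = rowsVisited + 1;
end
```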

This leads to a subsequent problem during the A3 script after plot_timeseries is called. There seems to be an error in the built-in MATLAB timeseries function in how it handles date-time values given as strings for the x axis. Because it errors during plot creation, an error from the observed data set cancels the whole plot, including the model data at the station. You can test this yourself in the MATLAB Command Window.

TSp = timeseries([2; 4; 6; 8; 10;], [1; 2; 3; 4; 5;]) % This will create a timeseries
TSp = timeseries([6;], [3;]) % This will create a timeseries, with one point

With date-time strings, each call needs to be followed by TSp.TimeInfo.Format = 'dd/mm/yy' so that the date-time values are parsed correctly:

TSp = timeseries([2; 4; 6; 8; 10;], ['01/01/13'; '01/02/13'; '01/03/13'; '01/04/13'; '01/05/13']) % This will create a timeseries
TSp = timeseries([10; 20;], ['01/01/13'; '01/02/13';]) % This will create a timeseries; the graph just has different marker and line type settings
TSp = timeseries([10;], ['01/01/13';]) % This will error in the MATLAB function instead of creating the timeseries object. The semicolons don't matter; it errors with or without them.

The fix entails calling size instead of length in get_daily_data2, so that it does not error on small timeseries, and skipping data sets that have only one point in plot_timeseries.
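The second half of the fix could look roughly like the following guard. This is a sketch only: the cell array datasets and the variable plotted are invented for illustration, and the real plot_timeseries presumably stores its data differently.

```matlab
% Hypothetical data sets at one station: 1 point, 2 points, 5 points.
datasets = {10, [10; 20], [2; 4; 6; 8; 10]};

plotted = [];                      % indices of data sets that get plotted
for k = 1:numel(datasets)
    values = datasets{k};
    if numel(values) < 2
        continue;                  % skip single-point data sets
    end
    % The timeseries object would be created and plotted at this point.
    plotted(end + 1) = k;
end
```

Skipping only the offending data set keeps the remaining series at the station, such as the model data, on the plot.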