Cambridge-ICCS / ONEFlux

Open Network-Enabled Flux processing pipeline

Matlab ustar_cp with sample data produces no output files (as not enough significant points) #11

Closed j-emberton closed 3 months ago

j-emberton commented 4 months ago

ONEFlux fails to generate output files from the ustar_cp calculation when run using the US-ARc sample data.

The 05_ustar_cp folder should contain the inputs, a log file (report) and the outputs. The files should be named in a format such as:

/path/to/main/data/dir/05_ustar_cp/SITE123_uscp_2021.txt
/path/to/main/data/dir/05_ustar_cp/report_20230721093015.txt

The sample output data does not contain ustar_cp output files either, and the log file contents from the CSD3 run and from the sample output are similar. This implies that the CSD3 run behaves similarly to the run used to produce the sample output.

There are several possible scenarios resulting in this situation:

report_rocky.lbl.gov_run20180903T192021.txt

j-emberton commented 4 months ago

Dom suggests tracing back through the code to link the log file message to how ustar_cp processes the data: what triggers the message, and where?

Can we grab an alternative Fluxnet dataset and try running that?

j-emberton commented 3 months ago

So, after some digging round I've figured out what the issue is.

After the data is processed, statistical tests check the significance of the results. If there are too few statistically significant points, a message of the form "processing n.01, US-ARc_qca_ustar_2005.csv...Too few selected change points: 0/400" is written to the log file. This happens on a per-file basis.
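For reference, a minimal sketch of how these lines could be picked out of a report file. The regular expression is an assumption based only on the single message quoted above (the real report format may differ), and `find_too_few_change_points` is a hypothetical helper, not part of ONEFlux:

```python
import re

# Assumed pattern, derived from the one log line quoted in this issue.
TOO_FEW = re.compile(
    r"processing n\.(?P<index>\d+),\s*(?P<file>\S+?\.csv)\.{3}\s*"
    r"Too few selected change points:\s*(?P<selected>\d+)/(?P<total>\d+)"
)


def find_too_few_change_points(report_text):
    """Return (file name, selected, total) for each flagged input file."""
    results = []
    for line in report_text.splitlines():
        match = TOO_FEW.search(line)
        if match:
            results.append(
                (
                    match.group("file"),
                    int(match.group("selected")),
                    int(match.group("total")),
                )
            )
    return results


sample = (
    "processing n.01, US-ARc_qca_ustar_2005.csv..."
    "Too few selected change points: 0/400"
)
print(find_too_few_change_points(sample))
# [('US-ARc_qca_ustar_2005.csv', 0, 400)]
```

Since the message is per file, a report with one such line per input file would account for a run that produces no output files at all.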

So the code "seems" to be running correctly. However, the available data does not allow us to test the full operational behaviour of the code; that would need a new or extended dataset containing enough data for statistical significance.

We can develop a test for this limited behaviour (which is still useful, as this evaluation occurs later in the MATLAB code), but it wouldn't test the full code behaviour, with full output reports based on statistically significant data.
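A rough sketch of what such a limited check might look like, in Python. Everything here is an assumption for illustration: the helper names are hypothetical, the `report_*.txt` and `*_uscp_*.txt` patterns are taken from the example file names earlier in this issue, and the only log text relied on is the "Too few selected change points" message:

```python
from pathlib import Path

# Message text as quoted from the log file in this issue.
MESSAGE = "Too few selected change points"


def ustar_cp_summary(folder):
    """Count reports, flagged log lines and output files in a 05_ustar_cp folder."""
    folder = Path(folder)
    reports = sorted(folder.glob("report_*.txt"))   # assumed report naming
    outputs = sorted(folder.glob("*_uscp_*.txt"))   # assumed output naming
    flagged = 0
    for report in reports:
        for line in report.read_text().splitlines():
            if MESSAGE in line:
                flagged += 1
    return {
        "reports": len(reports),
        "flagged_lines": flagged,
        "outputs": len(outputs),
    }


def is_expected_limited_run(summary):
    """True when a report exists, files were flagged, and no outputs were written."""
    return (
        summary["reports"] > 0
        and summary["flagged_lines"] > 0
        and summary["outputs"] == 0
    )
```

A test could run the MATLAB step on the sample data and then assert `is_expected_limited_run(ustar_cp_summary(path))`. This only confirms the "too few points" path; it says nothing about the outputs produced when enough significant points are found.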

dorchard commented 3 months ago

We need more test data (or different test data). We will discuss with Gilberto, but possibly this can be closed.

dorchard commented 3 months ago

We now understand that the lack of output files is due to the sample data not having enough statistically significant results. We now have test data sets that provide what we need, so closing.