geco-bern / FluxDataKit

LEMONTREE flux data kit
https://geco-bern.github.io/FluxDataKit

Adjust end of year processing #84

Closed: SamanthaBiegel closed this 1 month ago

SamanthaBiegel commented 4 months ago

This PR makes a small change to how the final year of each site's sequence is determined. The current approach uses the day of the year, which gives different outcomes for leap years and non-leap years. In particular, since most sequences end on December 30 rather than December 31, the final year is kept when it is a leap year and dropped when it is not.

I suggest setting the cutoff date to December 30, since otherwise the final, almost complete year is removed for 170 sites. What do you think, @stineb?
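To make the proposal concrete, here is a minimal Python sketch of a cutoff based on the calendar date instead of the day of the year. It is not the FluxDataKit code (which is written in R); the function name `final_year_by_date` and its parameters are hypothetical and only illustrate the idea.

```python
from datetime import date


def final_year_by_date(end: date, cutoff_month: int = 12, cutoff_day: int = 30) -> int:
    """Return the last year treated as complete for a sequence ending on `end`.

    Hypothetical helper: the year of `end` counts as complete if the sequence
    reaches the cutoff date (December 30 by default) within that year;
    otherwise the previous year is returned.
    """
    cutoff = date(end.year, cutoff_month, cutoff_day)
    return end.year if end >= cutoff else end.year - 1


# Leap and non-leap years are treated the same way.
print(final_year_by_date(date(2023, 12, 30)))  # 2023
print(final_year_by_date(date(2024, 12, 30)))  # 2024
# A sequence that stops well before the cutoff falls back to the previous year.
print(final_year_by_date(date(2023, 11, 15)))  # 2022
```

Because the comparison is made against a date rather than a day count, the extra day in February of a leap year no longer affects the result.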

stineb commented 4 months ago

Did you see an unwanted pattern in the result of this function? I think your suggestion is equivalent to the original code, that is, it gives the same result. Note that the respective line of code gets the year, not the date.

SamanthaBiegel commented 3 months ago

Many sites in the dataset end on Dec. 30, not Dec. 31. Since the respective line of code filters based on the day of the year, this leads to a different treatment of leap years vs non-leap years. See the following example output:

> lubridate::yday("2023-12-30")
[1] 364
> lubridate::yday("2024-12-30")
[1] 365

This would cause 2024 to be included in the dataset but not 2023.
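The same arithmetic can be reproduced outside R. The Python sketch below computes the day of the year for both end dates and applies a day-of-year filter. The threshold of 365 is an assumption inferred from the outcome described above, not a value taken from the FluxDataKit source, and the names `yday`, `ASSUMED_MIN_YDAY`, and `keeps_final_year` are hypothetical.

```python
from datetime import date


def yday(d: date) -> int:
    """Day of the year (1-based), analogous to lubridate::yday()."""
    return d.timetuple().tm_yday


# Assumption: the final year is kept only if the sequence reaches this day
# of the year. The actual threshold in the package may differ.
ASSUMED_MIN_YDAY = 365


def keeps_final_year(end: date) -> bool:
    """Apply the assumed day-of-year filter to a sequence end date."""
    return yday(end) >= ASSUMED_MIN_YDAY


for end in (date(2023, 12, 30), date(2024, 12, 30)):
    print(end.isoformat(), yday(end), keeps_final_year(end))
# 2023-12-30 364 False
# 2024-12-30 365 True
```

Both sequences end on the same calendar date, yet only the one ending in a leap year passes the filter.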

stineb commented 2 months ago

This should be solved by PR #93. FluxDataKit v3.2 was generated with the code changes from #93 (available on Zenodo shortly). Please confirm, @SamanthaBiegel.