LTER-LIFE / VeluweProtoDT

Veluwe proto-DT: a digital mini twin of tree phenology and climate scenarios

Fix potential errors and inconsistencies in bud burst data #21

Open StefanVriend opened 10 months ago

StefanVriend commented 10 months ago

There are some inconsistencies (e.g., missing coordinates, inconsistent spelling) and errors (e.g., duplicated records) in the bud burst data. These need to be dealt with.

StefanVriend commented 10 months ago

The following issues have been dealt with:

StefanVriend commented 10 months ago

On Friday 15/12/2023, we fixed a few instances where a tree reverted to an earlier bud burst stage (i.e., TreeTopScore at time t is lower than TreeTopScore at t-1), which is biologically impossible.
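A check like this could be automated. The following is only a minimal sketch in Python, not the code used in this repository: the record layout (tree ID, observation date, TreeTopScore) and the function name `find_regressions` are assumptions made for illustration, and scores are compared only within the same tree and calendar year.

```python
from collections import defaultdict
from datetime import date


def find_regressions(records):
    """Flag scores that are lower than the previous score of the same tree.

    `records` is a hypothetical list of (tree_id, obs_date, score) tuples.
    Returns a list of (tree_id, obs_date, previous_score, score) tuples.
    """
    # Group visits per tree and per year, because scoring restarts every season.
    by_tree_year = defaultdict(list)
    for tree_id, obs_date, score in records:
        if score is not None:  # skip missing scores (NA)
            by_tree_year[(tree_id, obs_date.year)].append((obs_date, score))

    regressions = []
    for (tree_id, _year), visits in by_tree_year.items():
        visits.sort()  # chronological order
        # Compare each visit with the one directly before it.
        for (_, prev), (obs_date, score) in zip(visits, visits[1:]):
            if score < prev:
                regressions.append((tree_id, obs_date, prev, score))
    return regressions


# Made-up example data, not taken from the bud burst database.
records = [
    ("T1", date(2005, 4, 10), 0.5),
    ("T1", date(2005, 4, 14), 1.5),
    ("T1", date(2005, 4, 18), 1.0),  # lower than the previous visit
    ("T2", date(2005, 4, 10), 0.0),
    ("T2", date(2005, 4, 14), 0.0),
]
flagged = find_regressions(records)
```

Flagged records still need to be checked against the field books, since the sketch cannot tell which of the two scores is the wrong one.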

On Monday 18/12/2023, we fixed a large number of bud burst scores with unexpected values. In most years, trees are scored between 0 and 3 in steps of 0.5; in some years (1989-1990, 2001-2008), trees are also scored in steps of 0.25. All other scores are incorrect and have been fixed. Most errors were caused by rounding (e.g., 1.8 instead of 1.75) or missing digits (e.g., 5 instead of 0.5). Some other values were set to NA after checking the field books.
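The scoring rules described here can be expressed as a small validity check. This is a hedged sketch in Python rather than the code used for the actual fixes: the names `QUARTER_STEP_YEARS` and `is_valid_score` are hypothetical, and it assumes that the range 0 to 3 is inclusive and that NA (here `None`) is an acceptable value.

```python
# Years in which 0.25-steps were also used, as listed in this comment.
QUARTER_STEP_YEARS = set(range(1989, 1991)) | set(range(2001, 2009))


def is_valid_score(score, year):
    """Return True if `score` is an allowed bud burst score for `year`."""
    if score is None:
        return True  # assumption: NA is allowed
    if not 0 <= score <= 3:
        return False  # assumption: the range is inclusive on both ends
    step = 0.25 if year in QUARTER_STEP_YEARS else 0.5
    # Multiples of 0.25 and 0.5 are exact in binary floating point,
    # so this division needs no tolerance.
    return (score / step).is_integer()


checks = {
    "quarter step in 2005": is_valid_score(1.75, 2005),
    "rounded value 1.8": is_valid_score(1.8, 2005),
    "quarter step in 1995": is_valid_score(1.75, 1995),
    "missing digit 5": is_valid_score(5, 1995),
}
```

A check like this only flags suspicious values. Deciding whether 1.8 should become 1.75, or whether 5 should become 0.5, still requires the field books.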

We did not check all TreeAllScores, which still contain some of the same types of errors that we found in TreeTopScore.

CherineJ commented 10 months ago

There are also errors in the TreeAllScores that are similar to those in TreeTopScore (scoring in 0.25 steps, rounding issues, missing digits, typos). We checked these against the field books as well (Thursday 21/12/2023) and forwarded them to the AnE database to be corrected.

CherineJ commented 9 months ago

All errors in TreeTopScore and TreeAllScore were fixed on 08/01/2024. Additionally, the missing observer IDs that are needed to fix #3 have been assigned.

CherineJ commented 8 months ago

Takeaway

We were able to solve most of the problems in the original data because we are the data owner: we could go back to the field books, talk to Marcel, etc. This is unlikely to be possible when the data come from elsewhere, so the same problems can be more severe there, especially if metadata such as a contact person is missing.