One case has "year" = "2001" and "date" = "11Nov2000". This happens because "year" is used as an identifier of the census and is repeated across individuals regardless of the specific measurement date. If a census spans two calendar years, this may cause problems, depending on how we handle the dates.
This reinforces the idea that we need a robust sub-routine for dates.
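As a rough illustration of what such a sub-routine might check, here is a minimal Python sketch. The names `DATE_FORMATS`, `parse_date` and `year_matches_date` are hypothetical, the list of formats is an assumption to be extended for the real data, and parsing month abbreviations such as "Nov" assumes an English locale.

```python
from datetime import date, datetime

# Candidate formats are assumptions; extend the tuple to match the real data.
# "%d%b%Y" covers values like "11Nov2000" (English month abbreviations).
DATE_FORMATS = ("%d%b%Y", "%Y-%m-%d")


def parse_date(text):
    """Return a date for the first matching format, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def year_matches_date(year, text):
    """True when the "year" column equals the calendar year of "date"."""
    parsed = parse_date(text)
    return parsed is not None and parsed.year == int(year)
```

With the case above, `year_matches_date("2001", "11Nov2000")` returns `False`. A mismatch like this need not be an error; it can simply flag that "year" is acting as a census label rather than as the measurement year.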
It could be good to add text in "date" saying "this can vary between trees within the same plot if they were measured on different dates", in contrast to text in "census id" saying "this should not change between trees measured in the same census, even if they were measured on different dates". This is rather obvious, but having parallel texts that highlight the differences between the variables can help.
Using "year" as a census id seems common. Also, when storing data in the wide format, people sometimes use the year in column names, e.g. dbh_2000, dbh_2003, dbh_2006, etc.
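To make the wide-format case concrete, the sketch below reshapes one wide row into long records and keeps the column suffix as a census id instead of interpreting it as a date. It is only an illustration under assumptions: `wide_to_long`, the `tree_id` key and the `variable_YYYY` column pattern are hypothetical, not part of any existing tool.

```python
import re

# Matches hypothetical wide-format columns such as dbh_2000.
WIDE_COLUMN = re.compile(r"^(?P<variable>[A-Za-z]+)_(?P<census>\d{4})$")


def wide_to_long(row, id_key="tree_id"):
    """Split one wide-format row into one record per census."""
    records = []
    for column, value in row.items():
        match = WIDE_COLUMN.match(column)
        if match is None:
            continue  # not a measurement column, e.g. the tree identifier
        records.append({
            id_key: row[id_key],
            "census_id": match["census"],  # a label, not a measurement date
            match["variable"]: value,
        })
    return records


# Made-up values, for illustration only.
row = {"tree_id": "T1", "dbh_2000": 10.2, "dbh_2003": 11.0, "dbh_2006": 11.9}
long_rows = wide_to_long(row)
```

Keeping the suffix as a string in `census_id` avoids implying that every tree in census "2000" was measured in the calendar year 2000; the actual measurement date would have to come from a separate "date" field.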