Open M-Nicholls opened 8 years ago
I would like to make the case for keeping FIRST_OF_MONTH. If more than 10 per cent of records have this flag, there are more than three times as many records from the first of the month as from any other day of the month, on average. This is much more likely to be because an unknown day of the month is often stored as the first of the month (most RDBMSs do not allow 0 for the day or month) than because more collections or observations were actually made on the first of the month.
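To make the arithmetic behind the "three times" figure concrete, here is a minimal sketch. The function names and the average month length of 30.44 days are assumptions for illustration only; they are not part of any existing test implementation.

```python
from datetime import date


def first_of_month_share(dates):
    """Fraction of the given dates that fall on day 1 of a month."""
    dates = list(dates)
    if not dates:
        return 0.0
    return sum(1 for d in dates if d.day == 1) / len(dates)


def excess_ratio(share, days_per_month=30.44):
    """How many times more records day 1 holds than an average other day.

    Assumes (hypothetically) that the records not on day 1 are spread
    evenly over the remaining days of an average-length month.
    """
    other_days = days_per_month - 1
    per_other_day = (1 - share) / other_days
    return share / per_other_day


if __name__ == "__main__":
    # With 10 per cent of records on the first, the ratio is a little above 3.
    print(round(excess_ratio(0.10), 2))
```

With an even spread of dates the share on day 1 would be roughly 1 in 30, so a 10 per cent share is a strong hint that day 1 is being used as a placeholder.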
FIRST_OF_MONTH is an important flag, in fact more important than FIRST_OF_YEAR, because records for which only the event day is unknown are much more frequent than those for which the month is unknown as well. There is not really a test for the first day of the century, is there?
First day of the century might be useful for identifying dates that had two-digit years before conversion to ISO 8601 and may need investigation. I.e., if the year was "00" or "0" in the original file, the date may turn out to be the first day of either 1900 or 2000, depending on how the ISO 8601 conversion is done.
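A rough sketch of that ambiguity, assuming a simple pivot rule for expanding two-digit years. Both helper names and the default pivot of 69 are hypothetical, and "first day of the century" is taken here, as in this thread, to mean 1 January of a year ending in 00.

```python
from datetime import date


def expand_two_digit_year(yy, pivot=69):
    """Expand a two-digit year; values at or above the pivot map to 19xx."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 1900 + yy if yy >= pivot else 2000 + yy


def is_first_day_of_century(d):
    """True for 1 January of a year ending in 00 (assumed definition)."""
    return d.year % 100 == 0 and d.month == 1 and d.day == 1


if __name__ == "__main__":
    # The same source value "00" lands in a different century
    # depending on the pivot chosen by the conversion.
    for pivot in (69, 0):
        d = date(expand_two_digit_year(0, pivot=pivot), 1, 1)
        print(d.isoformat(), is_first_day_of_century(d))
```

Either result is a valid ISO 8601 date, so a validity check alone would not catch it; a dedicated flag would at least mark such records for review.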
Surely the invalidCollectionDate test takes care of these situations? This is one that I accidentally delivered: http://avh.ala.org.au/occurrences/119a1287-f0ed-4154-b3cf-7e6ce1ee5834.