FreeUKGen / MyopicVicar

MyopicVicar (short-sighted clergyman!) is an open-source genealogy record database and search engine. It powers the FreeREG database of parish registers, the FreeCEN database of census records, the next version of FreeBMD database of Civil Registration indexes and other Genealogical applications.

Validation of extended CSV fields is not being done properly #2759

Open stsccfr opened 3 weeks ago

stsccfr commented 3 weeks ago

I discovered two files in the Staffordshire database each of which contained an invalid extended CSV field, and yet the FreeREG server had accepted them. I then uploaded a baptism file, STSABCBA.CSV, which had BAPTISMS in the first header line, but whose FlexCSV fields were all for marriages (marriage_date, groom_name, etc). That file was also accepted by the server. Another consequence of the bad processing is that the date range reported for the file is wrong. The file contains a single entry dated 1608, but the date range reported by the server (in the email it sends out) is 2024-1300.

STSABCBA.CSV

I think the culprit is in the valid_field_definition() function in lib/new_freereg_csv_update_processor.rb, specifically this line:

entry_fields = Freereg1CsvEntry.attribute_names

This defines the valid field names as the entire collection of named attributes in the Freereg1CsvEntry class: every FlexCSV field for all 3 event types (baptisms, marriages, burials), plus anything else that happens to be defined in that class. As a result, one can construct a CSV file using any random selection of the field names for the 3 events and the server will accept it. Proper validation of uploaded files is important, especially given that they don't all come from a single source.

Instead of the class attribute names, I think we should be using the lists of event-specific FlexCSV field names defined in lib/freereg_options_constants.rb.
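
To illustrate the idea, here is a minimal sketch of event-specific validation. It is not the actual processor code: the constant FLEX_FIELDS_BY_EVENT, the method names, and the field names in the lists are all hypothetical stand-ins, and the real lists would come from lib/freereg_options_constants.rb, whose constant names and contents may differ.

```ruby
# Hypothetical per-event field lists, for illustration only. The real lists
# live in lib/freereg_options_constants.rb and may use different names.
FLEX_FIELDS_BY_EVENT = {
  'BAPTISMS'  => %w[baptism_date birth_date person_forename father_forename],
  'MARRIAGES' => %w[marriage_date groom_forename groom_surname bride_surname],
  'BURIALS'   => %w[burial_date burial_person_forename burial_person_surname]
}.freeze

# Returns the header fields that are not allowed for the given record type.
# Field names are normalised (trimmed, lower-cased) before comparison.
def invalid_fields_for(record_type, header_fields)
  allowed = FLEX_FIELDS_BY_EVENT.fetch(record_type.to_s.upcase) do
    raise ArgumentError, "unknown record type: #{record_type.inspect}"
  end
  header_fields.map { |field| field.to_s.strip.downcase } - allowed
end

# True only when every header field belongs to the file's own event type.
def valid_field_definition?(record_type, header_fields)
  invalid_fields_for(record_type, header_fields).empty?
end
```

With a check along these lines, a file declared as BAPTISMS whose header contains marriage fields would be rejected, and invalid_fields_for would name the offending fields so they could be reported back to the uploader.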