Processing a worksheet creates EmptyField instances for columns when the worksheet max_rows extends beyond the data table. These trailing EmptyField instances are then used to validate data rows by checking for invalid content. If there is none, then the worksheet is just reading in extra fields, which it sometimes does 🤷.
However, at the moment, the number of rows and the row metadata written to the dataset metadata include these empty fields. These need to be removed or updated.
I think the best place to do this is in the Dataworksheet.to_dict method, rather than heavily updating the class itself: we just want the metadata to be accurate, and the internal representation of the worksheet is best left close to the actual source worksheet.
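As a rough illustration of the idea, here is a minimal sketch of a to_dict method that drops trailing EmptyField instances only from the exported metadata, leaving the stored fields untouched. Everything in it is an assumption made for illustration, not the real code: the Field, EmptyField and WorksheetSketch classes are simplified stand-ins, and the "n_fields" and "fields" keys are hypothetical names for whatever the dataset metadata actually records.

```python
class Field:
    """Hypothetical stand-in for a populated data field."""

    def __init__(self, name):
        self.name = name

    def to_dict(self):
        return {"field_name": self.name}


class EmptyField(Field):
    """Hypothetical stand-in for a placeholder field with no content."""

    def __init__(self):
        super().__init__(name=None)


class WorksheetSketch:
    """Simplified stand-in for the worksheet class (not the real API)."""

    def __init__(self, name, fields):
        self.name = name
        # Internal representation stays close to the source worksheet,
        # including any trailing EmptyField placeholders.
        self.fields = fields

    def to_dict(self):
        # Work on a copy so the internal field list is never modified.
        reported = list(self.fields)
        # Strip placeholders from the end only; an EmptyField between
        # real fields is kept because it is not a trailing artefact.
        while reported and isinstance(reported[-1], EmptyField):
            reported.pop()
        return {
            "name": self.name,
            "n_fields": len(reported),  # hypothetical metadata key
            "fields": [fld.to_dict() for fld in reported],
        }


sheet = WorksheetSketch(
    "data", [Field("site"), Field("count"), EmptyField(), EmptyField()]
)
meta = sheet.to_dict()
```

In this sketch, meta reports two fields while sheet.fields still holds all four, which matches the goal of accurate metadata without changing how the worksheet is represented internally.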