Open frreiss opened 3 years ago
Recent versions of Watson Discovery have made undocumented changes to the format of the output of the Table Understanding enrichment. The old column names are documented at https://cloud.ibm.com/docs/discovery-data?topic=discovery-data-understanding_tables#table-output-schema
Rough translation of field names into the new naming convention:
new_name_to_old = { "row_min": "row_index_begin", "row_max": "row_index_end", "column_min": "column_index_begin", "column_max": "column_index_end", "cell_text": "text", "id": "cell_id" }
Also, the `location` field at the top of the `table` record now appears to be optional.
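A minimal sketch of reading the location defensively, so that records with and without it are both accepted. The helper name `table_location` is hypothetical, and the `begin`/`end` keys inside `location` are an assumption based on the documented old schema; adjust if the actual records differ.

```python
from typing import Optional, Tuple


def table_location(table_record: dict) -> Optional[Tuple[Optional[int], Optional[int]]]:
    """Return (begin, end) for a table record, or None if no location is present.

    Assumption: when present, "location" is a dict with "begin" and "end" keys.
    """
    location = table_record.get("location")
    if location is None:
        # Newer output may omit the field entirely.
        return None
    return location.get("begin"), location.get("end")
```

Callers can then treat `None` as "location unknown" instead of raising a `KeyError`.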
Our conversion to Pandas needs to be updated to cover both the old schema and the new schema.
I recommend that we first determine which schema is the canonical one and convert non-canonical schemas to the canonical one as a preprocessing step.
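One possible shape for that preprocessing step is sketched below. It assumes, purely for illustration, that the old (documented) names are treated as canonical; that decision is still open. The function name `to_old_schema` is hypothetical. The sketch renames keys wherever they occur in the nested record, which is simple but blunt: a key such as `id` would be renamed even outside of cell objects, so a real implementation may need to restrict the renaming to cell entries.

```python
from typing import Any

# Mapping from the rough translation above: new field name -> old field name.
NEW_NAME_TO_OLD = {
    "row_min": "row_index_begin",
    "row_max": "row_index_end",
    "column_min": "column_index_begin",
    "column_max": "column_index_end",
    "cell_text": "text",
    "id": "cell_id",
}


def to_old_schema(node: Any) -> Any:
    """Return a copy of `node` with new-style keys renamed to old-style keys.

    Walks nested dicts and lists. Keys that are not in the mapping, and
    records already in the old schema, pass through unchanged.
    """
    if isinstance(node, dict):
        return {NEW_NAME_TO_OLD.get(key, key): to_old_schema(value)
                for key, value in node.items()}
    if isinstance(node, list):
        return [to_old_schema(item) for item in node]
    return node
```

Running every record through such a function before the Pandas conversion would let the conversion code itself handle a single schema.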