BritishGeologicalSurvey / pyagsapi

pyagsapi - An AGS Utilities API with AGS v4.x Schema Validation & Converter for .ags<-->.xslx files
https://britishgeologicalsurvey.github.io/pyagsapi/
GNU Lesser General Public License v3.0
12 stars 2 forks source link

Allows creation of invalid .ags from .xlsx files #53

Closed KoalaGeo closed 7 months ago

KoalaGeo commented 2 years ago

To reproduce AGS data export from Oracle to create hand crafted xlsx files of AGS data Convert to .ags using the agsapi Creating invalid ags files as theres issues with the source xlsx.....

Expected behaviour

Validate data - if not valid don't perform conversion - print error log

MMG Test.xlsx Validation.txt

ximenesuk commented 1 year ago

The conversion needs to happen to perform the validation. So I would expect the behaviour here is to just provide the validation output but not return the .ags file?

Should the validation be against the AGS rules only?

volcan01010 commented 1 year ago

Validating test/files/example_ags.ags gives no errors. But when you convert it to an .xlsx file and try to convert it back it fails. The validation report on the .ags created from the .xlsx is as follows:

 {'filename': 'example_ags.xlsx',
  'filesize': 4039,
  'time': datetime.datetime(2023, 2, 28, 17, 42, 33, 986140, tzinfo=datetime.timezone.utc),
  'message': 'ERROR: Converted file is not valid AGS',
  'dictionary': 'Standard_dictionary_v4_1_1.ags',
  'errors': {'AGS Format Rule 15': [{'line': 41,
     'group': 'UNIT',
     'desc': 'Unit "DegC" not found in UNIT table.'},
    {'line': 41,
     'group': 'UNIT',
     'desc': 'Unit "bar" not found in UNIT table.'},
    {'line': 41,
     'group': 'UNIT',
     'desc': 'Unit "yyyy-mm-ddThh:mm" not found in UNIT table.'}]},
  'checkers': ['python_ags4 v0.4.1'],
  'valid': False,
  'additional_metadata': {}}
volcan01010 commented 1 year ago

This should not really be a surprise, as these missing units are not used within the file. Perhaps they are required by the 4.1.1 dictionary?

volcan01010 commented 1 year ago

They are: https://gitlab.com/ags-data-format-wg/ags-python-library/-/blob/main/python_ags4/Standard_dictionary_v4_1_1.ags

So maybe we need to create an example AGS file that is based on the 4.1.1 dictionary so that we can make our tests pass? The current one is based on the 4.1 dictionary.

Or we need to modify the API so that you can specify the dictionary to use during the conversion. It seems like that would be the best thing to do. But it is a wee bit extra work.

@KoalaGeo - everthing else is merged. Do you want to release without this feature? I'm on a sprint for something else this week.

ximenesuk commented 1 year ago

Or we need to modify the API so that you can specify the dictionary to use during the conversion. It seems like that would be the best thing to do. But it is a wee bit extra work.

I think this should be the preferred solution for the API, allow the full functionality of the underlying library to be accessed. If after that ags -> xlsx -> ags shows differences for a fixed dictionary then that's a library issue.

Whether the web page offers a choice of dictionary or just uses the latest is another thing.

KoalaGeo commented 1 year ago

They are: https://gitlab.com/ags-data-format-wg/ags-python-library/-/blob/main/python_ags4/Standard_dictionary_v4_1_1.ags

So maybe we need to create an example AGS file that is based on the 4.1.1 dictionary so that we can make our tests pass? The current one is based on the 4.1 dictionary.

Or we need to modify the API so that you can specify the dictionary to use during the conversion. It seems like that would be the best thing to do. But it is a wee bit extra work.

@KoalaGeo - everthing else is merged. Do you want to release without this feature? I'm on a sprint for something else this week.

Yeah, this & the accepting .zip can wait for a 4.1 when you're next able to look at it.

FYI going to be some messy trial & error with releases & the ci until I get it to work. Currently won't build tagged releases - only latest

KoalaGeo commented 7 months ago

Closing as behavior is as expected