Moving to AGS 0.5 - Githubissues

KoalaGeo commented 5 months ago

The problem:

app directs output to one of two pages: • 'Success' results page assuming validation has successfully run • 'Fail' page if exception raised by the library. I pick up the AGS4Error messages raised and print them topped and tailed with my own message.

This was working just fine on the old version (and still does – I did a revert on my code to check, running it in the 'original' virtual environment).

The problem is that I cannot get the new version of the app to 'fail', even when fed a file that should fail, e.g. with wrong number of headings in a group, as per the 'badlybroken' file attached. It always diverts to the success page, with no exception messages picked up. The ags error's then output as if validation had fully run (weird?) although the results typically reflect the expected problem.

Upon further investigation, it appears that my app is no longer picking up AGS4.AGS4Error exceptions, or indeed any exception raised. However, messages are being output to the console that show that errors are being detected. From a detailed look at both my code and python-ags4, both old and new versions, I cannot figure out why this is happening.

To take my app code out of the question, I tried running the library simply using a few lines of code. I ran it using both the 0.4.1 and 0.5.0 libraries. I did this by running the same code in two different virtual environments: • venv used for 0.4.1, with 0.4.1 active, other dependencies as they were at the time, and Python 3.8 • venv I'm using for the 0.5.0, other updated dependencies, Python 3.12 I found that 0.4.1 behaves as expected, i.e. exception raised for bad file. But for 0.5.0 no exception is thrown!

To test whether this behaviour may be to do with the Python version or dependencies I ran a few lines of code to see if there is any difference between the environments for a simple div/0 error. For this, behaviour was as expected in both cases, i.e. exception raised.

So this suggests that there may be something different going on in python-ag4, i.e. failing to throw exceptions, even though I can't see why! I know you have added the logger stuff, but can't see why that should make a difference.

This was indeed tricky to figure out. The check_file() function now catches all exceptions internally and then adds them to the error report. This is to ensure that a full error report is created for end-users even when some unforeseen error occurs. It is certainly something that should have been in the changelog.

The good news is that anything that causes an AGS4Error exception will already have an informative error message in the report. If you want your app to behave as before and give a Pass/Fail message, then I would try to detect whether an exception was raised by reading the logger or stdout with the print_output option set to True.

Still don't understand it though – I thought that Python 'raise' meant raise an exception and exit – game over!

concerns:

If an error is raised that leads to early termination of validation:

It is not clear from the main output that validation has been curtailed
There is a message in the console log – but some may not be paying attention to that (or may not have access to those messages at all)
Even if seen, the console message does not say that validation has been curtailed
Trying to unpick key messages from the logger object is horrible in practice, because a lot of stuff – traceback etc – deprecation warnings (yes, still there) - ends up in this. It also took me a while to figure out how to access it – not straightforward – difficult for low/mid-level coders like me.

I'm not saying we need to roll back, but I think we do need to look again at how these early exit events are flagged. I have some ideas we can discuss.

I know this is a pain at this late stage, but this change has only just become apparent, and I think it is pretty important. Certainly needs documenting.

Perhaps we also need to have some test cases where we deliberately crash the validator, if that is possible

volcan01010 commented 4 months ago

https://gitlab.com/ags-data-format-wg/ags-python-library

volcan01010 commented 4 months ago

Notes on upgrading to 0.5.0

Running pytest test/unit -v runs the tests on the underlying logic, which is unaffected by updates/mismatches with FastAPI and associated code.

With the upgraded library, 10 of 80 tests fail. All failures are due to extra values appearing in the error lists reported by the AGS4 checker. These values have keys such as "Summary of data" or "Warning (Related to Rule 16)" or "General".

The addition of the "Summary of data" to the list of errors means even a valid file will report at least 1 error using the new library.

Running pytest test/integration -v runs the tests on the full API. 19 tests out of 55 fail. These errors are more opaque as they are wrapped in ExceptionGroups. Some may related to changes in other dependencies e.g. the upgrade to Pandas 2.x.

volcan01010 commented 4 months ago

We are already popping the Metadata field from the list of errors and returning the contents separately. We could do similar with the Summary (or include the information in the metadata).

ximenesuk commented 4 months ago

All tests now passing, mainly through changing our fixtures to reflect changes to the library responses. Notes on commits

Rule 7 is no longer broken: for one of our test files, this might be worth looking at since it reflects a change in the rule itself
Add extra General item (two commits): Additional advisory text but inconsistent dict ordering in JSON, RAISE ISSUE
- Add utf-8 warning text: non-ascii characters in rule 1 has additional text advising of 'utf-8' assumption
- Switch table to group: change in terminology when groups are not found
- Fix data in valid example AGS file: this gets our minimal ags example file past new rules and warnings
- Change expected results for check_ags: some test files not break more rules, may be worth checking
- Add Warning and FYI to VALID_KEYS: so our parsing of the errors doesn't fail
- Discard unecessary summary from errors dictionary: just pop the 'Summary...' for now, we could add this to our response in the way we handle metadata
- Fix sort_tables in API and tests: see below
- Use sorting_strategy: alphabetical & None: XL sheets can now be sorted in multiple ways, as it is not clear how 'hierarchical' and 'dictionary' sorts work we continue to support only 'alphabetical' and None through the existing boolean flag.

KoalaGeo commented 4 months ago

Decisions on new Error messages:

Add Summary to the BGS Metadata https://github.com/BritishGeologicalSurvey/pyagsapi/issues/143 (implications for https://github.com/BritishGeologicalSurvey/pyagsapi/issues/95)
AGS Check only returns Summary?
BGS & AGS Check returns Metadata & Summary?
Add Warning and FYI separately to Errors, so theres FYI, Warning and Errors groups in the response. They can then be listed independently by the gui

BritishGeologicalSurvey / pyagsapi

Moving to AGS 0.5 #131

Notes on upgrading to 0.5.0