BritishGeologicalSurvey / pyagsapi

pyagsapi - An AGS Utilities API with AGS v4.x Schema Validation & Converter for .ags<-->.xslx files
https://britishgeologicalsurvey.github.io/pyagsapi/
GNU Lesser General Public License v3.0
12 stars 2 forks source link

Moving to AGS 0.5 #131

Open KoalaGeo opened 5 months ago

KoalaGeo commented 5 months ago

The problem:

app directs output to one of two pages: • 'Success' results page assuming validation has successfully run • 'Fail' page if exception raised by the library. I pick up the AGS4Error messages raised and print them topped and tailed with my own message.

This was working just fine on the old version (and still does – I did a revert on my code to check, running it in the 'original' virtual environment).

The problem is that I cannot get the new version of the app to 'fail', even when fed a file that should fail, e.g. with wrong number of headings in a group, as per the 'badlybroken' file attached. It always diverts to the success page, with no exception messages picked up. The ags error's then output as if validation had fully run (weird?) although the results typically reflect the expected problem.

Upon further investigation, it appears that my app is no longer picking up AGS4.AGS4Error exceptions, or indeed any exception raised. However, messages are being output to the console that show that errors are being detected. From a detailed look at both my code and python-ags4, both old and new versions, I cannot figure out why this is happening.

To take my app code out of the question, I tried running the library simply using a few lines of code. I ran it using both the 0.4.1 and 0.5.0 libraries. I did this by running the same code in two different virtual environments: • venv used for 0.4.1, with 0.4.1 active, other dependencies as they were at the time, and Python 3.8 • venv I'm using for the 0.5.0, other updated dependencies, Python 3.12 I found that 0.4.1 behaves as expected, i.e. exception raised for bad file. But for 0.5.0 no exception is thrown!

To test whether this behaviour may be to do with the Python version or dependencies I ran a few lines of code to see if there is any difference between the environments for a simple div/0 error. For this, behaviour was as expected in both cases, i.e. exception raised.

So this suggests that there may be something different going on in python-ag4, i.e. failing to throw exceptions, even though I can't see why! I know you have added the logger stuff, but can't see why that should make a difference.


This was indeed tricky to figure out. The check_file() function now catches all exceptions internally and then adds them to the error report. This is to ensure that a full error report is created for end-users even when some unforeseen error occurs. It is certainly something that should have been in the changelog.

The good news is that anything that causes an AGS4Error exception will already have an informative error message in the report. If you want your app to behave as before and give a Pass/Fail message, then I would try to detect whether an exception was raised by reading the logger or stdout with the print_output option set to True.


Still don't understand it though – I thought that Python 'raise' meant raise an exception and exit – game over!

concerns:

If an error is raised that leads to early termination of validation:

I'm not saying we need to roll back, but I think we do need to look again at how these early exit events are flagged. I have some ideas we can discuss.

I know this is a pain at this late stage, but this change has only just become apparent, and I think it is pretty important. Certainly needs documenting.

Perhaps we also need to have some test cases where we deliberately crash the validator, if that is possible

volcan01010 commented 4 months ago

https://gitlab.com/ags-data-format-wg/ags-python-library

volcan01010 commented 4 months ago

Notes on upgrading to 0.5.0

Running pytest test/unit -v runs the tests on the underlying logic, which is unaffected by updates/mismatches with FastAPI and associated code.

With the upgraded library, 10 of 80 tests fail. All failures are due to extra values appearing in the error lists reported by the AGS4 checker. These values have keys such as "Summary of data" or "Warning (Related to Rule 16)" or "General".

The addition of the "Summary of data" to the list of errors means even a valid file will report at least 1 error using the new library.

Running pytest test/integration -v runs the tests on the full API. 19 tests out of 55 fail. These errors are more opaque as they are wrapped in ExceptionGroups. Some may related to changes in other dependencies e.g. the upgrade to Pandas 2.x.

volcan01010 commented 4 months ago

We are already popping the Metadata field from the list of errors and returning the contents separately. We could do similar with the Summary (or include the information in the metadata).

ximenesuk commented 4 months ago

All tests now passing, mainly through changing our fixtures to reflect changes to the library responses. Notes on commits

KoalaGeo commented 4 months ago

Decisions on new Error messages: