metno / discovery-metadata-catalog-ingestor

Apache License 2.0

pycsw_dist update is failing #236

Closed mortenwh closed 2 weeks ago

mortenwh commented 1 month ago

The recurring issue with pycsw returning "Record serialization failed: Start tag expected, '<' not found, line 1, column 1 (<string>, line 1)" is caused by the update command in DMCI.

We need to check the update functionality and see if we can change something in DMCI or if it is really a problem in pycsw.

magnarem commented 1 month ago

It could be that something fails in the csw_dist. It is very hacky and complex, using the requests module to send these requests to CSW.

Would it not be a much better approach to use owslib.csw for inserting and updating once the incoming MMD is translated to ISO?

owslib would also probably give back more sane error messages, and it offers more control over transactions against CSW.
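
To make the suggestion concrete, here is a minimal sketch of an insert routed through an owslib-style client rather than hand-built requests. The function name insert_iso_record is hypothetical. The sketch assumes the client behaves like owslib.csw.CatalogueServiceWeb, i.e. that it has a transaction() method accepting ttype, typename and record, and that it exposes a results dict with an "inserted" count afterwards. This has not been checked against DMCI or the deployed pycsw.

```python
def insert_iso_record(csw, iso_xml: bytes, typename: str = "gmd:MD_Metadata") -> int:
    """Insert one ISO record via a CSW-T client and return the inserted count.

    `csw` is assumed to be an owslib.csw.CatalogueServiceWeb instance, e.g.
        from owslib.csw import CatalogueServiceWeb
        csw = CatalogueServiceWeb("https://example.org/csw")  # placeholder URL
    Any object with the same transaction()/results interface works too.
    """
    # The client builds the transaction XML, so no manual request body is needed.
    csw.transaction(ttype="insert", typename=typename, record=iso_xml)
    # Assumption: the client stores per-operation counts in `results`.
    return int(csw.results.get("inserted", 0))
```

A return value other than 1 for a single record would be a reason to reject the ingestion instead of silently continuing.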

mortenwh commented 1 month ago

I have thought a bit about this, and I am pretty sure that the pycsw update method only takes in the changes, not the entire xml file when making updates. This is a bit challenging to encode, so the easiest solution is probably to just delete and reinsert the dataset. Note that with #239, we should force the deletion to avoid just changing the MMD file.

magnarem commented 1 month ago

OK. That probably makes sense. When I look at the owslib.csw docs (https://owslib.readthedocs.io/en/latest/usage.html#csw), there is no example of using update to replace the whole XML, only of changing the values of individual fields. So the solution is probably to modify the update function in pycsw_dist so that it first calls the delete function and then the insert function.
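
A rough sketch of the delete-then-insert update described above, written against an owslib-style client. The function name update_by_reinsert is hypothetical and this is not the existing pycsw_dist code. It assumes the client's transaction() accepts ttype, typename, identifier and record, and reports counts in a results dict, as owslib.csw.CatalogueServiceWeb is understood to do.

```python
def update_by_reinsert(csw, identifier: str, iso_xml: bytes,
                       typename: str = "gmd:MD_Metadata") -> tuple:
    """Emulate a full-record update as a delete followed by an insert.

    Returns (deleted_count, inserted_count). `csw` is assumed to follow the
    owslib.csw.CatalogueServiceWeb interface (transaction() plus `results`).
    """
    # Step 1: remove the existing record by its identifier.
    csw.transaction(ttype="delete", typename=typename, identifier=identifier)
    deleted = int(csw.results.get("deleted", 0))

    # Step 2: insert the complete, freshly translated record.
    csw.transaction(ttype="insert", typename=typename, record=iso_xml)
    inserted = int(csw.results.get("inserted", 0))

    # The two steps are not atomic: if the insert fails after the delete
    # succeeded, the record is gone from the catalogue, so fail loudly.
    if inserted != 1:
        raise RuntimeError(
            f"Reinsert of {identifier} failed (deleted={deleted}, inserted={inserted})"
        )
    return deleted, inserted
```

A deleted count of 0 is tolerated here, so an update for a record that is missing from the catalogue still ends with the record inserted. Whether that is the desired behaviour is a design choice for DMCI.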