INSPIRE-MIF / helpdesk-geoportal

Community discussion for INSPIRE geoportal topics

AT: Trigger new harvest #210

Open manilly opened 3 days ago

manilly commented 3 days ago

Hi, could you please trigger a new harvest for the Austrian CSW and publish the results in the Geoportal frontend? Thanks, Manuel

jescriu commented 3 days ago

Dear @manilly, I just triggered the harvesting process for your endpoint, but it stopped unexpectedly before processing all records. Could you please check if your server is stable? I will now start harvesting DE, which was already scheduled for today. When it has finished, I will try your endpoint again.

manilly commented 2 days ago

Our endpoint seems to work fine. Could you please try again?

jescriu commented 2 days ago

Will do. DE also had the same issue after having harvested 3K records. Our ICT team is checking the production console.

jescriu commented 2 days ago

Dear @manilly, We analysed the logs for the AT and DE harvests launched yesterday, and the issue seems to be caused by errors in the GetRecord function returning fewer records than requested (20 at a time, as configured in the harvester).
This may be due to incorrect indexing of some records. Harvests we ran against other endpoints completed correctly. Could you please run a reindexing of your catalogue (AT) and let us know when it has finished? We are doing the same in the production harvesting console.
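The symptom described above (a page coming back with fewer records than requested) can be checked from the response alone. The following is a minimal sketch, assuming the paging uses the CSW 2.0.2 GetRecords operation and that the response carries the standard `csw:SearchResults` counters. The helper names `page_summary` and `is_short_page` are hypothetical and are not taken from the harvester.

```python
import xml.etree.ElementTree as ET

# Namespace of CSW 2.0.2 response documents.
CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"


def page_summary(response_xml):
    """Read the paging counters from a csw:GetRecordsResponse document."""
    root = ET.fromstring(response_xml)
    results = root.find(f"{{{CSW_NS}}}SearchResults")
    if results is None:
        raise ValueError("response has no csw:SearchResults element")
    return {
        "matched": int(results.get("numberOfRecordsMatched", "0")),
        "returned": int(results.get("numberOfRecordsReturned", "0")),
        "next_record": int(results.get("nextRecord", "0")),
        # Number of record elements actually present in the page.
        "children": len(results),
    }


def is_short_page(summary, start_position, page_size=20):
    """True if the page holds fewer records than it should.

    start_position is 1-based, as in CSW. The last page may legitimately
    be smaller, so the expectation is capped by the records remaining.
    """
    remaining = max(0, summary["matched"] - start_position + 1)
    expected = min(page_size, remaining)
    return min(summary["returned"], summary["children"]) < expected
```

Running this over each page of a harvest, with `page_size=20` as configured in the harvester, would show at which start position the catalogue first returns a short page.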

manilly commented 2 days ago

Dear @jescriu, I just did a reindexing. Could you try again? We currently have 1682 metadata entries.

manilly commented 2 days ago

[image]

manilly commented 2 days ago

I just tried a harvest in the sandbox: only 1643 of the 1682 metadata records were harvested. Datasets, services and series are missing. I'll try to figure out which ones.
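One way to find which records are missing is to compare the file identifiers on both sides. This is a minimal sketch, assuming both the catalogue export and the harvest result are available as XML documents containing ISO 19139 (`gmd`) records. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# ISO 19139 namespaces used by the metadata records.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"


def file_identifiers(xml_text):
    """Collect every gmd:fileIdentifier value found in the document."""
    root = ET.fromstring(xml_text)
    path = f".//{{{GMD}}}fileIdentifier/{{{GCO}}}CharacterString"
    return {
        el.text.strip()
        for el in root.iterfind(path)
        if el.text and el.text.strip()
    }


def missing_records(source_xml, harvested_xml):
    """Identifiers present in the source catalogue but not in the harvest."""
    return sorted(file_identifiers(source_xml) - file_identifiers(harvested_xml))
```

The resulting list can then be checked record by record, for example with GetRecordById requests.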

jescriu commented 2 days ago

Please make sure that all records are correctly reindexed. Sometimes a record is not accessible (e.g. the XML view is blank) because of an invalid XML structure. It is worth checking with e.g. Notepad++. Obviously, such records are not accessible through the CSW either.
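The same check can be scripted instead of opening each record in an editor. This is a minimal sketch that only tests whether a record is blank or not well-formed. It does not validate against the ISO schemas, and the function name `xml_problem` is hypothetical.

```python
import xml.etree.ElementTree as ET


def xml_problem(document):
    """Return None if the document parses, else a short problem description."""
    # A blank XML view shows up here as an empty or whitespace-only document.
    if not document or not document.strip():
        return "empty document"
    try:
        ET.fromstring(document)
    except ET.ParseError as err:
        # ParseError carries the line and column of the first problem found.
        return f"not well-formed: {err}"
    return None
```

Applied to the XML of every record in the catalogue, any result other than `None` points to a record that is worth inspecting by hand.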

manilly commented 2 days ago

For example, this one wasn't harvested in the sandbox: https://geometadaten.lfrz.at/at.lfrz.discoveryservices/srv/ger/csw?service=CSW&version=2.0.2&request=GetRecordById&outputschema=http://www.isotc211.org/2005/gmd&elementSetName=full&id=f2e11a84-cdc7-4cfa-b048-da3675d58704
I can't find any difference from a similar metadata record which was successfully harvested (https://geometadaten.lfrz.at/at.lfrz.discoveryservices/srv/ger/csw?service=CSW&version=2.0.2&request=GetRecordById&outputschema=http://www.isotc211.org/2005/gmd&elementSetName=full&id=3ebf4590-e670-448c-ac38-5a04c7d96307)
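A structural comparison of the two records can complement the visual check. This is a minimal sketch, assuming both GetRecordById responses have been saved as XML text. It compares only which element paths occur in each record, not their values or attributes, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Return the set of element paths (namespace-qualified tags) in a document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}"
        paths.add(path)
        for child in element:
            walk(child, path)

    walk(root, "")
    return paths


def structural_diff(first_xml, second_xml):
    """Paths found only in the first record, and paths found only in the second."""
    first, second = element_paths(first_xml), element_paths(second_xml)
    return sorted(first - second), sorted(second - first)
```

Two empty lists mean the records use the same elements, in which case the difference is more likely to be in the values or in how the catalogue indexed the record.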

jescriu commented 2 days ago

Thanks for the example. We will take a look if needed. Let us finish the reindexing on our side as well. It takes a long time (~341700 metadata records).

jescriu commented 1 day ago

Dear @manilly - After the reindexing, we got the same harvesting error after a while. Any results from the checks on your side?

manilly commented 1 day ago

I tried an external harvest into a blank GeoCat GN instance (3.12.13) and successfully got 1678 records.

[image]

jescriu commented 1 day ago

We will try to figure out what is happening.