INSPIRE-MIF / helpdesk-geoportal

Community discussion for INSPIRE geoportal topics

Problem with harvesting (LUX): availability through view services randomly reported missing although "Resource linkages checker tool" shows that it is available #79

Closed MatAlps closed 2 years ago

MatAlps commented 2 years ago

Hi,

Could you have a look at this?

We still have the problem.

In our last harvest, the following record https://catalog.inspire.geoportail.lu/geonetwork/srv/eng/catalog.search#/metadata/2a18c162-b9fd-4097-b367-e47de238f4f5 is reported without download and view links: image

But performing the test manually with the link checker https://inspire-geoportal.ec.europa.eu/linkagechecker.html shows no issue: image With

Could you:

Thx!

jrc-inspire commented 2 years ago

The functionality of the INSPIRE Geoportal backend is being migrated to a GeoNetwork-based architecture.

Although we were not planning to address open issues affecting the current INSPIRE Geoportal backend because of this migration, we are investigating the possible cause of the reported drop in the number of viewable datasets for your endpoint. So far we have no results.

It would be helpful if you could check in parallel the logs in your infrastructure to identify any potential issues from your side.

Please contact us at JRC-INSPIRE-SUPPORT@ec.europa.eu in case you have any urgent request.

MatAlps commented 2 years ago

Hi,

Thanks a lot for your answer. We are checking our logs and did indeed find a lot of timeouts: NGINX status 499 "Client closed connection". We also see that the calls from the validator are unique most of the time (very good).

Those 499s happen when the client (the validator in this case) closes the connection. At the moment we cannot log the request time on our side; we are working on it. In the meantime, it would help us a lot if you could tell us what timeout value (in ms) the validator uses before closing the connection. Do you have this information readily available?

Thx in advance

Mathieu
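The log check described in the comment above could be sketched roughly as follows. This is a minimal illustration, not the script used by the Luxembourg team: it assumes the NGINX "combined" access log format with `$request_time` appended as the last field, and the function name `summarize_499` is hypothetical.

```python
import re
from collections import Counter

# Assumption: NGINX "combined" log format with $request_time appended at the end.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "[^"]*")?(?: (?P<request_time>\d+\.\d+))?\s*$'
)

def summarize_499(lines):
    """Count 499 responses per OGC operation and collect their request times."""
    counts = Counter()
    durations = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("status") != "499":
            continue
        # The OGC operation name is taken from the "request=" query parameter.
        op = re.search(r"request=(\w+)", m.group("request"), re.IGNORECASE)
        counts[op.group(1) if op else "unknown"] += 1
        if m.group("request_time"):
            durations.append(float(m.group("request_time")))
    return counts, durations
```

If the request times of the 499 entries cluster around a fixed value per operation, that value is a reasonable estimate of the client-side timeout.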

jescriu commented 2 years ago

Dear @MatAlps, We will analyse your request and we will come back to you as soon as possible. Best, Jordi

MatAlps commented 2 years ago

Dear @jescriu ,

Thx for your answer.

In the meantime, we have measured a timeout of 5 seconds for GetMap and 10 seconds for GetCapabilities. We are working on our side to provide response times below those values during a harvest.
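As a rough illustration of checking response times against these values, here is a minimal sketch. The 5 s and 10 s figures are the ones measured in this thread, not values confirmed by the JRC; the helper names and the 80 % safety margin are assumptions made for the example.

```python
import time

# Timeouts as measured by the data provider in this thread (not confirmed by JRC).
TIMEOUT_S = {"GetMap": 5.0, "GetCapabilities": 10.0}

def timed_call(fetch):
    """Run fetch() (any callable that performs the request) and return elapsed seconds."""
    start = time.monotonic()
    fetch()
    return time.monotonic() - start

def within_budget(operation, elapsed_s, margin=0.8):
    """True if elapsed_s stays under margin * the timeout assumed for the operation."""
    return elapsed_s <= TIMEOUT_S[operation] * margin
```

For example, `within_budget("GetMap", timed_call(fetch_map))` would flag a GetMap request that takes longer than 4 seconds, where `fetch_map` is whatever function issues the request in your own setup.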

Worth noting: we observed that during a harvest our GeoServer, which serves almost all the INSPIRE layers for LUX, receives more than 500 requests in a few minutes. One could argue that this creates unusual conditions of use, whereas the 5 and 10 second timeouts are meant for normal conditions of use. So it is arguable that, for centralized INSPIRE infrastructures, the harvester loads the visualisation service too heavily to assess its accessibility fairly.
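To put a number on "more than 500 requests in a few minutes", one could compute the peak request count in a sliding time window from the access log timestamps. The sketch below is only an illustration; the function name and the default 5-minute window are assumptions.

```python
from datetime import datetime, timedelta

def max_requests_in_window(timestamps, window=timedelta(minutes=5)):
    """Return the largest number of requests seen in any time span of length `window`."""
    ts = sorted(timestamps)
    best = 0
    left = 0
    for right, t in enumerate(ts):
        # Shrink the window from the left until it spans at most `window`.
        while t - ts[left] > window:
            left += 1
        best = max(best, right - left + 1)
    return best
```

Comparing this peak during a harvest with the peak on an ordinary day gives a simple measure of how unusual the harvest load is.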

While we are investing to meet the current requirements, we would like to know your opinion on the topic. Do you agree? Have you received similar feedback from other countries?

Best regards,

Mathieu

jescriu commented 2 years ago

Dear @MatAlps, Regarding your post+question quoted below:

> We are checking our logs and did indeed find a lot of timeouts: NGINX status 499 "Client closed connection". We also see that the calls from the validator are unique most of the time (very good).

> Those 499s happen when the client (the validator in this case) closes the connection. At the moment we cannot log the request time on our side; we are working on it. In the meantime, it would help us a lot if you could tell us what timeout value (in ms) the validator uses before closing the connection. Do you have this information readily available?

Just to double-check: are you referring to the INSPIRE Reference Validator, and not to the INSPIRE Geoportal? Could you confirm? I am not sure whether we are mixing two different topics here.

Regarding your last post, quoted below:

> In the meantime, we have measured a timeout of 5 seconds for GetMap and 10 seconds for GetCapabilities. We are working on our side to provide response times below those values during a harvest.

> Worth noting: we observed that during a harvest our GeoServer, which serves almost all the INSPIRE layers for LUX, receives more than 500 requests in a few minutes. One could argue that this creates unusual conditions of use, whereas the 5 and 10 second timeouts are meant for normal conditions of use. So it is arguable that, for centralized INSPIRE infrastructures, the harvester loads the visualisation service too heavily to assess its accessibility fairly.

> While we are investing to meet the current requirements, we would like to know your opinion on the topic. Do you agree? Have you received similar feedback from other countries?

I agree that this is worth a discussion, since it is not the aim of the INSPIRE Geoportal to diminish the performance of data providers' services. I will bring this topic up for internal discussion as soon as possible, also taking into account the established quality requirements for INSPIRE services.

At the moment, we have not received similar feedback from other MS, but it is worth investigating whether these timeouts could be causing a decrease in the downloadable and viewable datasets indicators.

Could you share with us the logs of your services?

Best, Jordi

MatAlps commented 2 years ago

Hi Jordi,

Thx for your answer and the intention of bringing this topic for internal discussion.

I am talking about what happens when we launch a harvest from the harvest console of the Luxembourg INSPIRE catalog. I recognize that I did not make that clear.

Please find the logs we have for a harvest done on 21/04: 79 logs 21-04-2022 harvest LUX.xlsx

Best regards,

Mathieu

jescriu commented 2 years ago

Dear @MatAlps, Thank you for the clarification. So you are referring only to the INSPIRE Geoportal harvesting process. I will share the logs with our INSPIRE Geoportal IT team for evaluation. All the best.

jescriu commented 2 years ago

Dear @MatAlps,

Thank you for your patience.

Our INSPIRE Geoportal IT team recently carried out some quality checks and improvements in the deployment of the INSPIRE Geoportal, which is available online to all Member States and EFTA countries through the harvest console.

A new harvest of the Luxembourg discovery service endpoint was executed on 5th May, providing better results - See below: image

As part of the analysis made for the improvements highlighted above, we considered whether the current processes the INSPIRE Geoportal uses to check the viewability and downloadability of datasets could be hindering the availability of the very services being checked. In the end, this did not seem to be the issue this time.

However, thanks to your valuable analysis, we identified aspects of these processes that can be improved. We will certainly take them into account in the future, as part of the improvement of the new INSPIRE Geoportal backend based on GeoNetwork, which will be made available soon.

All the best.