fgpv-vpgf / rcs

RAMP Configuration Service
http://fgpv-vpgf.github.io/rcs

Registration Failure #100

Closed jvanulde closed 6 years ago

jvanulde commented 6 years ago

Trying to register:

{
  "en": {
    "service_url": "http://geoappext.nrcan.gc.ca/arcgis/rest/services/FGP/remote_communities_2017/MapServer",
    "service_type": "esriMapServer",
    "metadata": {
        "metadata_url": "https://gcgeo.gc.ca/geonetwork/srv/eng/xml.metadata.get?uuid=ac6bac30-fe78-4444-8698-324e01788598",
        "catalogue_url": "https://gcgeo.gc.ca/geonetwork/metadata/eng/ac6bac30-fe78-4444-8698-324e01788598"
    },
    "service_name": "Remote communities by main power source",
    "scrape_only": [0],
    "recursive": true
  },
  "fr": {
    "service_url": "http://geoappext.nrcan.gc.ca/arcgis/rest/services/FGP/remote_community_fr_2017/MapServer",
    "service_type": "esriMapServer",
    "metadata": {
        "metadata_url": "https://gcgeo.gc.ca/geonetwork/srv/fre/xml.metadata.get?uuid=ac6bac30-fe78-4444-8698-324e01788598",
        "catalogue_url": "https://gcgeo.gc.ca/geonetwork/metadata/fre/ac6bac30-fe78-4444-8698-324e01788598"
    },
    "service_name": "Icônes pour les collectivités par type de carburant",
    "scrape_only": [0],
    "recursive": true
  },
  "version": "2.0"
}

Get this error:

{"msg": "Error: Metadata URL: \"https://gcgeo.gc.ca/geonetwork/srv/eng/xml.metadata.get?uuid=ac6bac30-fe78-4444-8698-324e01788598\" could not be retrieved: expected \"['application/xml', 'text/xml']\", got \"text/html;charset=UTF-8\""}

put failed 500

james-rae commented 6 years ago

If I browse to that metadata link I get redirected to a login page. My guess is that RCS is receiving the HTML content of that page when it is expecting metadata XML.

Unless anyone has alternatives to suggest, I believe an endpoint that does not redirect would be required.
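To illustrate the mismatch reported in the error message, here is a minimal Python sketch of a content-type check. It is not taken from the RCS code base: the helper names `media_type` and `is_xml_response` are hypothetical, and the only details borrowed from this thread are the two expected XML types and the `text/html;charset=UTF-8` header that came back.

```python
# Hypothetical helpers, not the actual RCS implementation.
XML_TYPES = ("application/xml", "text/xml")  # expected types quoted in the error above


def media_type(content_type_header):
    """Return the bare media type, dropping parameters such as charset."""
    return content_type_header.split(";", 1)[0].strip().lower()


def is_xml_response(content_type_header):
    """True when the Content-Type header names one of the expected XML types."""
    return media_type(content_type_header) in XML_TYPES


print(is_xml_response("text/html;charset=UTF-8"))         # header from the error: False
print(is_xml_response("application/xml; charset=UTF-8"))  # True
```

A login page served as HTML fails this kind of check even though the HTTP request itself succeeds, which matches the symptom described above.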

jvanulde commented 6 years ago

I suspect it's because the metadata record hasn't been published yet, so when RCS does the check it can't reach it, hence the catalogue redirecting to the login page. Just out of curiosity, why is RCS checking the resource in the first place?

jvanulde commented 6 years ago

This one fails too even though the record is public:

{
"en": {
    "service_url": "https://geoappext.nrcan.gc.ca/arcgis/rest/services/FGP/remote_communities_2017/MapServer",
    "service_type": "esriMapServer",
    "metadata": {
        "metadata_url": "https://dev.gcgeo.gc.ca:443/geonetwork/srv/eng/xml.metadata.get?uuid=0dfd4e0f-30aa-412b-b04c-73fe43ecbfc2",
        "catalogue_url": "https://dev.gcgeo.gc.ca:443/geonetwork/metadata/eng/0dfd4e0f-30aa-412b-b04c-73fe43ecbfc2"
    },
    "service_name": "Remote communities by main power source",
    "scrape_only": [0],
    "recursive": true
},
"fr": {
    "service_url": "https://geoappext.nrcan.gc.ca/arcgis/rest/services/FGP/remote_community_fr_2017/MapServer",
    "service_type": "esriMapServer",
    "metadata": {
        "metadata_url": "https://dev.gcgeo.gc.ca:443/geonetwork/srv/fre/xml.metadata.get?uuid=0dfd4e0f-30aa-412b-b04c-73fe43ecbfc2",
        "catalogue_url": "https://dev.gcgeo.gc.ca:443/geonetwork/metadata/fre/0dfd4e0f-30aa-412b-b04c-73fe43ecbfc2"
    },
    "service_name": "Icônes pour les collectivités par type de carburant",
    "scrape_only": [0],
    "recursive": true
},
"version": "2.0"
}

Test it here: http://160.106.128.92/static/test.html

Result:

{
    "msg": "Metadata URL: \"https://dev.gcgeo.gc.ca:443/geonetwork/srv/eng/xml.metadata.get?uuid=0dfd4e0f-30aa-412b-b04c-73fe43ecbfc2\" request failed with error \"HTTPSConnectionPool(host='dev.gcgeo.gc.ca', port=443): Max retries exceeded with url: /geonetwork/srv/eng/xml.metadata.get?uuid=0dfd4e0f-30aa-412b-b04c-73fe43ecbfc2 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',))\""
}

2018-07-26T14:33:57.415Z put failed 400

james-rae commented 6 years ago

> Just out of curiosity, why is RCS checking the resource in the first place?

I wasn't involved when the check was added, but doing a quick look at the code, it appears to be a data integrity process. If we allowed invalid links to be registered, the map viewer would then error when someone attempts to open the metadata panel. The bug report would come to us. My take on it, anyway.
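As a rough illustration of that kind of registration-time integrity check, the sketch below walks the metadata links of a payload shaped like the ones posted in this issue and collects any failures. It is an assumption-laden sketch, not the RCS code: `metadata_urls`, `validate_registration`, and the caller-supplied `check_url` function are hypothetical names, and whether RCS checks both `metadata_url` and `catalogue_url` is also assumed here.

```python
# Hypothetical sketch of a registration-time link check; not the RCS source.
def metadata_urls(payload):
    """Yield (language, key, url) for each metadata link in a registration payload."""
    for lang, entry in payload.items():
        if not isinstance(entry, dict):
            continue  # skips top-level scalars such as "version"
        for key, url in entry.get("metadata", {}).items():
            yield lang, key, url


def validate_registration(payload, check_url):
    """Return a list of error strings; an empty list means every link passed.

    check_url is a caller-supplied function that returns None on success
    or a short error message on failure.
    """
    errors = []
    for lang, key, url in metadata_urls(payload):
        problem = check_url(url)
        if problem is not None:
            errors.append(f"{lang}.{key}: {problem}")
    return errors


# Usage with a stand-in checker that rejects every link (illustration only).
sample = {
    "en": {"metadata": {"metadata_url": "https://gcgeo.gc.ca/geonetwork/srv/eng/xml.metadata.get?uuid=ac6bac30-fe78-4444-8698-324e01788598"}},
    "version": "2.0",
}
print(validate_registration(sample, lambda url: "got text/html"))
```

Rejecting the registration when the returned list is non-empty keeps unusable links out of the cache, at the cost of refusing records whose metadata is not yet publicly reachable.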

My associate Barry is taking a look at the error message you posted above...

jvanulde commented 6 years ago

Makes sense @james-rae. However, it's kinda weird that viewer business logic/constraints are managed in RCS.

barryytm commented 6 years ago

There was an error when requesting the catalogue URL. The response in Python differs from the one the browser gets. I suspect there was some kind of redirection going on.

Here is the error response from the catalogue url and it seems to be an error handling page (It was returned as text in HTML but I've rendered it).

[Screenshot: rendered error page returned by the catalogue URL]

The home page links are /geonetwork/home/eng and /geonetwork/home/fre. Do you recognize this error page, @jvanulde?
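One way to test the redirection theory from a script is to compare the URL that was requested with the URL the response finally came from. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, the URLs in the usage line are placeholders, and a plain string comparison is assumed to be good enough to spot a redirect.

```python
import urllib.request


def describe_response(requested_url, final_url, content_type):
    """Summarise whether a fetch was redirected and which media type came back."""
    media = content_type.split(";", 1)[0].strip().lower()
    return {
        "redirected": requested_url != final_url,
        "final_url": final_url,
        "media_type": media,
    }


def fetch_and_describe(url, timeout=10):
    """Fetch url with urllib, which follows redirects, and summarise the result."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return describe_response(
            url, response.geturl(), response.headers.get("Content-Type", "")
        )


# Offline illustration with placeholder URLs (no network access needed).
print(describe_response(
    "https://catalogue.example/metadata/eng/some-uuid",
    "https://catalogue.example/login",
    "text/html;charset=UTF-8",
))
```

Calling `fetch_and_describe` on the catalogue URL from the RCS host would show whether the script lands on a different page than the browser does.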

jvanulde commented 6 years ago

I don't recognize that page. What I suspect is happening is that the RCS registration is occurring before the metadata record publication has completed. I have pinged the GeoNetwork team and they feel that this is a possibility. I will close if true...

However, I can access that resource with my browser, so I am not sure why Python is getting HTML.

mweech commented 6 years ago

@jvanulde RCS is tightly coupled with RAMP, as its sole purpose is to pre-cache various layer characteristics in order to speed up loading when users populate a cart of layers in the catalogue. If we did not do the pre-cache, there would be a much larger up-front loading time for users (especially in IE).

RCS is very much responsible for ensuring data integrity before RAMP tries to load the record. If we did not have this, the RCS database would slowly become corrupted with bad entries (that would have been caught by validation checks) and require hands-on updates to fix, OR require updates via the catalogue until they register correctly. The prevailing assumption here is that if a record is created with URLs for users to access Catalogue/Metadata records, then these should be available and not have restricted access. I would not agree that this is the same as implementing business logic for the viewer.

I believe there is an option to disable this strict registration mode, but then you must be ready to assume maintenance of the RCS database as there is no longer a guarantee these records will be usable when loaded in RAMP and may require lots of registration updates, or directly modifying the database to correct issues.

TBH, the issue here is that the business process has changed: RCS/Catalogue/Viewer are being used to register and visualize layers that are unpublished/draft and hosted on password-protected systems, which was not the intention at the time these components were developed. Also, at the point when these were developed, the Data Catalogue was not a stable or predictable product to attempt to work with, so this loose coupling was necessary to push ahead. If we were to develop this over again, perhaps the RCS would be more of an asynchronous component that monitors GeoNetwork for layers to be registered with a map layer and scrapes the data catalogue to populate the configuration cache. But hindsight is always 20/20.

mweech commented 6 years ago

Closing. Metadata URL gives an error page instead of a valid document. Error message is reporting accurately.