@Bobfrat can you take a look? This is all I see for obs in the portal:
@kknee @kkoch We're getting an error retrieving the CSW records from GeoNetwork using the Python OWSLib library. Has anything changed recently with the GeoNetwork instance? I'll try to dig in from my end.
File "/Users/bobfratantonio/Documents/Dev/virtenvs/data-catalog/lib/python2.7/site-packages/owslib/csw.py", line 399, in getrecords2
self._parserecords(outputschema, esn)
File "/Users/bobfratantonio/Documents/Dev/virtenvs/data-catalog/lib/python2.7/site-packages/owslib/csw.py", line 549, in _parserecords
self.records[identifier] = MD_Metadata(i)
File "/Users/bobfratantonio/Documents/Dev/virtenvs/data-catalog/lib/python2.7/site-packages/owslib/iso.py", line 146, in __init__
self.contentinfo.append(MD_FeatureCatalogueDescription(contentinfo))
File "/Users/bobfratantonio/Documents/Dev/virtenvs/data-catalog/lib/python2.7/site-packages/owslib/iso.py", line 961, in __init__
val = i.attrib['uuidref']
File "src/lxml/lxml.etree.pyx", line 2467, in lxml.etree._Attrib.__getitem__ (src/lxml/lxml.etree.c:70664)
KeyError: 'uuidref'
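For context, the traceback above comes from an OWSLib GetRecords2 call against the catalog. A minimal sketch that exercises the same code path is below; the endpoint is from this thread, but the query parameters (esn, outputschema, maxrecords) are assumptions:

```python
# Minimal sketch of the harvest call that hits the KeyError above.
# Endpoint is from this thread; esn/outputschema/maxrecords are assumed.
from owslib.csw import CatalogueServiceWeb

CSW_URL = "http://data.glos.us/metadata/srv/eng/csw"

csw = CatalogueServiceWeb(CSW_URL, timeout=60)
# Requesting full ISO 19139 records; owslib.iso.MD_FeatureCatalogueDescription
# raises KeyError('uuidref') when a record's featureCatalogueCitation
# element has no uuidref attribute.
csw.getrecords2(esn="full",
                outputschema="http://www.isotc211.org/2005/gmd",
                startposition=1,
                maxrecords=500)
```

On the client side, the immediate workaround would be to treat uuidref as optional in that OWSLib code path (i.attrib.get('uuidref') instead of i.attrib['uuidref']), but that only masks whichever record changed on the server.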
OK, after some investigation it appears that this error has been occurring since 2017-12-21, which was the day the system went down for maintenance.
The dev version was hanging onto the TOC entries because it doesn't remove entries even when new hourly catalogs drop records. On a restart you'd see them disappear, which is now the case on dev.
The GeoNetwork instance is up and running AFAIK: http://data.glos.us/metadata/srv/eng/main.home?
Yes, GN is running and has been. The only change that has been made was this week to the help.xml, and I restarted it a few times to have some XSL changes take effect. But it's been running the whole time, and there was definitely nothing new around the maintenance downtime.
Is there an intermediary process that sits between GN and the portal and might not have been restarted? I noticed that VMs named "myglos3" and "web4" are currently turned off; would one of those be hosting a translation or similar service?
I'm not sure about any intermediary services, but I am able to successfully query GN for the ISO records; when I ask for more than 270 records in a single request, though, the response fails. There must be some error messages in the logs. @tslawecki are you able to take a quick look at the logs? I'm not sure where to look.
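Not from the original thread, but one way to narrow down which record trips the parser is to page through the catalog in small batches and note which window fails. A rough sketch using the same OWSLib client as above (batch size and error handling are assumptions):

```python
# Rough sketch: page through the catalog in small batches to find the
# window, and then the record, whose ISO parsing raises the KeyError.
from owslib.csw import CatalogueServiceWeb

CSW_URL = "http://data.glos.us/metadata/srv/eng/csw"
ISO_SCHEMA = "http://www.isotc211.org/2005/gmd"
BATCH = 10

csw = CatalogueServiceWeb(CSW_URL, timeout=60)
start = 1
while True:
    try:
        csw.getrecords2(esn="full", outputschema=ISO_SCHEMA,
                        startposition=start, maxrecords=BATCH)
    except Exception as exc:
        # This window failed to parse; report it and move on.
        print("parse failure in records %d-%d: %r" % (start, start + BATCH - 1, exc))
        start += BATCH
        continue
    nextrecord = csw.results.get("nextrecord", 0)
    if not nextrecord:  # 0 means the server has no more records
        break
    start = nextrecord
```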
@Bobfrat can you post the query string you are using? I'm also going to be checking the GN logs to see if anything funky is showing up there.
Yeah it's a POST request to http://data.glos.us/metadata/srv/eng/csw with the following body content (application/xml):
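(The exact body isn't reproduced above. Purely as an illustrative stand-in, not the original request, a typical GetRecords payload for a full-ISO harvest looks roughly like the sketch below; all values here are assumptions.)

```python
# Illustrative only: a representative CSW GetRecords POST,
# not the original request body from this comment.
import requests

GETRECORDS_BODY = """<?xml version="1.0" encoding="UTF-8"?>
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    service="CSW" version="2.0.2" resultType="results"
    outputSchema="http://www.isotc211.org/2005/gmd"
    startPosition="1" maxRecords="500">
  <csw:Query typeNames="csw:Record">
    <csw:ElementSetName>full</csw:ElementSetName>
  </csw:Query>
</csw:GetRecords>"""

resp = requests.post("http://data.glos.us/metadata/srv/eng/csw",
                     data=GETRECORDS_BODY,
                     headers={"Content-Type": "application/xml"})
print(resp.status_code, len(resp.content))
```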
There is a place in GN to test CSW calls (http://data.glos.us/metadata/srv/eng/test.csw), and it sure looks like I can pull up all the records; it shows a count of 866, which is the right number.
To troubleshoot, I plugged in your POST request above and it didn't work. So I then started putting in the parameters that differed between yours and the sample they provided. I don't know enough about CSW, but the issue seems to be:
The issue does not appear to be GeoNetwork related (when I checked). It arises when we attempt to fetch a resource that the CSW records point to and that resource returns an error. For example, this record, which gives a 500 error when trying to issue a GetMap request, caused problems: http://tds.glos.us/thredds/wms/SM/LakeMichiganSM-Agg?request=getMap . Trying to grab this file through OPeNDAP indicates there's some kind of file-size truncation issue going on. Regardless, the sane behavior would be to log an error rather than bombing out, so I'm updating some code related to this and hope to have a fix shortly.
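The actual fix isn't shown in the thread; the general shape of a "log and skip" change like the one described would be roughly the following (function and variable names here are hypothetical, not the portal's real code):

```python
# Hypothetical sketch of the "log instead of bombing out" change:
# wrap per-record processing so one bad WMS/OPeNDAP resource can't
# take down the whole harvest.
import logging

log = logging.getLogger("catalog-harvest")

def harvest_records(records, process_record):
    """Process each CSW record, logging failures instead of raising."""
    good, bad = [], []
    for identifier, record in records.items():
        try:
            good.append(process_record(record))
        except Exception:
            log.exception("skipping record %s: resource check failed", identifier)
            bad.append(identifier)
    return good, bad
```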
I also looked at the GN logs. Around December 10th we started getting some errors (excerpt below).
It also started warning about several metadata records, saying "Metadata not found or invalid schema". These are very old (pre-me) records and look to be in FGDC rather than ISO format. It's somewhat odd that they all of a sudden started being an issue.
However, I'm pretty sure the portal was working even after those errors, so I'm not sure whether these are related.
2017-12-10 23:29:41,720 ERROR [geonetwork.search] - Errors occurred when trying to parse a filter:
2017-12-10 23:29:41,720 ERROR [geonetwork.search] - ----------------------------------------------
2017-12-10 23:29:41,720 ERROR [geonetwork.search] - org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 54; cvc-complex-type.2.4.b: The content of element 'ogc:Filter' is not complete. One of '{"http://www.opengis.net/ogc":spatialOps, "http://www.opengis.net/ogc":comparisonOps, "http://www.opengis.net/ogc":logicOps, "http://www.opengis.net/ogc":_Id}' is expected.
2017-12-10 23:29:41,720 ERROR [geonetwork.search] - ----------------------------------------------
2017-12-10 23:29:47,555 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 216
2017-12-10 23:29:52,738 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 97
2017-12-10 23:29:53,050 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 141
2017-12-10 23:29:53,272 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 168
2017-12-10 23:29:53,538 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 205
2017-12-10 23:29:53,982 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 25
2017-12-10 23:29:54,141 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 38
2017-12-10 23:29:54,150 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 34
2017-12-10 23:29:54,270 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 45
2017-12-10 23:29:54,278 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 51
2017-12-10 23:30:00,323 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 149
2017-12-10 23:30:00,376 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 210
2017-12-10 23:30:08,106 WARN [jeeves.webapp.csw] - SearchController : Metadata not found or invalid schema : 209
2017-12-10 23:30:08,602 INFO [jeeves.service] - -> dispatching to output for : csw
2017-12-10 23:30:08,602 INFO [jeeves.service] - -> writing xml for : csw
2017-12-10 23:30:11,154 INFO [jeeves.service] - -> output ended for : csw
2017-12-10 23:30:11,154 INFO [jeeves.service] - -> dispatch ended for : csw
Also, I checked and the last update I did in GN was 12/22 at 2:56pm. That would have been right before I headed out for the holiday, so I did not verify whether it showed up on production.
Is there a possibility that, with the migration of the servers at about that day/time, a permission or ownership setting got changed? I'm seeing no problems within GN itself (knock on wood) at this point in time.
@Bobfrat should have pushed some changes to handle more error cases.
There were also a couple of bad files for http://tds.glos.us/thredds/dodsC/SM/LakeMichiganSM-Agg.html that were causing the aggregation and associated data access endpoints to not function properly.
Since they were corrupting the aggregation, I moved them out of the THREDDS data directory to the folder /root/MTRI-SM/michigan.
You may want to replace them with non-corrupted versions.
These files, for the Lake Michigan suspended minerals product, are listed below (a quick integrity check for replacement files is sketched after the list):
4326_201705051900.nc
4326_201705071850.nc
4326_201707311910.nc
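Before replacement files go back into the THREDDS data directory, a quick check along these lines can confirm they open and read cleanly. This is a sketch assuming the netCDF4 Python package is available; the paths are illustrative:

```python
# Minimal integrity check for the replacement NetCDF files before they
# go back into the THREDDS data directory. A truncated or corrupted
# file typically fails to open or to read its variable metadata.
from netCDF4 import Dataset

FILES = [
    "4326_201705051900.nc",
    "4326_201705071850.nc",
    "4326_201707311910.nc",
]

for path in FILES:
    try:
        with Dataset(path) as nc:
            # Touch every variable's shape to force the metadata read.
            shapes = {name: var.shape for name, var in nc.variables.items()}
        print("%s OK: %d variables" % (path, len(shapes)))
    except (IOError, OSError, RuntimeError) as exc:
        print("%s FAILED: %r" % (path, exc))
```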
Closing this issue and opening a new one for fixing the data corruption issue.
Noticed at the DMAC meeting that the portal is not displaying obs. Kelly checked and the obs cache was empty. Greg and Cheryl verified that this was due to the partition on Michigan being at 100% capacity. They cleared off some files and got it back to 85%, and obs were then showing data (verified by looking at the data on dev).
HOWEVER, @kknee @gcutrell @cheryldmorse I waited a while, since I figured our normal latency might delay things, and I still have not seen the buoys/stations showing up on the production portal as of 6pm (even though, as mentioned above, they are showing on dev). Does something need to be rebooted, or do we have another issue?