lukecampbell closed this issue 6 years ago
@lukecampbell The harvest reports on the Service Monitor are looking better. I like the color coding. Nice job.
I don't think I completely understand the harvesting workflow you set up in ioos/service-monitor#460, and that's OK, but looking this morning I noticed that each of the NDBC stations now reports the same error (one example). For instance, here and here the harvest results seem identical.
The harvest should ideally be doing different things for different stations within the same service (in the case of i52N anyway, maybe not ncSOS because obviously there's only one 'station' in each ncSOS service). The GetCapabilities request will be the same, but the DescribeSensor will be different (depending on the station ID).
Is this happening currently, or does each SOS 'harvest' try to harvest each of the stations listed in the GetCaps response and then report on all stations harvested in the results page (no matter which station you click in the services list)?
I'm guessing changing this would be a big refactor if I'm right about how it works. We probably want to avoid that, but maybe there's room for some small improvements...
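To make the request pattern described above concrete, here is a minimal sketch of the URLs a harvest of one service would issue: a single GetCapabilities request, then one DescribeSensor request per station. The endpoint, the station URNs, the function names, and the `outputFormat` value are all assumptions for illustration; this is not the Service Monitor's actual harvester code.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and station IDs, for illustration only.
BASE_URL = "https://sos.example.org/sos"
STATIONS = [
    "urn:ioos:station:example:a",
    "urn:ioos:station:example:b",
]


def get_capabilities_url(base_url):
    """Build the GetCapabilities request, which is the same for every station."""
    params = {"service": "SOS", "request": "GetCapabilities"}
    return f"{base_url}?{urlencode(params)}"


def describe_sensor_url(base_url, station_urn):
    """Build a DescribeSensor request, which differs only by station ID."""
    params = {
        "service": "SOS",
        "version": "1.0.0",
        "request": "DescribeSensor",
        "procedure": station_urn,
        # Assumed output format; a given provider may require a different value.
        "outputFormat": 'text/xml;subtype="sensorML/1.0.1"',
    }
    return f"{base_url}?{urlencode(params)}"


def harvest_urls(base_url, station_urns):
    """Return every request for one service: 1 GetCapabilities + N DescribeSensor."""
    urls = [get_capabilities_url(base_url)]
    urls.extend(describe_sensor_url(base_url, urn) for urn in station_urns)
    return urls


if __name__ == "__main__":
    for url in harvest_urls(BASE_URL, STATIONS):
        print(url)
```

Under these assumptions the number of requests grows linearly with the number of stations listed in the GetCapabilities response, regardless of which station a user later clicks in the services list.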
> Is this happening currently, or does each SOS 'harvest' try to harvest each of the stations listed in the GetCaps response and then report on all stations harvested in the results page (no matter which station you click in the services list)?
Yes, that's how it works: it gets each station in the GetCaps for a service. A service used to be defined as a unique URL, but we've changed that so that you see one service for each resource defined in CKAN.
It would certainly be a massive undertaking to treat each service as 1:1 with each station, and it would likely not succeed, since it depends on the naming conventions of providers, which are reliably inconsistent.
Imagine MARACOOS publishes a fictitious 52n service with three stations: Station Alpha, Station Bravo, and Station Charlie.
In 52n they are identified as urn:ioos:station:maracoos:alpha, etc. But in each ISO record the title of the station is "Station Alpha in the Mid Atlantic". Even if there were a place in ISO to stick the URN identifier, I can't imagine it would be very reliable to depend on, based on the wide variety of ISOs I see and how they are generated.
What I can tell from that ISO record is that there is a 52n service somewhere. And what I can tell from 52n is that there are a bunch of stations. It's hard for me to associate a particular station in a GetCapabilities response from 52n with a particular ISO document, and therefore with the service of an ISO document.
I think we would need to redefine services, harvests, and datasets to make what you describe a reality. And starting from near scratch is probably easier than refactoring the current service-monitor. It has a lot of components that are really hard for me to refactor, like its dependency on paegan and dogma.
Yeah, I thought it would involve a lot of changes. Just wanted to confirm. I don't think it's worth rewriting everything; most of the harvesting results look a lot better right now, with just a few exceptions: CO-OPS and NDBC.
Ideally, we could find some way to have these SOS services harvest properly, but it looks like internal errors with a few of their stations are going to prevent that from ever happening, and the counts will always be '0 of XX'. I know from writing sensorml2iso that NDBC has some DescribeSensor requests that always fail, and I'm not sure what's going on with CO-OPS and that failure message.
Probably the solution would be to reach out to each of those providers to fix their services. Let's keep this open as a reminder, but are we ready to make an official 'release' of this updated Monitor? Or did you make a release already this week?
I could change the logic in the harvester to treat "partial" success as a 1 in terms of counting.
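As a rough sketch of what that counting change could look like: the code below assumes a harvest is "partial" when some but not all of a service's DescribeSensor requests succeed. The `StationResult` type, the status names, and the `count_partial` flag are hypothetical and are not taken from the service-monitor code base.

```python
from dataclasses import dataclass


@dataclass
class StationResult:
    """Outcome of one DescribeSensor request (hypothetical structure)."""
    station: str
    ok: bool


def service_status(results):
    """Classify one service's harvest as 'success', 'partial', or 'failure'."""
    if not results:
        return "failure"
    succeeded = sum(r.ok for r in results)
    if succeeded == len(results):
        return "success"
    if succeeded > 0:
        return "partial"
    return "failure"


def count_successful(services, count_partial=False):
    """Count services whose harvest is accepted as successful.

    With count_partial=True, a 'partial' harvest counts as 1 instead of 0.
    """
    accepted = {"success", "partial"} if count_partial else {"success"}
    return sum(service_status(results) in accepted for results in services.values())


if __name__ == "__main__":
    # Made-up results for illustration only.
    services = {
        "service-a": [StationResult("s1", True), StationResult("s2", False)],
        "service-b": [StationResult("s3", True)],
    }
    print(count_successful(services))
    print(count_successful(services, count_partial=True))
```

The trade-off is that a count relaxed this way no longer signals that some stations in a service are failing, so the underlying provider errors become less visible.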
No, that's OK. We should really get the underlying issue resolved with the provider. What about the release question, did we do an official 'release' this week?
If not, can you tag a new release on GitHub? That way I can say we met the May 5 deadline to update SM. Thanks.
I see this: https://github.com/ioos/service-monitor/releases/tag/3.3.1, but there were other changes you made subsequent to that, right? Can we roll those into a new release just to mark that milestone as done?
There was an uncaught exception in some code paths for the service monitor harvesters, addressed here: https://github.com/ioos/service-monitor/pull/464
This was causing the harvesters to crash in some instances. It is now fixed, but I'll need to review this issue some more prior to addressing the problems here.
@mwengren, what do we want to do with this issue following the discussion on yesterday's call?
Closing this issue as we're not going to make these changes at this point.
Related Service Monitor discussion issue: https://github.com/ioos/catalog/issues/60.
Right now the service monitor shows well over 80% of SOS service endpoints as down, offline, or in error.
Some services are legitimately down, but I think we are creating a denial of service situation by asking for individual DescribeSensor requests for every offering.