Closed mwengren closed 1 year ago
What if we put a block in the code? Say, only query one random ERDDAP for testing? Adjust the following portion to return only one random server: https://github.com/oceanhackweek/ohw19-project-co_locators/blob/f2d0142f20b91ceee51b8a7453bc78fa1d504a29/colocate/run.py#L26-L43
@ocefpaf recommended some advanced http requests during our presentation.
I took some notes from the questions/comments:
- Delay hits to the server so it doesn't get overwhelmed.
- Parse the actual meaning of all the HTTP errors: is it because the server is being hit too many times? (500: no data or possible server error; 400: unavailable or client issue; 403 Forbidden: bad news, they don't want us!)
- If data is repeated, don't fetch it again; return what we already have.
- erddapy might be erroneous?
- SUBMIT REQUESTS upstream to get things fixed.
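A minimal sketch of the "delay hits" note, assuming a simple fixed minimum interval between requests is enough. The `Throttle` class and its parameters are hypothetical and not part of the project or of erddapy; the clock and sleep functions are injectable only so the behaviour can be checked without actually waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval (in seconds) between consecutive calls.

    Hypothetical helper for illustration; not part of colocate or erddapy.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` immediately before each request to a given server would space the hits out; one `Throttle` per server keeps a slow server from delaying the others.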
Pretty much what @mollyjames mentioned above. Here are some links to help:
- `retrying` technique for the transient errors.
- `async` execution before diving into parallelism.
- `lru_cache` to avoid redundant hits.

PS0: I'm not an `async` expert, but I believe that, once it is implemented, it can be expanded to full parallelism.

PS1: Some of those errors may be an erddapy bad URL; if you find any, please raise them upstream!
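The retry and caching suggestions could look roughly like the sketch below. It uses only the standard library: the `retry` decorator is a hand-rolled stand-in for the `retrying` package, not that package's API, and `TransientError`, `make_cached_fetcher`, and the `fetch` callable are hypothetical names introduced for illustration.

```python
import functools
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a timeout)."""


def retry(attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry a function on TransientError with exponential backoff."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except TransientError:
                    if attempt == attempts - 1:
                        raise  # out of attempts, surface the error
                    # Wait base_delay, then 2x, 4x, ... before trying again.
                    sleep(base_delay * 2 ** attempt)

        return wrapper

    return decorator


def make_cached_fetcher(fetch, maxsize=128):
    """Wrap fetch(url) so a repeated URL is answered from memory.

    lru_cache only stores successful return values, so a call that
    raises is attempted again the next time it is requested.
    """
    return functools.lru_cache(maxsize=maxsize)(fetch)
```

Wrapping the retried function in the cache, rather than the other way around, means a URL that eventually succeeds is never requested from the server again during the same run.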
Thanks everyone for the input!
I ran a few more tests this weekend and noticed some slowness from a few of the ERDDAPs, in particular the primary Coastwatch one run by Bob. My first thought was that a rate limit had been set or something. I reached out to Bob to ask if he had had to block our user-agent, but he said he hadn't, although usage was somewhat high last week.
Anyway, I have a PR that helps this a little that I'll probably merge if no one opposes. It simplifies the query code a bit for the feature extraction, and also drops the Coastwatch ERDDAP from the list for testing.
Picking a few servers at random might help as well, but it would need to be more than just one or we would never get any results.
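A minimal sketch of picking a few servers at random for a test run. The `SERVERS` list here is illustrative only: the first two URLs are the ones mentioned in this thread, the third is a placeholder, and the project's real list lives in `colocate/run.py`. `pick_servers` is a hypothetical helper.

```python
import random

# Illustrative list only; the real list is defined in colocate/run.py.
SERVERS = [
    "https://data.ioos.us/gliders/erddap",
    "http://erddap.sensors.ioos.us/erddap",
    "https://example.org/erddap",  # placeholder, not a real ERDDAP
]


def pick_servers(servers, k=2, seed=None):
    """Return up to k distinct servers chosen at random.

    Passing a seed makes the choice repeatable, which helps when
    comparing two test runs.
    """
    rng = random.Random(seed)
    return rng.sample(servers, min(k, len(servers)))
```

Keeping `k` above one addresses the concern that a single random server would often return no results at all.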
One thing to note about the 'demo'. It wasn't really clear while we were up there, but a lot of the HTTP status code 500s and 400s were expected responses. This is what ERDDAP returns if it doesn't find any data based on our query params. We should have dropped the logging for the demo but didn't quite get to it.
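One way to keep expected "no data" responses out of the demo output is to log them at a lower level than real problems. The sketch below is an assumption-laden illustration: the set of codes treated as "no data" (400, 404, 500) is a guess based on the "500s and 400s" mentioned here, 429 is the standard "Too Many Requests" code and was not reported in this thread, and the logger name and function names are hypothetical.

```python
import logging

logger = logging.getLogger("colocate_sketch")  # hypothetical logger name

# Assumption: codes treated as "the query matched no data" for logging.
EXPECTED_NO_DATA = frozenset({400, 404, 500})


def log_level_for(status):
    """Map an HTTP status code to a logging level."""
    if status == 403:
        return logging.ERROR  # Forbidden: the server may be blocking us
    if status == 429:
        return logging.WARNING  # Too Many Requests: slow down
    if status in EXPECTED_NO_DATA:
        return logging.DEBUG  # expected when nothing matches the query
    if 200 <= status < 300:
        return logging.INFO
    return logging.WARNING  # anything else deserves a look


def log_response(url, status):
    """Log one response at the level chosen above and return that level."""
    level = log_level_for(status)
    logger.log(level, "%s -> HTTP %s", url, status)
    return level
```

With the logger set to `INFO` for a demo, the expected no-data responses disappear from the output while a 403 still stands out.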
The bigger issue in terms of server load is grabbing the actual points in get_coordinates.
I think @ocefpaf found that when he switched from:
e = ERDDAP(server="https://data.ioos.us/gliders/erddap")
to:
e = ERDDAP(server="http://erddap.sensors.ioos.us/erddap")
that the demos went much smoother.
@kwilcox, do you want to chime in here?
Has Axiom done any server side enhancements/configuration to make ERDDAP perform well for these use cases?
Actually, https://data.ioos.us/gliders/erddap is faster than http://erddap.sensors.ioos.us/erddap now, but not for the reason we expected :smile:
https://nbviewer.jupyter.org/gist/ocefpaf/e7eaf0f4ea0e475b20b2fe935a880c00
Apparently that dataset no longer exists in that server :-/
@rsignell-usgs No special enhancements to ERDDAP. The "ERDDAP, Version 1.82_axiom-r1" is just ERDDAP 1.82 with some extra dataset classes and build process improvements that shouldn't affect performance. We had (and do have) issues with the Glider DAC ERDDAP server going down and being slow, so we harvest all of the data and serve it back up for some reliability. Sorry for the thread hijack!
> Apparently that dataset no longer exists in that server :-/

An update that is relevant for the project here but probably deserves a new issue: multiple servers have different `dataset_id`s for the same dataset. Because that is not unique, it will be hard to remove duplicated returns when searching multiple servers (the goal of this project!).
On a related note, here is the fixed notebook, with the `-` and `_` fixed:
https://nbviewer.jupyter.org/gist/ocefpaf/47d8465233de80c206826243ce8accb3
The response time is 3.54 s vs 8.33 s! That is quite a difference!!
Closing this issue. Will not fix/stale.
I think we'd be wise to change some of our ERDDAP query code from what it is now to go easier on the ERDDAPs out there before we do much more testing of this, now that the week is over. I thought I saw some signs of possibly agitated ERDDAP admins out there...
Please refrain from doing too many test runs of what we have now until we can make it a little 'friendlier'.