Ease the ERDDAP server querying ASAP

mwengren commented 5 years ago

I think we'd be wise to change some of our ERDDAP query code to go easier on the ERDDAPs out there before we do much more testing of this, now that the week is over. I thought I saw some signs of possibly agitated ERDDAP admins out there...

Please refrain from doing too many test runs of what we have now until we can make it a little 'friendlier'.

MathewBiddle commented 5 years ago

What if we put a block in the code, say, only querying a random ERDDAP for testing? We could adjust the following portion to return only one random server: https://github.com/oceanhackweek/ohw19-project-co_locators/blob/f2d0142f20b91ceee51b8a7453bc78fa1d504a29/colocate/run.py#L26-L43

MathewBiddle commented 5 years ago

@ocefpaf recommended some advanced http requests during our presentation.

mollyjames commented 5 years ago

I took some notes from the questions/comments:

- Delay hits to the server so it doesn't get overwhelmed.
- Parse the actual meaning of all the HTTP errors: is it because the server is being hit too many times? (500: no data or possible server error; 400: unavailable or client issue; 403 Forbidden: bad news, they don't want us.)
- If a request is repeated, don't fetch the data again; return what we already have.
- erddapy might be erroneous?
- SUBMIT REQUESTS upstream to get things fixed.
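The first two notes above could be sketched roughly as follows. This is only an illustration, not the project's code: `Throttle` and `describe_status` are hypothetical names, the one-second interval is an arbitrary choice, and the status interpretations simply mirror the notes (ERDDAP may also use these codes when a query matches no data).

```python
import time


def describe_status(status_code):
    """Return a rough, assumed interpretation of an HTTP status code."""
    if status_code == 403:
        return "forbidden: stop querying this server"
    if 400 <= status_code < 500:
        return "client-side issue or no matching data"
    if 500 <= status_code < 600:
        return "server error or no matching data"
    return "not an error"


class Throttle:
    """Enforce a minimum delay between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` right before each server request would space the hits out; a separate `Throttle` per server would avoid slowing down queries to unrelated hosts.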

ocefpaf commented 5 years ago

Pretty much what @mollyjames mentioned above. Here are some links to help:

Try some sort of retrying technique for the transient error.
Explore the possibility of async execution before diving into parallelism.
Re-factor all the server calls to a minimum function and use lru_cache to avoid redundant hits.
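A minimal sketch of the retry and `lru_cache` suggestions above, using only the standard library rather than erddapy. The function names, the set of status codes treated as transient, and the backoff values are all assumptions made for illustration.

```python
import time
from functools import lru_cache
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Assumption: only these gateway-style codes are worth retrying.
TRANSIENT_STATUS = {502, 503, 504}


def fetch_with_retry(url, opener=urlopen, retries=3, backoff=1.0, sleep=time.sleep):
    """Fetch a URL, retrying transient failures with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            with opener(url, timeout=30) as response:
                return response.read()
        except HTTPError as err:
            # Non-transient errors (e.g. 403, 404) are re-raised immediately.
            if err.code not in TRANSIENT_STATUS or attempt == retries:
                raise
        except URLError:
            # Connection-level failure; give up after the last attempt.
            if attempt == retries:
                raise
        sleep(backoff * 2 ** attempt)


# Identical calls are answered from memory instead of hitting the server again.
cached_fetch = lru_cache(maxsize=128)(fetch_with_retry)
```

Because `lru_cache` keys on the call arguments, only exactly repeated URLs are served from the cache, and cached results never expire during the process lifetime, which is probably acceptable for a short test run but not for long-lived services.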

PS0: I'm not an async expert, but I believe that, once it is implemented, it can be expanded to full parallelism.

PS1: Some of those errors may come from erddapy building a bad URL; if you find any, please raise them upstream!
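One possible shape for the async idea, sketched with the standard library only. `gather_limited` is a hypothetical helper, `fetch` stands for any blocking function that takes a URL, and the concurrency limit of 2 is an arbitrary, deliberately gentle default. It requires Python 3.9+ for `asyncio.to_thread`.

```python
import asyncio


async def gather_limited(fetch, urls, max_concurrent=2):
    """Run a blocking fetch(url) for each URL, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(url):
        async with semaphore:
            # Run the blocking call in a worker thread so the event loop stays free.
            return await asyncio.to_thread(fetch, url)

    # return_exceptions=True keeps one failing server from cancelling the rest;
    # failures come back as exception objects in the result list.
    return await asyncio.gather(*(fetch_one(u) for u in urls), return_exceptions=True)
```

Results come back in the same order as `urls`, so they can be zipped with the server list afterwards, for example via `asyncio.run(gather_limited(my_fetch, urls))`.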

mwengren commented 5 years ago

Thanks everyone for the input!

I ran a few more tests this weekend and noticed some slowness from a few of the ERDDAPs, in particular the primary Coastwatch one run by Bob. My first thought was that a rate limit or something similar had been set. I reached out to Bob to ask if he had had to block our user-agent, but he said he hadn't, although usage was somewhat high last week.

Anyway, I have a PR that helps this a little that I'll probably merge if no one opposes. It simplifies the query code a bit for the feature extraction, and also drops the Coastwatch ERDDAP from the list for testing.

Picking a few servers at random might help as well, but it would need to be more than just one or we would never get any results.
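Picking a handful of servers at random for a test run could look something like this. `pick_servers` is a hypothetical helper and the default of three servers is an arbitrary choice; the server list shown is just the two URLs mentioned in this thread, not the project's actual list.

```python
import random


def pick_servers(servers, k=3, seed=None):
    """Return up to k distinct servers chosen at random (hypothetical helper)."""
    servers = list(servers)
    rng = random.Random(seed)  # pass a seed to make a test run reproducible
    return rng.sample(servers, min(k, len(servers)))


# Example list for illustration only.
EXAMPLE_SERVERS = [
    "https://data.ioos.us/gliders/erddap",
    "http://erddap.sensors.ioos.us/erddap",
]
```

With `k` larger than the list, every server is returned in random order, so the helper is safe to use on short lists.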

One thing to note about the 'demo'. It wasn't really clear while we were up there, but a lot of the HTTP status code 500s and 400s were expected responses. This is what ERDDAP returns if it doesn't find any data based on our query params. We should have dropped the logging for the demo but didn't quite get to it.

The bigger issue in terms of server load is grabbing the actual points in get_coordinates.

rsignell-usgs commented 5 years ago

I think @ocefpaf found that when he switched from:

e = ERDDAP(server="https://data.ioos.us/gliders/erddap")

to:

e = ERDDAP(server="http://erddap.sensors.ioos.us/erddap")

that the demos went much smoother.

@kwilcox, do you want to chime in here?
Has Axiom done any server side enhancements/configuration to make ERDDAP perform well for these use cases?

ocefpaf commented 5 years ago

Actually, https://data.ioos.us/gliders/erddap is faster than http://erddap.sensors.ioos.us/erddap now, but not for the reason we expected :smile:

https://nbviewer.jupyter.org/gist/ocefpaf/e7eaf0f4ea0e475b20b2fe935a880c00

Apparently that dataset no longer exists in that server :-/

kwilcox commented 5 years ago

@rsignell-usgs No special enhancements to ERDDAP. The ERDDAP, Version 1.82_axiom-r1, is just ERDDAP 1.82 with some extra dataset classes and build process improvements that shouldn't affect performance. We had (and still have) issues with the Glider DAC ERDDAP server going down and being slow, so we harvest all of the data and serve it back up for some reliability. Sorry for the thread hijack!

ocefpaf commented 5 years ago

Apparently that dataset no longer exists in that server :-/

An update that is relevant for the project here but probably deserves a new issue: multiple servers have different dataset_ids for the same dataset. Because the ID is not unique, it will be hard to remove duplicated results when searching multiple servers (the goal of this project!).
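As a very rough sketch of one way to approach the de-duplication, under the strong assumption that the IDs differ only in case and in `-` versus `_` (other differences would need a better key, such as matching on dataset metadata). Both function names and any dataset IDs used with them are hypothetical.

```python
def normalize_dataset_id(dataset_id):
    """Build a comparison key; assumes IDs differ only by case and '-' vs '_'."""
    return dataset_id.strip().lower().replace("-", "_")


def drop_duplicates(results):
    """Keep the first (server, dataset_id) pair seen for each normalized ID."""
    seen = set()
    unique = []
    for server, dataset_id in results:
        key = normalize_dataset_id(dataset_id)
        if key not in seen:
            seen.add(key)
            unique.append((server, dataset_id))
    return unique
```

The first server in the input wins, so ordering the search results by preferred server would decide which copy is kept.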

On a related note, here is the fixed notebook, with the - and _ fixed:

https://nbviewer.jupyter.org/gist/ocefpaf/47d8465233de80c206826243ce8accb3

The response time is 3.54 s vs 8.33 s! That is quite a lot!!

mwengren commented 1 year ago

Closing this issue. Will not fix/stale.

ioos / colocate

Ease the ERDDAP server querying ASAP #5