lifewatch / etn-otn-exchange

European Tracking Network (ETN) and Ocean Tracking Network (OTN) data exchange issues

Grabbing ETN detections process #19

Closed diniangela closed 2 years ago

diniangela commented 2 years ago

I noticed that when the ETN detections are grabbed from the GeoServer CSVs, they only show 1 million of the detections. I have been trying pagination with WFS, but it has proven more difficult than I initially thought.

Is there a better way to grab the ETN detections from ETN in their entirety?

@jdpye @naomitress

aubrivliz commented 2 years ago

Hi Angela, the best way to do this is indeed to use the WFS GeoJSON format with paging. It takes some coding work but is feasible.
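For anyone following along, here is a rough sketch of what WFS paging could look like from Python. It assumes WFS 2.0.0 with the standard `startIndex` and `count` parameters. The endpoint URL, the layer name and the `id` sort attribute are placeholders, not the real ETN values, and `fetch_json` is a hypothetical callable you supply yourself.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and layer name; substitute the real ETN GeoServer values.
WFS_URL = "https://example.org/geoserver/wfs"
LAYER = "workspace:detections"


def page_url(start_index, count, base_url=WFS_URL, layer=LAYER):
    """Build a WFS 2.0.0 GetFeature URL for one page of GeoJSON features."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": layer,
        "outputFormat": "application/json",
        "startIndex": start_index,
        "count": count,
        # Paging is only stable with a fixed sort order; "id" is an assumed attribute name.
        "sortBy": "id",
    }
    return f"{base_url}?{urlencode(params)}"


def iter_pages(fetch_json, count=1_000_000):
    """Yield lists of features page by page until a page is empty or short.

    fetch_json is any callable that takes a URL and returns the parsed GeoJSON
    dict, e.g. a thin wrapper around urllib.request.urlopen and json.load.
    """
    start = 0
    while True:
        features = fetch_json(page_url(start, count)).get("features", [])
        if not features:
            break
        yield features
        if len(features) < count:
            break  # a short page means there is nothing left to fetch
        start += count
```

Passing the fetcher in as an argument keeps the paging logic separate from authentication and error handling, which will probably differ per setup.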

aubrivliz commented 2 years ago

There are currently 553 320 771 detections available in the layer.

diniangela commented 2 years ago

OK, sounds good. Yeah, I think I can do 1 million detections at a time, so I might need more pages.
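As a quick sanity check on how many pages that implies, the arithmetic can be sketched like this, using the detection count quoted earlier in the thread and a page size of 1 million:

```python
import math

TOTAL_DETECTIONS = 553_320_771  # figure quoted earlier in this thread
PAGE_SIZE = 1_000_000           # page size proposed above

n_pages = math.ceil(TOTAL_DETECTIONS / PAGE_SIZE)
last_start = (n_pages - 1) * PAGE_SIZE          # startIndex of the final page
last_page_size = TOTAL_DETECTIONS - last_start  # features expected on the final page

print(n_pages, last_start, last_page_size)  # 554 553000000 320771
```

The total will keep growing as new detections arrive, so the page count is a moving target rather than a constant.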

Also, when I was testing the pagination, GeoServer kicked me out after a couple of tries.
Would it be possible to raise the number of hits allowed for logged-in users?

jdpye commented 2 years ago

Looks like there's a limit on the number of queries on the GeoServer. It could be a default set this way: https://docs.geoserver.org/latest/en/user/extensions/controlflow/index.html#per-user-rate-control but if it's something else, we'd still need to lift it in order to do ~550 queries every time we do a detections cross-match.
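Until the server-side limit is known or lifted, one client-side workaround might be to pace the requests. Below is a minimal throttle sketch; the class name and the example rate of 30 requests per minute are made up for illustration, since the actual limit on the ETN GeoServer is not stated in this thread.

```python
import time


class Throttle:
    """Space out calls so that at most max_per_minute requests start per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


# Usage: call throttle.wait() before each page request.
# throttle = Throttle(max_per_minute=30)  # hypothetical rate, not the real limit
```

At 30 requests per minute, ~550 queries would take roughly 18 minutes of pacing on top of the download time.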

jdpye commented 2 years ago

I was able to grab 150 million or so detections over 150 queries yesterday. I haven't checked them over, but after 150 million I got a 0-byte return from the GeoServer, which may mean I had run out of pages, or may mean that I hit the rate-limit block.
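One way to tell those two cases apart could be to treat a 0-byte body as a failed request and retry it, and to accept only a valid GeoJSON response with an empty `features` list as the real end of the data. That is an assumption about how the server answers past the last page, not something confirmed in this thread. In the sketch below, `fetch_bytes` is a hypothetical callable that returns the raw response body for a URL.

```python
import json
import time


def classify_response(body: bytes):
    """Return ("retry", None), ("done", []) or ("page", features) for one raw body."""
    if not body.strip():
        # A 0-byte body is not valid GeoJSON, so treat it as a failed request,
        # not as the last page.
        return "retry", None
    features = json.loads(body).get("features", [])
    return ("page", features) if features else ("done", [])


def fetch_with_retry(fetch_bytes, url, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch_bytes(url) until the body is non-empty, backing off exponentially."""
    for attempt in range(max_attempts):
        status, features = classify_response(fetch_bytes(url))
        if status != "retry":
            return status, features
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s, ... with the defaults
    raise RuntimeError(f"empty response after {max_attempts} attempts: {url}")
```

If the retries keep coming back empty at the same offset, that points towards a block rather than the end of the layer, given that the layer reportedly holds far more than 150 million detections.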

naomitress commented 2 years ago

Closing in favour of https://github.com/lifewatch/etn-otn-exchange/issues/17