HydrologicEngineeringCenter / cwms-python

Corps Water Management Systems (CWMS) library utilizing CWMS Data API
MIT License

Reduce Default Pagesize to 500 (default) or a little higher from 500000 #87

Closed krowvin closed 2 weeks ago

krowvin commented 1 month ago

If users try to request a page-size of 500,000 for timeseries, I believe the national instance would time out.

https://github.com/HydrologicEngineeringCenter/cwms-python/blob/main/cwms/timeseries/timeseries.py#L41

In my opinion the default should be a small value, and users can increase it at their own risk if they want to try that.

Otherwise, if you leave it at 500,000, you might consider implementing pagination within the logic.

There also used to be a page-size=-1 but that was in a previous version @danielTosborne wrote. Not sure if it's still valid!

I.e. something like this (this is tsids) https://github.com/krowvin/react-demo/blob/bf480a2ee81789819b5cb7ae6cac7aa272df02e6/src/TSDropdown.jsx#L36
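For illustration, the pagination idea could be sketched roughly as below. This is a minimal sketch, not the cwms-python implementation: `fetch_all_values`, `make_fake_fetch`, and the `FetchPage` type are hypothetical names, and the `values` and `next-page` response fields are assumptions about the response shape that should be checked against the CWMS Data API. An in-memory fake stands in for the real endpoint so the loop can be run without a network.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical type: a callable that takes (cursor, page_size) and returns one page as a dict.
FetchPage = Callable[[Optional[str], int], Dict[str, Any]]


def fetch_all_values(fetch_page: FetchPage, page_size: int = 500) -> List[Any]:
    """Collect values from every page by following the cursor until none is returned."""
    values: List[Any] = []
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor, page_size)
        values.extend(page.get("values", []))
        # Assumed field name: the cursor for the following page, absent on the last page.
        cursor = page.get("next-page")
        if not cursor:
            break
    return values


def make_fake_fetch(data: List[Any]) -> FetchPage:
    """Build an in-memory stand-in for a paged endpoint (for demonstration only)."""

    def fake_fetch(cursor: Optional[str], page_size: int) -> Dict[str, Any]:
        start = int(cursor) if cursor else 0
        end = start + page_size
        page: Dict[str, Any] = {"values": data[start:end]}
        if end < len(data):
            page["next-page"] = str(end)  # more data remains, so hand back a cursor
        return page

    return fake_fetch


all_values = fetch_all_values(make_fake_fetch(list(range(1200))), page_size=500)
print(len(all_values))
```

With a loop like this, the default page size can stay small while callers still receive the full result set; the trade-off is more round trips per request.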

Enovotny commented 1 month ago

The only reason it times out is that it is on the DMZ. The CWBI instances do not have the restrictions that the DMZ does. We will soon point the national instance to CWBI. But I agree that some paging should be looked into.

krowvin commented 1 month ago

~This page-size will work on CWBI TEST and local district T7s.~

~It will fail on the national.~

You said this as I was typing it!

There is only a slim chance that users are targeting the national instance with cwms-python at this point.

But public entities might like to use this library (before it migrates).

Enovotny commented 1 month ago

I tested this, and reducing the page size drastically reduces performance. I tested various page sizes while retrieving 200 days of data (~19,200 values):

- page size 500 = 290 sec
- page size 1,000 = 176 sec
- page size 2,000 = 66 sec
- page size 5,000 = 27 sec
- page size 10,000 = 24 sec
- page size 20,000 = 14 sec
- page size 50,000 = 14 sec
- larger page sizes (++) = 14 sec
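One way to read these timings is to count how many requests each page size would need to cover ~19,200 values. The sketch below assumes each run paged through the full data set, which the comment does not state, so treat it as a rough interpretation rather than a description of the test; `requests_needed` is a hypothetical helper name.

```python
import math

TOTAL_VALUES = 19_200  # approximate value count reported for 200 days of data

# Reported timings in seconds, keyed by page size (copied from the test results).
timings_sec = {
    500: 290,
    1_000: 176,
    2_000: 66,
    5_000: 27,
    10_000: 24,
    20_000: 14,
    50_000: 14,
}


def requests_needed(total_values: int, page_size: int) -> int:
    """Number of pages required to cover total_values at the given page size."""
    return math.ceil(total_values / page_size)


request_counts = {size: requests_needed(TOTAL_VALUES, size) for size in timings_sec}

for size, secs in timings_sec.items():
    print(f"page size {size:>6,}: {request_counts[size]:>2} request(s), {secs} sec reported")
```

Under that assumption, page sizes of 20,000 and above need only a single request, which would be consistent with the timings levelling off at 14 sec.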

Having a higher page size doesn't impact the DMZ, since the DMZ will restrict data transfer no matter what the page size is set to. I will add paging, though, for much larger data pulls so that it grabs all of the data, but I will leave the page size as a large number.