nens / lizard-catalogue

Lizard Catalogue
https://demo.lizard.net/catalogue/
0 stars 0 forks source link

Remove large page_size query in API calls #280

Closed hoanphungt closed 3 years ago

hoanphungt commented 3 years ago

Similar to the lizard-management-client, it would be good to remove large page_size queries (page_size=0 or page_size > 1000).

Especially, the one in the timeseries selection modal really has performance issue and sometimes leads to 504 - Gateway Timeout as found in here: https://github.com/nens/lizard-catalogue/issues/246.

Places where changes have been made:

Monitoring network endpoint:

hoanphungt commented 3 years ago

@lexvand While working on the issue with large page_size in the lizard management client, I think it would be good to also implement the solution there to the lizard catalogue. Then hopefully we can solve some of the performance issue especially with the timeseries selection modal.

I want to pick this issue up during this sprint. What do you think?

lexvand commented 3 years ago

@hoanphungt Yes, please do so. Can you add an estimate?

hoanphungt commented 3 years ago

@Derryrover Could you review this PR: https://github.com/nens/lizard-catalogue/issues/281?

I am not done yet as there are still certain things I want to address but I think it's good for you to review it now.

My main concern now is that for the large response. for example when you go to the Timeseries Selection modal of the N&S large sample monitoring network, it is gonna load more than 5000 timeseries so instead of sending 1 request with page_size=10000, it is now sending > 50 requests with page_size=100. In this case, I am not sure which one is better anymore. I asked Carsten and he also thought 50 requests are a lot and maybe we need to find a common ground between page size 100 and page size 10000.

Another concern of mine is that the above 50 requests will be sent to the server until it is completely finished, no matter if at that time, I closed the modal as maybe I don't want to wait anymore and would like to check out other things. I think this is unnecessary but couldn't find an easy way to stop it yet. What do you think about this issue?

hoanphungt commented 3 years ago

@Derryrover Hi Tom,

I am thinking to refactor this to use react-query as discussed with Remco last meeting. Still, I would like to hear your opinion about the matters that I stated in the previous comment:

My main concern now is that for the large response. for example when you go to the Timeseries Selection modal of the N&S large sample monitoring network, it is gonna load more than 5000 timeseries so instead of sending 1 request with page_size=10000, it is now sending > 50 requests with page_size=100. In this case, I am not sure which one is better anymore. I asked Carsten and he also thought 50 requests are a lot and maybe we need to find a common ground between page size 100 and page size 10000.

Another concern of mine is that the above 50 requests will be sent to the server until it is completely finished, no matter if at that time, I closed the modal as maybe I don't want to wait anymore and would like to check out other things. I think this is unnecessary but couldn't find an easy way to stop it yet. What do you think about this issue?

Kr, Hoan

Derryrover commented 3 years ago

@hoanphungt

Hi Hoan Yes excellent that you investigate react query

it is gonna load more than 5000 timeseries so instead of sending 1 request with page_size=10000, it is now sending > 50 requests with page_size=100. In this case, I am not sure which one is better anymore.

I cannot tell you which one is better, we (or backend) could maybe run a test to determine that. I also think this problem has to simple solution right now and may require some backend work. Worth to consider:

Another concern of mine is that the above 50 requests will be sent to the server until it is completely finished,

The paginated requests will be send in series (not parallel). At each given time only one request is underway/busy. You might investigate if there is a way to stop the upcoming requests from being send if the user closes the modal. How to do this depends on the technology used (hooks vs recursive vs redux). I am not sure how to do it with hooks. Maybe you could investigate it further and show Remco and me what you found. I am also interested how react query will work together with paginated fetch. Are they compatible?

kr Tom

hoanphungt commented 3 years ago

Investigating to use react-query instead of the usePaginatedFetch custom hook

hoanphungt commented 3 years ago

Solve some of the large queries using page_size=0 or page_size=100000 in the catalogue. For example:

Also use a function called paginatedFetch to make the page_size smaller and to build the whole records of the object based on the next parameter.

The page_size queries are adjusted to a reasonable number in multiple places but not to be completely removed as in some cases, it's better to send 1 request with large page_size query than > 50 requests with small page_size query.

This is already a good improvement for the catalogue as the issue of sometimes a very large page_size query cannot be resolved is not happening anymore.

hoanphungt commented 3 years ago

@Derryrover Thanks for your comment. I made several tickets based on your comment.

lexvand commented 3 years ago

I tested the changes and for those endpoints that require multiple requests it works good. The ones that cannot be tested because of lack of paginated results are expected to also work properly, like we've seen in the management client. Nice work!