ncssar / sartopo_python

Python calls for the caltopo / sartopo API
GNU General Public License v3.0
14 stars 2 forks source link

delete request slowness #34

Closed caver456 closed 1 month ago

caver456 commented 2 years ago

Why does this take so friggin long in sartopo_python?

Here's the trascript of cleanUG1.py - you can see how long each response takes from the timestamps - the first one takes two seconds, the next takes four seconds, the final ones (there were probably 30 items or so) took 4-7 seconds each:

2021-09-15 08:50:40,120 [INFO 2] sending delete to http://localhost:8080/api/v1/map/UG1/Shape/cb4d7332-285c-4eae-8d0b-69e57598f5d0
2021-09-15 08:50:40,120 [INFO 2] SENDING DELETE to 'http://localhost:8080/api/v1/map/UG1/Shape/cb4d7332-285c-4eae-8d0b-69e57598f5d0':
2021-09-15 08:50:40,120 [INFO 2] {}
2021-09-15 08:50:42,256 [INFO 2] URL:http://localhost:8080/api/v1/map/UG1/Shape/cb4d7332-285c-4eae-8d0b-69e57598f5d0
2021-09-15 08:50:42,256 [INFO 2] response:<Response [200]>
2021-09-15 08:50:42,267 [INFO 2] rj:{'result': {'id': 'cb4d7332-285c-4eae-8d0b-69e57598f5d0', 'type': 'Feature', 'properties': {'creator': '3MA660', 'stroke-opacity': 0.4, 'description': '', 'stroke-width': 8, 'title': 'AA 101', 'fill': '#FF0000', 'class': 'Shape', 'updated': 0, 'stroke': '#FF0000', 'fill-opacity': 0, 'folderId': 'c791db8b-0118-4865-87c3-76e2537c1132', 'gpstype': 'TRACK'}}, 'status': 'ok', 'timestamp': 1631721042255}
2021-09-15 08:50:42,268 [INFO 2] sending delete to http://localhost:8080/api/v1/map/UG1/Shape/ccaf3384-624c-43ca-ac3d-9248099b9b59
2021-09-15 08:50:42,268 [INFO 2] SENDING DELETE to 'http://localhost:8080/api/v1/map/UG1/Shape/ccaf3384-624c-43ca-ac3d-9248099b9b59':
2021-09-15 08:50:42,268 [INFO 2] {}
2021-09-15 08:50:46,380 [INFO 2] URL:http://localhost:8080/api/v1/map/UG1/Shape/ccaf3384-624c-43ca-ac3d-9248099b9b59
2021-09-15 08:50:46,380 [INFO 2] response:<Response [200]>
2021-09-15 08:50:46,380 [INFO 2] rj:{'result': {'id': 'ccaf3384-624c-43ca-ac3d-9248099b9b59', 'type': 'Feature', 'properties': {'creator': '3MA660', 'stroke-opacity': 0.4, 'description': '', 'stroke-width': 8, 'title': 'AD', 'fill': '#FF0000', 'class': 'Shape', 'updated': 0, 'stroke': '#FF0000', 'fill-opacity': 0, 'folderId': 'd5684d33-5fe8-4ec3-b6b6-70955b252e8a', 'gpstype': 'TRACK'}}, 'status': 'ok', 'timestamp': 1631721046366}
2021-09-15 08:50:46,380 [INFO 2] sending delete to http://localhost:8080/api/v1/map/UG1/Shape/4dee6269-c669-47d3-aa51-d9d2d61e0259

... ... ...

021-09-15 08:52:11,547 [INFO 2] sending delete to http://localhost:8080/api/v1/map/UG1/Folder/33cce1ec-88ca-4597-b252-fdffae9b8106
2021-09-15 08:52:11,547 [INFO 2] SENDING DELETE to 'http://localhost:8080/api/v1/map/UG1/Folder/33cce1ec-88ca-4597-b252-fdffae9b8106':
2021-09-15 08:52:11,547 [INFO 2] {}
2021-09-15 08:52:18,932 [INFO 2] URL:http://localhost:8080/api/v1/map/UG1/Folder/33cce1ec-88ca-4597-b252-fdffae9b8106
2021-09-15 08:52:18,933 [INFO 2] response:<Response [200]>
2021-09-15 08:52:18,933 [INFO 2] rj:{'result': {'id': '33cce1ec-88ca-4597-b252-fdffae9b8106', 'type': 'Feature', 'properties': {'creator': '3MA660', 'visible': True, 'title': 'AE', 'class': 'Folder', 'updated': 0}}, 'status': 'ok', 'timestamp': 1631721138931}
2021-09-15 08:52:18,933 [INFO 2] sending delete to http://localhost:8080/api/v1/map/UG1/Folder/5bf01df3-694f-4c3d-942e-c5120ecde3dc
2021-09-15 08:52:18,933 [INFO 2] SENDING DELETE to 'http://localhost:8080/api/v1/map/UG1/Folder/5bf01df3-694f-4c3d-942e-c5120ecde3dc':
2021-09-15 08:52:18,933 [INFO 2] {}
2021-09-15 08:52:24,498 [INFO 2] URL:http://localhost:8080/api/v1/map/UG1/Folder/5bf01df3-694f-4c3d-942e-c5120ecde3dc
2021-09-15 08:52:24,498 [INFO 2] response:<Response [200]>
2021-09-15 08:52:24,498 [INFO 2] rj:{'result': {'id': '5bf01df3-694f-4c3d-942e-c5120ecde3dc', 'type': 'Feature', 'properties': {'creator': '3MA660', 'visible': True, 'title': 'AC 123', 'class': 'Folder', 'updated': 0}}, 'status': 'ok', 'timestamp': 1631721144483}

and this shows up in the CTD console each time:

RequestPropertyFilter handling uncaught exception:
java.lang.NullPointerException
        at org.sarsoft.common.request.APIV1MapRequestDispatcher.dispatch(APIV1MapRequestDispatcher.java:53)
        at org.sarsoft.common.request.APIV1MapRequestDispatcher.dispatch(APIV1MapRequestDispatcher.java:99)
        at org.sarsoft.DispatchController.map(DispatchController.java:29)
        at jdk.internal.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:176)
        at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:440)
        at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:428)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:919)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:851)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:953)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:844)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:104)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:829)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:45)
        at winstone.ServletConfiguration.execute(ServletConfiguration.java:249)
        at winstone.RequestDispatcher.forward(RequestDispatcher.java:335)
        at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:378)
        at org.sarsoft.common.request.RequestPropertyFilter.doFilter(RequestPropertyFilter.java:69)
        at winstone.FilterConfiguration.execute(FilterConfiguration.java:195)
        at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:368)
        at org.sarsoft.compatibility.SQLiteDBSpringSessionFilter.doFilter(SQLiteDBSpringSessionFilter.java:27)
        at winstone.FilterConfiguration.execute(FilterConfiguration.java:195)
        at winstone.RequestDispatcher.doFilter(RequestDispatcher.java:368)
        at winstone.RequestDispatcher.forward(RequestDispatcher.java:333)
        at winstone.RequestHandlerThread.processRequest(RequestHandlerThread.java:244)
        at winstone.RequestHandlerThread.run(RequestHandlerThread.java:150)
        at java.base/java.lang.Thread.run(Thread.java:832)

BUT, this CTD terminal output doesn't happen when the deletion is done from the GUI. Conclusion: the code is doing something wrong. Need to make sure the headers are as-expected.

Here's the chrome request header tab when deleting a folder from the GUI, which goes fast:

image

caver456 commented 2 years ago

Noticed that the first NPE happens on the initial dummy-request during session startup, which was used to confirm the API version.

Turns out that all NPEs go away if you skip the initial dummy-request, and replace it with these hardcodes:

        # try these hardcodes, instead of the above dummy-request, to see if it avoids the NPE's
        self.apiVersion=1
        self.apiUrlMid="/api/v1/map/[MAPID]/"

This does kind of ring a bell, that Matt may have said a year or more ago that the dummy-request is no longer required.

Regarding delete performance / response time, it's not clear whether this increases performance, but, it does get rid of all the NPEs in the CTD terminal, which is probably a good thing. Opening a separate ticket to pursue the skip-the-dummy-request-to-avoid-NPEs topic. Leaving this ticket open to investigate delete performance.

From a browser, bulk ops --> delete works by sending off multiple requests back-to-back, apparently without waiting for responses. But, it's still one feature per delete request. Investigate this option, maybe writing a new wrapper function 'deleteFeatures' (plural) in the same manner as getFeatures vs getFeature. We would still presumably want to block until all related responses have arrived.

caver456 commented 2 years ago

used the above hardcodes in recent commit, which addresses #35; still not sure if it addresses this slowness issue

caver456 commented 2 years ago

Looks like bulk ops sends all requests asynchronously / non-blocking, but the code sends them synchronously. The difference can be huge - for 6 shapes, bulk ops finishes in <2sec, but code takes 54 seconds. That's right - >25:1 for just 6 requests.

See the requests docs: https://docs.python-requests.org/en/v2.0-0/user/advanced/#blocking-or-non-blocking

caver456 commented 1 month ago

Starting to look into this isue again after a few years, since other high priority issues that are prerequsites for a pypi spin have been fixed.

This issue could be a bit more complex, so, it's gray area as to whether fixing this issue would be a big enough benefit to justify holding up the pypi spin. But, now's the time to investigate it some more.

From the more recent version of the request docs, at https://requests.readthedocs.io/en/latest/user/advanced/#blocking-or-non-blocking:

If you are concerned about the use of blocking IO, there are lots of projects out there that combine Requests with one of Python’s asynchronicity frameworks. Some excellent examples are requests-threads, grequests, requests-futures, and httpx.

Which one to use? Are there others? Here's a good comparison of several options: https://www.zenrows.com/alternative/python-requests#grequests

And a detailed comparison of aiohttp and httpx: https://oxylabs.io/blog/httpx-vs-requests-vs-aiohttp

So far, grequests looks like the lightest-weight drop-in solution, and probably the easiest to try. But grequests' dependence on monkeypatching (rewriting of functions in the requests module?) seems inherently risky, since sartopo_python is built around, and reliant on, the requests module. (Maybe someday migration from the requests module to httpx could be something to look into.)

It's probably a good idea to look at the grequests code to at least get a quick understanding of its monkeypatching.

caver456 commented 1 month ago

grequests itself is very small, but it has gevent as a dependency. gevent is the thing that does the monkeypatching. gevent has a long version history and apparently is actively maintained (latest version is Feb 2024), so its robustness might be more trustworthy. It's still in the gray area due to the concept of monkeypatching, but, probably worth a try. Full QA, from the top, would be needed. If it doesn't work out, then it might be time to migrate to httpx which is probably a larger task, meaning that a pypi spin should happen first.

caver456 commented 1 month ago

Got a working test with grequests.

This thread (even though it's 11 years old) brings up an interesting point:

https://stackoverflow.com/questions/16015749

grequests.map on its own is still blocking (the test results don't really show a big speedup, but, they do delete several markers 'at the same time'). It just allows for concurrent sending of multiple requests.

The accepted answer shows non-blocking behavior, but, only for one request. Tried this line in a loop

        grequests.send(grequests.delete(r['url'],params=r.get('params',{})),grequests.Pool(1))

and the return was indeed 'instant', but, the markers did not delete. Not sure why, and, based on the last answer on that thread, it's probably not worth trying grequests any more:

the 'right' way to do this is probably just to use the built-in thread capability, and not to use any additional modules. While the grequests API may be nice and 'lightweight', 'pip install grequests' had quite a few big dependencies.

Will do a commit with the grequests blocking proof of concept, but, will shift to 'simple' threading instead.

caver456 commented 1 month ago

The 'Gen 2' section of this blog page has a more detailed way to use threading:

https://blog.jonlu.ca/posts/async-python-http

caver456 commented 1 month ago

Changed course again to a possible even better and lighter solution: asynchronous programming, rather than threading. See the recommendations at this post:

https://stackoverflow.com/questions/61924004

The 'best' implementations seem to import aiohttp, but that's a non-lightweight non-builtin module; it's also an alterntive to httpx, so it would be an option as a bigger scope: migrate from requests to httpx, or migrate from requests to aiohttp.

So, went looking for 'asynchronous python requests without aiohttp' and based a good working solution from the code at the bottom of this post - even though it's many years old. Actually it seems to use threading AND asyncio:

https://medium.com/hackernoon/how-to-run-asynchronous-web-requests-in-parallel-with-python-3-5-without-aiohttp-264dc0f8546

Committing a working solution now. Needs testing etc but seems to work pretty well.

See this post too, which seems related (they both use ThreadPoolExecutor):

https://stackoverflow.com/questions/62007674

caver456 commented 1 month ago

While adding the delMarkers function (figured we should have delMarkers, if we already have delFeatures and delFeature and delMarker), noticed a few things about arguments to the del... functions:

This probably only applies to the del... functions, but, didn't check to see what other functions it could apply to. Probably an opportunity for refactor / API cleanup, but, will only tackle the del... functions in this issue and this branch.

caver456 commented 1 month ago

allow first arguments of all del... functions to be entire feature objects, or IDs; tested all argument arrangements; all cases are passing.

Note that even the synchronous calls (delMarker, delFeature) appear to be much faster than they were when this issue was first created; but, that's OK; the async calls stil lhave a definite visible speedup on the time it takes to send each request, judging from the log output.

# test all argument configurations to all del... functions
# 1a. delMarker(feature)
# 1b. delMarker(id)
# 2a. delMarkers(features)
# 2b. delMarkers(ids)
# 3a. delFeature(feature)
# 3b. delFeature(id)
# 3b1. delFeature(id,class) (correct class)
# 3b2. delFeature(id,class) (incorrect class)
# 4a. delFeatures(features)
# 4b. delFeatures(idAndClassList)