kevinsteves / pan-python

Multi-tool set for Palo Alto Networks PAN-OS, Panorama, WildFire and AutoFocus
Other
267 stars 102 forks source link

use urlopen context manager to handle connection close in a timely manner #47

Closed jubrad closed 1 year ago

jubrad commented 2 years ago

What it does:

More explicitly closes the xapi response object to avoid downstream circular reference or other potential issues where garbage collector does not promptly clean the request and close the connection.

Background:

I'm noticing many "Session Timed Out" errors while using local-execs in terraform to configure some resources that the tf provider cannot. At first I tested to see what rate I would hit this bug, but was not able to with >6k requests happening serially (creating and deleting the same dns proxy static entries). Since terraform runs with 10 parallel threads walking its tree I decided to test some parallel calls and quickly was able to rerpro. ~20-40 threads making 4 entries each.

I believe this relates to terraform-provider-panos/issues/255, at least it does for me, but it's possible there's something wonky going on with the PA server here.

How it was tested:

Using a python script that adds and removes dns-proxy static entries I validated that with ~20 processes each making 8 entries as fast as they can the python client would regularly produce "Session Timed Out" errors. My goal of this test was to see if I could increase the rate of collision or push the server beyond its concurrent open request limits. These "errors" would produce a valid http response to the urlopen response object with the following error: b'<response status="unauth" code="22"><msg><line>Session timed out</line></msg></response>' To me, this indicates that there's something weird going on on the server side, potentially to do with auth, but probably not due to any tcp / request issues.

When using the context manager I could not reproduce this error in python, even when increasing the concurrent processes to 40. Although... the terraform provider does give me a session timed out so there's still more for me to figure out.

Not the best solution, but it's what I have time for.

devbollinger commented 2 years ago

Hello guys, we do have the same error and looking for a way to at least close the connection manually. The code in the commit looks very interesting and would probably help. Is it going to be merged?

kevinsteves commented 2 years ago

This fix looks good, resulting in response.closed is True when the context manager block is exited. Is there a reason we can't simplify the urlopen() part to be just:

            with urlopen(**kwargs) as response:
                response.body = response.read()
devbollinger commented 1 year ago

Seems good to me

kevinsteves commented 1 year ago

Modified version of https://github.com/kevinsteves/pan-python/pull/47 from Justin Bradfield merged.

jubrad commented 1 year ago

This fix looks good, resulting in response.closed is True when the context manager block is exited. Is there a reason we can't simplify the urlopen() part to be just:

            with urlopen(**kwargs) as response:
                response.body = response.read()

@kevinsteves I think I did this to log a few things during debug. I then removed the log messages and didn't put the code back to the simpler form. Thanks for merging.

jubrad commented 1 year ago

Hello guys, we do have the same error and looking for a way to at least close the connection manually. The code in the commit looks very interesting and would probably help. Is it going to be merged?

@devbollinger make sure that you're not exceeding the max threads of the PA server... There was a longer discussion in the pango repo about that. This is especially applicable in the terraform provider, if have parallelization > 4 then you will hit timeout issues to some extent.