carsonyl / pypac

Find and use proxy auto-config (PAC) files with Python and Requests.
https://pypac.readthedocs.io
Apache License 2.0
71 stars 18 forks source link

Added option to ignore Content-Type header. #75

Open KarelChanivecky opened 12 months ago

KarelChanivecky commented 12 months ago

A kwarg parameter named ignore_content_type was added to get_pac and download_pac. The rationale for adding a new parameter is that it maintain the best backwards compatibility and ease of use. Also considered using accepting a specific value for allowed_content_types. However, I believe that this may be unexpected for current users. The parameter was placed before the 'session' parameter in order to maintain related parameter together. However, it risks breaking usages where all the passed arguments are positional. I leave this last matter in the discretion of the repo maintainer

carsonyl commented 11 months ago

This new feature seems to address a use case where the developer does not know the content type returned by the server and just wants PyPAC to accept any. But why wouldn't the developer just check for the content type and set it in allowed_content_types?

Due to how the content type comparison works, a workaround to needing ignore_content_type could be to set allowed_content_types = ['/'], assuming the server will always return a value containing /.

KarelChanivecky commented 11 months ago

The issue arises when the developer has no control or knowledge of the runtime environment of the application. Reportedly, many clients do not enforce this. For example, see the chromium docs: https://chromium.googlesource.com/chromium/src/+/HEAD/net/docs/proxy.md#Downloading-PAC-scripts

One could require the users of the application to set the mime type. However, it may be desirable to reduce the number of possible configuration conflicts with an application user. Specially, if having the correct mime type is not considered critical.

carsonyl commented 11 months ago

That Chromium doc is another very good discovery. But I want to avoid adding new parameters as much as possible, and this issue of nonstandard content type headers doesn't seem common. So maybe a "good enough" workaround is to set allowed_content_types = ['/']? The code already allows the content-type header to be missing, so this workaround would only fail for the case where the header is present but the value doesn't contain /.

KarelChanivecky commented 11 months ago

I hear you about adding more args. How about add a special token that can be passed in the allowed content types such as "*"?

carsonyl commented 11 months ago

allowed_content_types could be changed to handle list[str]|Literal["*"]. But we could also instead avoid changing the code by adding doc on this topic. What do you think?