beda-software / fhir-py

FHIR Client for python
MIT License

_build_request_url does not accept port #76

Open BSVogler opened 3 years ago

BSVogler commented 3 years ago

When using paginated requests, the returned URL may contain the port. This causes the check in https://github.com/beda-software/fhir-py/blob/4296920de216acc8915a6bbcee581428ce1569cd/fhirpy/base/lib.py#L81 to raise an exception even though the base URL is the same. For example, it fails with:

"https://example.com:443/fhir?_getpages=c4748d92-ff97-48d6-a496-873e85631ec9&_getpagesoffset=600&_count=600&_pretty=true&_bundletype=searchset" does not contain base url "https://example.com/fhir" (possible security issue)

ir4y commented 3 years ago

I think the port should be part of baseUrl in this case.
If you initialize the client with a baseUrl that already contains the port, does that resolve the issue?

BSVogler commented 3 years ago

Yes, it resolves the issue, but my point is that this should not throw the error when you configure the client without the port. Maybe just check the hostname, e.g.:

if urllib.parse.urlparse(self.url).netloc == urllib.parse.urlparse(path).netloc

ir4y commented 3 years ago

The strict check was done for security reasons.

Unfortunately, netloc will not work. First of all, it keeps the port anyway.

>>> urlparse('http://localhost:8080/fhir')
ParseResult(scheme='http', netloc='localhost:8080', path='/fhir', params='', query='', fragment='')
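
For comparison, here is a short sketch of what the other `ParseResult` attributes return for the two URL forms from this thread. It assumes only the standard library; the variable names are illustrative. `hostname` drops the port, and `port` is `None` when the URL has no explicit port.

```python
from urllib.parse import urlparse

with_port = urlparse("https://example.com:443/fhir")
without_port = urlparse("https://example.com/fhir")

# netloc keeps the port, so the two forms compare as different
netloc_equal = with_port.netloc == without_port.netloc

# hostname strips the port (and lowercases the host), so they compare as equal
hostname_equal = with_port.hostname == without_port.hostname

# port is an int when given explicitly and None when absent (never '')
ports = (with_port.port, without_port.port)

print(netloc_equal, hostname_equal, ports)
```

Comparing `hostname` alone would accept the paginated URL, but it ignores the scheme, the port, and the path, which is the concern raised next.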

Secondly, we need to check the path as well. In some cases, more than just the FHIR server runs on the same domain:

http://example.com -> points to some landing page
http://example.com/fhir -> points to fhir server
http://example.com/admin -> points to a third-party web app

If the FHIR server is misconfigured, a pagination request may be sent to a URL that is handled by other software, which would expose the token.
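
To illustrate the path concern, here is a minimal sketch with two hypothetical helpers (`host_only_match` and `host_and_path_match` are not part of fhir-py). It deliberately ignores scheme and port and only shows why a hostname comparison alone would let a request through to `/admin`.

```python
from urllib.parse import urlparse


def host_only_match(base_url: str, next_url: str) -> bool:
    """Hypothetical relaxed check: compares hostnames only."""
    return urlparse(base_url).hostname == urlparse(next_url).hostname


def host_and_path_match(base_url: str, next_url: str) -> bool:
    """Hypothetical check: same hostname and next_url's path is under the base path."""
    base, nxt = urlparse(base_url), urlparse(next_url)
    base_path = base.path.rstrip("/")
    # Require an exact match or a "/" boundary so "/fhirx" is not accepted for "/fhir"
    same_path = nxt.path == base_path or nxt.path.startswith(base_path + "/")
    return base.hostname == nxt.hostname and same_path


base = "http://example.com/fhir"
print(host_only_match(base, "http://example.com/admin?page=2"))
print(host_and_path_match(base, "http://example.com/admin?page=2"))
print(host_and_path_match(base, "http://example.com/fhir/Patient?page=2"))
```

With the hostname-only check, a misconfigured "next" link pointing at `/admin` would be followed with the token attached; the path-aware variant rejects it.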

BSVogler commented 3 years ago

If the check is to be really strict, we need to match the exact URL and derive the default port when none is given.


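parsed_base = urllib.parse.urlparse(self.url)
# urlparse().port is None (never '') when the URL has no explicit port
if parsed_base.port is not None:
    parsed = urllib.parse.urlparse(path)
    if parsed.port is None and parsed.scheme == "https":
        # Add the default HTTPS port so both URLs use the same form (keeps the query string)
        path = parsed._replace(netloc=parsed.netloc + ':443').geturl()
if self.url in path: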
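
The same idea can be sketched as a standalone function that normalizes both URLs before comparing them, so it works whichever side omits the port. This is an illustration under assumptions, not fhir-py's implementation: `DEFAULT_PORTS`, `normalize_origin`, and `is_under_base_url` are hypothetical names, and only the `http` and `https` default ports are handled.

```python
from urllib.parse import urlparse

# Assumed default ports; other schemes fall back to None
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_origin(url: str):
    """Return (scheme, hostname, port) with the scheme's default port filled in."""
    parsed = urlparse(url)
    port = parsed.port if parsed.port is not None else DEFAULT_PORTS.get(parsed.scheme)
    return parsed.scheme, parsed.hostname, port


def is_under_base_url(base_url: str, url: str) -> bool:
    """True if url has the same origin as base_url and its path is under the base path."""
    if normalize_origin(base_url) != normalize_origin(url):
        return False
    base_path = urlparse(base_url).path.rstrip("/")
    target_path = urlparse(url).path
    # Exact match or a "/" boundary, so "/fhirx" does not pass for "/fhir"
    return target_path == base_path or target_path.startswith(base_path + "/")


base = "https://example.com/fhir"
next_page = "https://example.com:443/fhir?_getpages=c4748d92-ff97-48d6-a496-873e85631ec9&_getpagesoffset=600"
print(is_under_base_url(base, next_page))
print(is_under_base_url(base, "https://example.com/admin"))
```

This keeps the check strict on scheme, host, port, and path while treating `https://example.com` and `https://example.com:443` as the same origin.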