elixir-cloud-aai / proTES

Proxy service for injecting middleware into GA4GH TES requests
Apache License 2.0

Several issues with TES/input URI processing in distance-based task distribution logic #134

Closed uniqueg closed 1 year ago

uniqueg commented 1 year ago

When working with FTP in Funnel, we need to supply FTP credentials through the URLs. In the current implementation, input and TES URIs/URLs are parsed with urllib.parse.urlparse. Of the resulting components, the netloc is then passed to socket.gethostbyname to get the IP of that host. However, the netloc extracted via urllib.parse.urlparse retains any (basic) authentication credentials (e.g., user:password@host.name), which socket.gethostbyname cannot parse, so it throws a socket.gaierror. This exception is caught, but it is handled in such a way that the list of input URIs ends up either incomplete or empty. In the latter case, this leads to an error in pro_tes.middleware.task_distribution.distance.task_distribution, because no TES/input URL/URI combinations can be compiled (a situation that is not handled).
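To illustrate the difference between the two attributes, here is a minimal sketch (not the proTES code). The helper names extract_host and resolve_ip are hypothetical, as is the example URL. It relies on the hostname attribute of the urlparse result, which drops credentials and port, where netloc keeps them:

```python
import socket
from typing import Optional
from urllib.parse import urlparse


def extract_host(uri: str) -> str:
    """Return the bare host of a URI, without credentials or port.

    Hypothetical helper for illustration; not part of proTES.
    """
    # `hostname` strips any "user:password@" prefix and ":port" suffix,
    # whereas `netloc` retains both.
    host = urlparse(uri).hostname
    if host is None:
        raise ValueError(f"No host found in URI: {uri!r}")
    return host


def resolve_ip(uri: str) -> Optional[str]:
    """Resolve the host of a URI to an IPv4 address, or None on failure.

    Hypothetical helper for illustration; not part of proTES.
    """
    try:
        return socket.gethostbyname(extract_host(uri))
    except (ValueError, socket.gaierror):
        # Callers need to handle None explicitly, e.g. by skipping the
        # URI and failing clearly if no URI at all could be resolved.
        return None


uri = "ftp://user:password@host.name:21/path/file.txt"
print(urlparse(uri).netloc)  # user:password@host.name:21
print(extract_host(uri))     # host.name
```

Returning None instead of silently dropping the entry leaves the decision to the caller, which can then handle the case where no input URI could be resolved.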

To address this issue fully, the following should be done: