ruby-rdf / sparql-client

SPARQL client for Ruby.
http://rubygems.org/gems/sparql-client
The Unlicense

Can't access SSL endpoints without port #29

Closed artob closed 9 years ago

artob commented 11 years ago

Originally reported by ekolvets:

Due to Net::HTTP's dependence on URI, SPARQL::Client connections to https endpoints fail because Net::HTTP::Persistent attempts to connect to port 80 instead of port 443.

require 'uri'
require 'addressable/uri'
require 'sparql/client'

URI.parse("https://dbpedia.org/").port              # returns 443
Addressable::URI.parse("https://dbpedia.org/").port # returns nil

# queries with this will fail (no explicit port)
sparql = SPARQL::Client.new("https://dbpedia.org/sparql")

# queries with this will succeed (explicit port 443)
sparql = SPARQL::Client.new("https://dbpedia.org:443/sparql")

Not sure if this is a bug, but it's something users should be aware of. The root of the issue is inside Net::HTTP::Persistent, where uri.address and uri.port are passed to Net::HTTP.new(). Net::HTTP documents its dependence on URI, so I'm not sure whether Net::HTTP::Persistent should change to work with Addressable::URIs or whether SPARQL::Client should be passing in a URI instead of an Addressable::URI.
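
The second option mentioned above, handing Net::HTTP a stdlib URI, could look roughly like the sketch below. It is an illustration only, not the change made in sparql-client: `to_stdlib_uri` is a hypothetical helper name, and it assumes the endpoint object's `to_s` yields a parseable URL string.

```ruby
require 'uri'

# Hypothetical helper (not part of sparql-client): coerce a string or any
# URI-like object into a stdlib URI, whose #port falls back to the scheme default.
def to_stdlib_uri(endpoint)
  URI.parse(endpoint.to_s)
end

uri = to_stdlib_uri("https://dbpedia.org/sparql")
puts uri.host  # dbpedia.org
puts uri.port  # 443
```

Because the stdlib URI fills in the scheme's default port, the host and port it reports can be passed to Net::HTTP.new() without the caller writing `:443` into the endpoint URL.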

artob commented 11 years ago

Thanks for reporting this. I've confirmed the issue per your description. Attempting to access an HTTPS endpoint without explicitly specifying 443 as the port results in the following error from Net::HTTP::Persistent:

OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: unknown protocol

joerixaop commented 9 years ago

The issue is that neither the current RDF::URI nor the older Addressable::URI returns a port when no explicit port has been provided. The Ruby stdlib URI, on the other hand, always returns a port (either the scheme's default or the explicit one).
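
To make the difference concrete, here is a minimal sketch of falling back to the scheme default when a URI object reports no port. `PortlessURI`, `DEFAULT_PORTS` and `effective_port` are hypothetical names used for illustration; the struct merely stands in for a URI class whose `port` is nil when none was given, and only the two HTTP schemes are covered.

```ruby
require 'uri'

# Scheme defaults taken from the stdlib constants.
DEFAULT_PORTS = {
  'http'  => URI::HTTP::DEFAULT_PORT,   # 80
  'https' => URI::HTTPS::DEFAULT_PORT   # 443
}.freeze

# Hypothetical stand-in for a URI class whose #port is nil when none was given.
PortlessURI = Struct.new(:scheme, :host, :port)

# Hypothetical helper: prefer the explicit port, otherwise use the scheme default.
def effective_port(uri)
  uri.port || DEFAULT_PORTS.fetch(uri.scheme.to_s.downcase)
end

implicit = PortlessURI.new('https', 'dbpedia.org', nil)
explicit = PortlessURI.new('https', 'dbpedia.org', 8443)
puts effective_port(implicit)  # 443
puts effective_port(explicit)  # 8443
```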

But the real question is why the SPARQL client uses RDF::URI for interaction with the Ruby stdlib when it could just use a normal URI. In particular, you can mostly work around the issue by providing an explicit port, but that stops working as soon as there is a redirect to an https location (see https://github.com/ruby-rdf/sparql-client/blob/develop/lib/sparql/client.rb#L685).
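
The redirect case can be illustrated with the stdlib alone. This is a hedged sketch, not the code at the linked line: `redirect_target` is a hypothetical helper, and the example.org URLs are made up. It resolves a Location header value against the request URI, so an https target gets port 443 even though the original request named port 80.

```ruby
require 'uri'

# Hypothetical helper: resolve a Location header value against the request URI.
# The result is a stdlib URI, so its port follows the target's scheme.
def redirect_target(request_uri, location)
  URI.join(request_uri.to_s, location.to_s)
end

target = redirect_target("http://example.org:80/sparql", "https://example.org/sparql")
puts target.scheme  # https
puts target.port    # 443
```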

gkellogg commented 9 years ago

@joerixaop yes, there's no good reason to use RDF::URI in this case.

gkellogg commented 9 years ago

@joerixaop can you check out the feature branch where I checked in this change to see if it helps you? If so, I'll merge it in.

joerixaop commented 9 years ago

Seems to work. (To be honest, the redirect issue was hypothetical, so I didn't get to test that, but there is no reason why it would fail.)