Open ldodds opened 10 years ago
The goal here was to try and improve the validation around URIs.
Currently the code uses URI.parse. This catches some errors but also lets through some values which probably shouldn't be treated as URIs. For example, it parses almost any unspaced string as a valid relative URI. Looking again at the definition of xsd:anyURI, that might be fine.
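To make the leniency concrete, here is a minimal sketch using only Ruby's standard `uri` library. The helper name `parses_as_uri?` is hypothetical and is not taken from the project's code; it only reports whether `URI.parse` raises.

```ruby
require "uri"

# Hypothetical helper (not the project's code): true if URI.parse accepts the value.
def parses_as_uri?(value)
  URI.parse(value)
  true
rescue URI::InvalidURIError
  false
end

parses_as_uri?("http://example.org/data.csv") # absolute URI, accepted
parses_as_uri?("foo")                         # bare word, accepted as a relative URI
parses_as_uri?("not a uri")                   # contains spaces, rejected

# A bare word parses with no scheme at all, which is what makes it "relative".
URI.parse("foo").scheme
```

So a parse-only check rejects values with spaces, but a single unspaced word passes.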
We also check to see whether it's an http or https URI. This was an attempt to improve things, but may be overly limiting.
So the issue was to decide whether we wanted to keep what we are doing or improve things based on expected use cases for URIs in CSV data.
Hmmm... Yeah, I see what you mean now. Looking at the spec, I think you're right: xsd:anyURI allows a URI to be relative or absolute, so I think what we have is actually fine. We should probably get rid of the check for http or https too.
I've been giving this a bit more thought, and I think we should leave it as is. In most (if not all) instances, people are going to be using absolute URIs. If we change it to include relative URIs, it'll match pretty much everything as a URI, which means a CSV with columns that are mainly URIs, but with the odd line of (unspaced) text, will still validate.
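For illustration, a rough sketch of the stricter check and what it does to a mostly-URI column. `http_uri?` is a hypothetical name, and the column values are made up; this assumes the check is "parses, and is an http or https URI with a host", which may differ from what the code actually does.

```ruby
require "uri"

# Hypothetical helper: accept only absolute http/https URIs.
# URI::HTTPS is a subclass of URI::HTTP, so one is_a? check covers both schemes.
def http_uri?(value)
  uri = URI.parse(value)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

# A column that is mainly URIs, with one stray unspaced word.
column = ["http://example.org/a", "https://example.org/b", "oops"]
results = column.map { |value| http_uri?(value) }
```

With this check the stray `"oops"` is flagged, whereas a parse-only check would accept it as a relative URI. The cost is that other absolute schemes, such as `ftp://`, are rejected too.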
https://github.com/sporkmonger/addressable looks promising