bio-tools / biotoolsSchema

biotoolsSchema : Tool description data model for computational tools in life sciences
Creative Commons Attribution Share Alike 4.0 International

Consider disallowing '|' characters in URLs #193

Closed joncison closed 4 years ago

joncison commented 4 years ago

Some URL fields misuse the pipe character ('|') to delimit several URLs within a single value, which is likely to break things downstream. I think '|' is considered URL-unsafe.

See https://github.com/bio-tools/biotoolsRegistry/issues/489

joncison commented 4 years ago

From discussion with @hansioan: a more precise validation could be to check whether "http" occurs more than once in the URL.

After all, any character could be used as a delimiter, so banning individual characters is potentially a never-ending problem.

Not sure this can be encoded in a regex; if not, then it's an issue for biotoolsLint.
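For illustration, a minimal sketch of the "multiple http" check in Python, both as a plain count and as a single regex with a negative lookahead. The names `looks_like_multiple_urls` and `SINGLE_URL_RE` are hypothetical and not part of biotoolsSchema or biotoolsLint, and matching on `http://` / `https://` rather than the bare string "http" is an assumption. As far as I know, XSD pattern facets do not support lookahead, so the regex form may not carry over to the schema itself.

```python
import re

# Hypothetical helpers for illustration only; not biotoolsSchema or biotoolsLint code.
# Assumption: a "URL start" is an http:// or https:// scheme marker.
SCHEME_RE = re.compile(r"https?://", re.IGNORECASE)

# Accept a value only if it starts with a scheme and never contains a second one.
SINGLE_URL_RE = re.compile(
    r"^(?!.*https?://.*https?://)https?://\S+$", re.IGNORECASE
)


def looks_like_multiple_urls(value: str) -> bool:
    """Return True if more than one http(s) scheme marker appears in value."""
    return len(SCHEME_RE.findall(value)) > 1


if __name__ == "__main__":
    for candidate in ("https://a.org/tool", "http://a.org|http://b.org"):
        print(candidate, looks_like_multiple_urls(candidate))
```

Note that this would also flag a legitimate URL that embeds another URL unencoded in its query string, so it is probably better suited to a lint warning than to a hard validation error.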

joncison commented 4 years ago

cc @hans FYI the '|' (pipe) character is unsafe according to RFC 1738, but RFC 3986 allows such characters to be percent-encoded (i.e. '|' is written as %7C, from its code point U+007C).
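To make the percent-encoding concrete, here is a small sketch using Python's standard `urllib.parse`. The helper `encode_pipes` is a hypothetical name introduced only for this example; it rewrites literal pipes and leaves the rest of the URL untouched.

```python
from urllib.parse import quote, unquote


def encode_pipes(url: str) -> str:
    """Replace each literal '|' with its percent-encoded form.

    Hypothetical helper for illustration; not part of biotoolsSchema.
    """
    return url.replace("|", quote("|", safe=""))


if __name__ == "__main__":
    print(quote("|", safe=""))                   # percent-encoded pipe
    print(unquote("%7C"))                        # decoded back to a pipe
    print(encode_pipes("http://a.org/x|y?q=1"))  # only the pipe is rewritten
```

A value such as `http://a.org|http://b.org` would still encode cleanly this way, which is why encoding alone does not catch the delimiter misuse described above.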

So for now let's leave it as-is ("wontfix").