Open NicolaIsotta opened 1 year ago
The problem, as I understand it, is that you did not specify a valid URI. I had a look at RFC 3986 (Uniform Resource Identifier (URI): Generic Syntax), and as far as I can tell the problematic part is parsed as:
-> URI (https://unpkg.com/primeflex@^3/primeflex.css)
-> hier-part (//unpkg.com/primeflex@^3/primeflex.css)
-> path-abempty (/primeflex@^3/primeflex.css)
-> segment (primeflex@^3)
Below segment there is no matching construction. A segment is defined as a sequence of pchar elements, each of which can be one of:
unreserved (ALPHA / DIGIT / "-" / "." / "_" / "~")
pct-encoded ("%" HEXDIG HEXDIG)
sub-delims ("!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "=")
":"
"@"
None of these match "^".
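The pchar rule quoted above can be written down as a small check. This is only a sketch of the grammar from RFC 3986, section 3.3; the class name `SegmentCheck` and the method `isValidSegment` are made up for illustration and are not part of NetBeans.

```java
import java.util.regex.Pattern;

final class SegmentCheck {
    // segment = *pchar
    // pchar   = unreserved / pct-encoded / sub-delims / ":" / "@"
    private static final Pattern SEGMENT = Pattern.compile(
            "(?:[A-Za-z0-9\\-._~"      // unreserved
            + "!$&'()*+,;="            // sub-delims
            + ":@]"                    // the two extra pchar characters
            + "|%[0-9A-Fa-f]{2})*");   // pct-encoded

    /** Returns true if the whole string is a valid RFC 3986 path segment. */
    static boolean isValidSegment(String segment) {
        return SEGMENT.matcher(segment).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidSegment("primeflex@^3"));   // false: "^" is not a pchar
        System.out.println(isValidSegment("primeflex@%5E3")); // true: "^" is percent-encoded
    }
}
```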
TL;DR: The browsers are wrong to accept this as a URL instead of rejecting it.
The good news: If you use a valid URL it still works: https://unpkg.com/primeflex@%5E3/primeflex.css
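For reference, the same difference shows up with plain `java.net.URI`. This sketch assumes the failing code path ends up in `java.net.URI`, which the thread does not confirm. Note that `java.net.URI` is documented against RFC 2396 (with amendments) rather than RFC 3986, but it also rejects a raw "^". The class `StrictUriDemo` is hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class StrictUriDemo {
    /** Returns true if java.net.URI accepts the string without an exception. */
    static boolean parses(String candidate) {
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Raw caret: rejected with a URISyntaxException.
        System.out.println(parses("https://unpkg.com/primeflex@^3/primeflex.css"));
        // Percent-encoded caret: accepted.
        System.out.println(parses("https://unpkg.com/primeflex@%5E3/primeflex.css"));
    }
}
```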
Please let me know if this helps.
I've tested other browsers/IDEs and they seem to encode the URI automatically. Here's a VS Code screenshot, for example. Maybe encoding the string according to RFC 3986 before creating the URI could avoid this kind of exception?
Sorry, I don't see a sane way to do this. When would you encode a character and when not? You can argue that you can guess that a disallowed character gets encoded and the others don't, but then what do you make of this:
https://test.invalid/path%20with%20spaces
When trying to guess whether this has to be decoded, the path component might mean /path%20with%20spaces or /path with spaces. On the other hand, once encoded it might also be https://test.invalid/path%2520with%2520spaces.
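The ambiguity can be reproduced with `java.net.URI` alone. This is a small sketch: the multi-argument constructors treat their input as unencoded text and always quote "%", while the single-argument form treats "%20" as an existing escape. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class EncodingAmbiguity {
    /** Treats the path as raw text and lets java.net.URI quote it. */
    static String encodeAsRawText(String path) {
        try {
            return new URI("https", "test.invalid", path, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Treats the string as an already encoded URI and returns the decoded path. */
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        // Reading 1: the input was raw text, so "%" itself gets escaped.
        System.out.println(encodeAsRawText("/path%20with%20spaces"));
        // prints https://test.invalid/path%2520with%2520spaces

        // Reading 2: the input was already encoded, so "%20" means a space.
        System.out.println(decodedPath("https://test.invalid/path%20with%20spaces"));
        // prints /path with spaces
    }
}
```

Both readings are legitimate, and nothing in the string itself says which one the author meant.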
As you might have guessed from my reply, I don't like this "let's interpret the most broken code until it works somehow" attitude in web development. If someone can solve this problem without introducing security problems, I'll review a fix. Until that happens, from my POV this works as designed.
As far as I know, browsers also encode URL parts such as a space to %20, and if you paste the URL with the encoded space back into the address bar, it will not be encoded again. Yes, this is not what was designed, but we all know there are helpful things that were not part of the original design. This is just a better developer experience, and of course I see your point. Is this better than following an RFC that can't handle this case? I would say it depends. Also, maybe they forgot to add ^. I would just make it better than the RFC. My 2 cents.
I also had a quick look at the RFC, but I only looked at the regex in Appendix B, and I'm not very familiar with RFCs. When I check the regex against the given URL, it parses without problems: https://regex101.com/r/S3E5BM/1 and it matches the part after the TLD correctly as one part. Yes, the regex seems more generic and flags fewer inputs as erroneous.
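For context, the Appendix B regex is meant for splitting a URI reference into components, not for validating it, so it matches almost any string. A minimal Java sketch of that behaviour follows (the class `AppendixBRegex` and the method `split` are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AppendixBRegex {
    // Regular expression from RFC 3986, Appendix B.
    private static final Pattern URI_PARTS = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    /** Returns {scheme, authority, path}; absent components are null. */
    static String[] split(String input) {
        Matcher m = URI_PARTS.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(2), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String[] parts = split("https://unpkg.com/primeflex@^3/primeflex.css");
        System.out.println(parts[0]); // https
        System.out.println(parts[1]); // unpkg.com
        System.out.println(parts[2]); // /primeflex@^3/primeflex.css

        // Even text that is clearly not a URI matches: everything lands in the path.
        System.out.println(split("not a url ^^")[2]); // not a url ^^
    }
}
```

A successful match therefore says nothing about whether "^" is allowed in a path segment.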
I think it's a valid URL according to the HTML spec. It's a bit old, but see https://www.w3.org/TR/2011/WD-html5-20110525/urls.html#parsing-urls which specifically mentions that character.
Ok, fine. So HTML even put it in writing that it deliberately breaks existing specifications, invalidating existing tools. Great, they just went down another notch on my respect scale. Let's reopen and see if anyone is willing to fix this mess and write an "HTML URL" to "real URL" translator to handle these cases.
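One possible shape for such a translator, as a rough sketch only: keep every character RFC 3986 allows, keep anything that already looks like a "%XX" escape, and percent-encode the rest as UTF-8. That "already looks like an escape" rule is exactly the guess discussed earlier in the thread, so this resolves the ambiguity by assumption rather than solving it, and it does not address the security concerns raised above. `LenientUrlEncoder` and `encodeIllegal` are hypothetical names, not an existing NetBeans or JDK API.

```java
import java.nio.charset.StandardCharsets;

final class LenientUrlEncoder {
    private static final String ALLOWED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
            + "-._~"          // unreserved
            + ":/?#[]@"       // gen-delims
            + "!$&'()*+,;=";  // sub-delims

    /** Percent-encodes characters RFC 3986 does not allow; existing %XX escapes are kept. */
    static String encodeIllegal(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '%' && isHex(input, i + 1) && isHex(input, i + 2)) {
                out.append(input, i, i + 3);      // assumed to be an existing escape
                i += 3;
            } else if (ALLOWED.indexOf(c) >= 0) {
                out.append(c);                    // legal as-is
                i++;
            } else {
                int cp = input.codePointAt(i);    // handles surrogate pairs
                byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
                i += Character.charCount(cp);
            }
        }
        return out.toString();
    }

    private static boolean isHex(String s, int index) {
        if (index >= s.length()) {
            return false;
        }
        char c = s.charAt(index);
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    public static void main(String[] args) {
        System.out.println(encodeIllegal("https://unpkg.com/primeflex@^3/primeflex.css"));
        // prints https://unpkg.com/primeflex@%5E3/primeflex.css
        System.out.println(encodeIllegal("https://test.invalid/path%20with%20spaces"));
        // prints https://test.invalid/path%20with%20spaces (unchanged)
    }
}
```

The sketch does not look at URL structure at all, so it would also leave a stray "#" or "?" inside a path untouched; a real translator would need to parse the components first.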
Looks like there might be a few library options that could handle this (e.g. OkHttp HttpUrl)?
@matthiasblaesing yes, this bit is great! :grimacing:
The term "URL" in this specification is used in a manner distinct from the precise technical meaning it is given in RFC 3986. Readers familiar with that RFC will find it easier to read this specification if they pretend the term "URL" as used herein is really called something else altogether. This is a willful violation of RFC 3986.
https://github.com/smola/galimatias might be an alternative. We already use it in the context of the httpparser/validator.
Apache NetBeans version
Apache NetBeans 18
What happened
Remote CSS is not shown if its URL contains a ^
How to reproduce
Add this to an HTML page:
Did this work correctly in an earlier version?
No / Don't know
Operating System
Windows 10 version 10.0 running on amd64; Cp1252; it_IT (nb)
JDK
11.0.17; OpenJDK 64-Bit Server VM 11.0.17+8
Apache NetBeans packaging
Apache NetBeans binary zip
Anything else
stack trace
Are you willing to submit a pull request?
No