phetsims / query-string-machine

Query String Machine is a query string parser that supports type coercion, default values & validation. No dependencies.
MIT License
3 stars 3 forks source link

Improve support for multiple question marks in query string #33

Closed zepumph closed 4 years ago

zepumph commented 5 years ago

In PhET-iO we have been using multiple question marks in our query strings. This seems to be legal, as the URI should only care about the first one (https://stackoverflow.com/questions/2924160/is-it-valid-to-have-more-than-one-question-mark-in-a-url).

QueryStringMachine doesn't handle it quite as well. See here from the console when parsing from string:

QueryStringMachine.getForString( 'url', {type: 'string'}, '?url=../login/login.html?
screens=2.3&wrapper=record&validationRule=validateDigits')
-->"../login/login.html?screens"

QueryStringMachine.getForString( 'screens', {type: 'string'}, '?url=../login/login.html?
screens=2.3&wrapper=record&validationRule=validateDigits')
-->QueryStringMachine.js:562 Error for query parameter "screens": missing required query parameter: 
screens

QueryStringMachine.getForString( 'wrapper', {type: 'string'}, '?url=../login/login.html?
screens=2.3&wrapper=record&validationRule=validateDigits')
-->"record"

QueryStringMachine.getForString( 'validationRule', {type: 'string'}, '?url=../login/login.html?
screens=2.3&wrapper=record&validationRule=validateDigits')
--> "validateDigits"

QueryStringMachine.getForString( 'wrapper', {type: 'string'}, '?url=../login/login.html?
screens=2.3&wrapper=????fdsa=4&validationRule=validateDigits')
--> "????fdsa"

QueryStringMachine.getForString( 'wrapper', {type: 'string'}, '?wrapper=????fdsa=4')
--> "????fdsa"

I would expect that getting url would return url=../login/login.html? screens=2.3 instead of just the screens word.

zepumph commented 5 years ago

I'm not sure who to assign this to or who knows the code best. @samreid what do you think.

samreid commented 5 years ago

After the above commit, here are the new results for the above expressions:

QueryStringMachine.getForString( 'url', {type: 'string'}, '?url=../login/login.html?screens=2.3&wrapper=record&validationRule=validateDigits')
"../login/login.html?screens=2.3"

QueryStringMachine.getForString( 'screens', {type: 'string'}, '?url=../login/login.html?screens=2.3&wrapper=record&validationRule=validateDigits')
QueryStringMachine.js:564 Error for query parameter "screens": missing required query parameter: screens
QueryStringMachine.js:565 Uncaught Error: Assertion failed: missing required query parameter: screens
    at queryStringMachineAssert (QueryStringMachine.js:565)
    at parseValues (QueryStringMachine.js:433)
    at Object.getForString (QueryStringMachine.js:94)
    at <anonymous>:1:20
queryStringMachineAssert @ QueryStringMachine.js:565
parseValues @ QueryStringMachine.js:433
getForString @ QueryStringMachine.js:94
(anonymous) @ VM2461:1

QueryStringMachine.getForString( 'wrapper', {type: 'string'}, '?url=../login/login.html?screens=2.3&wrapper=record&validationRule=validateDigits')
"record"

QueryStringMachine.getForString( 'validationRule', {type: 'string'}, '?url=../login/login.html?screens=2.3&wrapper=record&validationRule=validateDigits')
"validateDigits"

QueryStringMachine.getForString( 'wrapper', {type: 'string'}, '?url=../login/login.html?screens=2.3&wrapper=????fdsa=4&validationRule=validateDigits')
"????fdsa=4"

QueryStringMachine.getForString( 'wrapper', {type: 'string'}, '?wrapper=????fdsa=4')
"????fdsa=4"

Note that the first case is fixed but the latter cases are the same. I'm not sure how to tackle those latter cases--it seems Query String Machine would have to know to treat the tail as a separate URL, right? Not 100% sure what the desired behavior is here. @zepumph can you please comment?

zepumph commented 5 years ago

I would say that everything you posted looks correct to me. In the first query string, only the screens parameter belongs nested in the "url" query parameter. If later parameters are part of the nested parameter, then the ampersands need to be escaped.

I think that for quite some time our wrapper query parameter handling has been haphazard. It would be good to be consistently encoding URLs if we use them as values for query parameters. Overall I think that's the right direction to take. That will make development more difficult because you lose the ability to paste in a url as a parameter (I think that's useful sometimes?). It shouldn't matter for the diff wrapper though.

Does that sound alright to you?

samreid commented 5 years ago

Maybe the proper way for us to support URLs in query parameters is to invent a new syntax that indicates the begin/end of URL params. For example:

http://localhost/page.html?audio=mute&ea&url=<TERM>http://localhost/test.html&ea</TERM>&otherTopLevelParam

The proposal is a nesting syntax that would describe which query parameters go with which ones, and would support arbitrarily deep nesting . This would give us full control over putting query parameters with top-level or nested terms.

To clarify: I'm not (yet?) strongly advocating for something like that, and that syntax wouldn't be final, but I just wanted to brainstorm an idea that supports nesting. Thoughts?

zepumph commented 4 years ago

@samreid I think that the way to fix this is to URL encode a URL is we want it to be in a URL. That is most idiomatic, flexible, and long term.

samreid commented 4 years ago

I tested using URLSearchParams for this, see #36, and it did not solve the problem:

var paramsString = "q=URLUtils.searchParams&topic=api&url=http://www.colorado.edu/sims?simName=energySim&simTime=severalMinutes";
var searchParams = new URLSearchParams(paramsString);

//Iterate the search parameters.
for (let p of searchParams) {
  console.log(p);
}
["q", "URLUtils.searchParams"]
["topic", "api"]
["url", "http://www.colorado.edu/sims?simName=energySim"]
["simTime", "severalMinutes"]

Encoding URLs sounds best. Does that mean nothing would be left to implement in Query String Machine and this issue can be closed?

zepumph commented 4 years ago

Yes I think that is correct, though some wrapper code may need to be updated for this the next time we are doing record/playback studies.