jshttp / negotiator

An HTTP content negotiator for Node.js
MIT License

[Bug] parseCharset(str, i) changes value of i #55

Closed AnyhowStep closed 5 years ago

AnyhowStep commented 5 years ago

I'm curious about parseCharset(str, i) and this i parameter. https://github.com/jshttp/negotiator/blob/master/lib/charset.js#L53


I see we have var i=0 and i++ here, https://github.com/jshttp/negotiator/blob/master/lib/charset.js#L61

And at the very end, we have, https://github.com/jshttp/negotiator/blob/master/lib/charset.js#L73

return {
  charset: charset,
  q: q,
  i: i,
};

It seems to me that the i at the return statement can have a different value from the i passed in as the parameter. That happens because var i = 0 refers to the same function-scoped variable as the parameter, so the assignment and i++ overwrite it.

For example,

function parseCharset (str , i) {
    for (var i = 0; i < 4; i ++) {
        console.log(i);
    }

    return {
        i: i
    };
}
//The output is {i: 4}, not {i: 99}
parseCharset("", 99)

If we replace var i with let i, we get {i: 99}
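
To make the comparison concrete, here is a minimal side-by-side sketch. The function names withVar and withLet are made up for illustration and are not part of negotiator; only the var/let difference matters.

```javascript
// Hypothetical helpers, not library code.
function withVar(str, i) {
    // `var i` redeclares the parameter in function scope,
    // so the loop overwrites the value that was passed in.
    for (var i = 0; i < 4; i++) {}
    return { i: i };
}

function withLet(str, i) {
    // `let i` is scoped to the loop only; the parameter is
    // shadowed inside the loop and untouched outside it.
    for (let i = 0; i < 4; i++) {}
    return { i: i };
}

console.log(withVar("", 99)); // { i: 4 }
console.log(withLet("", 99)); // { i: 99 }
```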


Right now, I'm seeing i's value change between the start and the end of the function. Is this intentional? Or is this a mistake?

If it's intentional, my question is, what does it do, exactly? The code's a bit hard for me to read so I can't quite get the intent of it just yet.

AnyhowStep commented 5 years ago

Okay, after reading the code some more, I think I get what's going on.

parseCharset(str, i) takes a str of the following formats,

utf-8
iso-8859-1;q=0.5
*;q=0.1

The interesting bit is when we have the "quality value" specified (q=<number>). https://github.com/jshttp/negotiator/blob/master/lib/charset.js#L60

        //Replaced `var` with `const` to show that those values do not change
        const params = match[2].split(';')
        //This `var i` is problematic
        //It is only ever used as a loop index to access `params[i]`
        //However, we accidentally overwrite *parameter* `i`, this is very likely unintentional
        for (var i = 0; i < params.length; i ++) {
            const p = params[i].trim().split('=');
            if (p[0] === 'q') {
                q = parseFloat(p[1]);
                break;
            }
        }

So, the i value being changed seems like a bug.
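
One possible way to avoid the overwrite is to give the loop its own index. The sketch below is an assumption about what a fix could look like, not the commit that was actually applied. The name parseCharsetFixed and the charsetPattern regular expression are hypothetical stand-ins, and the real library's pattern may differ.

```javascript
// Assumed pattern: "<charset>" optionally followed by ";<params>".
// This is a stand-in, not necessarily the library's own expression.
var charsetPattern = /^\s*([^\s;]+)\s*(?:;(.*))?$/;

// Hypothetical fixed variant of parseCharset(str, i).
function parseCharsetFixed(str, i) {
    var match = charsetPattern.exec(str);
    if (!match) return null;

    var charset = match[1];
    var q = 1;
    if (match[2]) {
        var params = match[2].split(';');
        // `j` is a separate loop index, so the parameter `i` is left alone.
        for (var j = 0; j < params.length; j++) {
            var p = params[j].trim().split('=');
            if (p[0] === 'q') {
                q = parseFloat(p[1]);
                break;
            }
        }
    }

    return { charset: charset, q: q, i: i };
}

console.log(parseCharsetFixed("first;;q=1", 0)); // { charset: 'first', q: 1, i: 0 }
console.log(parseCharsetFixed("second;q=1", 1)); // { charset: 'second', q: 1, i: 1 }
```

With a separate index, the returned i always equals the position that the caller passed in, regardless of how many parameters come before q.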

AnyhowStep commented 5 years ago

With the above bug, it's possible for me to craft a header that breaks preferredCharsets(),

//Notice the double semi-colon, "first;;q"
preferredCharsets("first;;q=1,second;q=1,third")

Expected,

["first", "second", "third"]

Actual,

["second", "first", "third"]

Of course, in practice, I don't think anyone would craft such a header.


There are other headers that end up relying on Array.sort() being stable (this was not guaranteed by the specification before ES2019!),

parseAcceptCharset("first;q=0.1,second;q=0.1,third")
[
  {
    "charset": "first",
    "q": 0.1,
    "i": 0
  },
  {
    "charset": "second",
    "q": 0.1,
    "i": 0
  },
  {
    "charset": "third",
    "q": 1,
    "i": 2
  }
]

If we were to run sort() with compareSpecs() on the array, it's possible for "first" and "second" to be switched around because compareSpecs() would return 0.

However, since Array.sort() seems to be stable in most environments, we don't notice the bug as much.

https://stackoverflow.com/questions/3026281/what-is-the-stability-of-the-array-sort-method-in-different-browsers
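
The tie described above can be sketched as follows. The comparator here is a simplified, hypothetical stand-in for compareSpecs() that only looks at q and i. It assumes each spec carries a distinct i matching its position in the header, which is what the i property would hold if the loop did not overwrite it.

```javascript
// Specs as they might look when `i` holds the header position.
// The array is deliberately shuffled so the sort has work to do.
var specs = [
    { charset: "third", q: 1, i: 2 },
    { charset: "second", q: 0.1, i: 1 },
    { charset: "first", q: 0.1, i: 0 }
];

// Hypothetical comparator: higher q first, then original position.
// With distinct `i` values it never returns 0 for two different specs,
// so the result does not depend on the engine's sort being stable.
function compareByQThenIndex(a, b) {
    return (b.q - a.q) || (a.i - b.i);
}

var ordered = specs
    .slice()
    .sort(compareByQThenIndex)
    .map(function (spec) { return spec.charset; });

console.log(ordered); // [ 'third', 'first', 'second' ]
```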

dougwilson commented 5 years ago

Thanks for the report. The first bug you found is also what causes the sort instability. The i property is what keeps the sort stable, as long as it is not altered in that for loop. I have a fix.

If I'm misunderstanding you on the sort issue, though, please take a look at the commit that closed this issue. If you don't think it fixes the sort issue and can provide some reproduction steps based on the current master branch, I can get a fix for that too 👍

AnyhowStep commented 5 years ago

Just tested and your commit fixes everything. Thank you for handling this so quickly!