cloudflare / har-sanitizer

https://har-sanitizer.pages.dev/
Apache License 2.0

HAR produced after sanitization is no longer valid json #22

Closed ruairica closed 2 months ago

ruairica commented 11 months ago

This results in an error like SyntaxError: Expected ',' or ']' after array element in JSON at position 676059 (line 8188 column 9) when I try to download.

Here is a snippet of a request from my file that reproduces the problem. I've already replaced some of the values manually so I could post it here, and I've stringified it for testing.

    let snippet = JSON.stringify(
        {
            request: {
                method: "POST",
                url: "https://www.example.com",
                queryString: [],
                cookies: [
                    {
                        name: "TS0143f862",
                        value:
                            "12345678900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
                        path: "/",
                        domain: "www.example.net",
                        expires: "1969-12-31T23:59:59.000Z",
                        httpOnly: false,
                        secure: false,
                    },
                    {
                        name: "TS01ae66c9",
                        value:
                            "12345678900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
                        path: "/",
                        domain: "www.example.net",
                        expires: "1969-12-31T23:59:59.000Z",
                        httpOnly: false,
                        secure: false,
                    },
                ],
                headersSize: 1862,
                bodySize: 86,
                postData: {
                    mimeType: "application/json",
                    text: '{"pageIndex":0,"pageSize":10,"sortByColumn":"Date","sortByColumnDesc":true,"filter":1}',
                },
            },
        },
        null,
        2,
    );

From my original file I've selected just cookie TS0143f862 to be sanitized.

I've narrowed the issue down to wordSpecificScrubList[1] (from const wordSpecificScrubList = wordList.map((word) => buildRegex(word));), which for my original file was:

    {
        regex: /("name": "TS0143f862",[\s\w+:"-\%!*()`~'.,#]*?"value": ")([\w+-_:&\+=#~/$()\.\,\*\!|%"'\s;{}]+?)("[\s]+){1}/g,
        replacement: "$1[TS0143f862 redacted]$3",
    }

Calling .replace(regex, replacement) on the provided snippet outputs the string below. Note how the cookies array is no longer closed.

      "request": {
        "method": "POST",
        "url": "https://www.example.com",
        "queryString": [],
        "cookies": [
          {
            "name": "TS0143f862",
            "value": "[TS0143f862 redacted]"
        }
      }
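
For anyone who wants to reproduce this without a full HAR file, here is a minimal sketch. It uses a trimmed version of the snippet above (shortened cookie values and fewer fields are my simplification), the regex and replacement copied verbatim from wordSpecificScrubList[1], and a hypothetical isValidJson helper that is not part of har-sanitizer. As far as I can tell, the second capture group allows `"` and only stops at a quote followed by whitespace. Since the cookie value is followed by a comma, the lazy match keeps running until the end of postData.text, the first string that is the last property of its object.

```javascript
// Trimmed version of the snippet from this issue. Each cookie ends with a
// non-string property (secure: false), as in the original HAR entry.
const snippet = JSON.stringify(
    {
        request: {
            cookies: [
                { name: "TS0143f862", value: "1234567890", path: "/", secure: false },
                { name: "TS01ae66c9", value: "1234567890", path: "/", secure: false },
            ],
            headersSize: 1862,
            postData: {
                mimeType: "application/json",
                text: '{"pageIndex":0}',
            },
        },
    },
    null,
    2,
);

// Copied verbatim from wordSpecificScrubList[1] in this issue.
const regex = /("name": "TS0143f862",[\s\w+:"-\%!*()`~'.,#]*?"value": ")([\w+-_:&\+=#~/$()\.\,\*\!|%"'\s;{}]+?)("[\s]+){1}/g;
const replacement = "$1[TS0143f862 redacted]$3";

const sanitized = snippet.replace(regex, replacement);

// Hypothetical helper (not part of har-sanitizer): true if text parses as JSON.
function isValidJson(text) {
    try {
        JSON.parse(text);
        return true;
    } catch (err) {
        return false;
    }
}

console.log(isValidJson(snippet)); // true
console.log(isValidJson(sanitized)); // false
console.log(sanitized); // the second cookie and the closing "]" are gone
```

The second cookie, headersSize, and postData are all swallowed by the match, which is why the closing bracket of the cookies array disappears.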