tweaselORG / TrackHAR

Library for detecting tracking data transmissions from traffic in HAR format.
Creative Commons Zero v1.0 Universal

`google/app-measurement`: Can't decode Protobuf #10

Closed by baltpeter 1 year ago

baltpeter commented 1 year ago

While working on https://github.com/tweaselORG/tracker-wiki/issues/3, I noticed that the google/app-measurement adapter fails to decode the Protobuf body.

Error:

/home/benni/tmp/abc/node_modules/trackhar/dist/index.js:3333
        else throw new Error("Protobuf input must be a byteArray or Uint8Array");
                   ^

Error: Protobuf input must be a byteArray or Uint8Array
    at $af1843b68b6e5fd0$export$2c626d165d0efff4 (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:3333:20)
    at Function.mergeDecodes (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:3447:20)
    at Function.decode (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:3405:21)
    at Object.decodeProtobuf (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:4001:77)
    at $149c1bd638913645$var$decodeRequest (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:4076:76)
    at $149c1bd638913645$export$c49bcb71aff21fdc (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:4085:28)
    at Array.map (<anonymous>)
    at $149c1bd638913645$export$e54fe5b0f43758f7 (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:4104:87)
    at <anonymous> (/home/benni/tmp/noyb-demo/data.ts:7:24)

The problem is that input is always a string, whereas the Protobuf decoder expects a byteArray or Uint8Array.

baltpeter commented 1 year ago

My first idea was to wrap the input in Buffer.from(input) to get a buffer. With that change, decoding fails with a different error:

/home/benni/tmp/abc/node_modules/trackhar/dist/index.js:3605
        if (this.offset > this.LENGTH) throw new Error("Exhausted Buffer");
                                             ^

Error: Exhausted Buffer
    at $af1843b68b6e5fd0$export$2c626d165d0efff4._parse (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:3605:46)
    at Function.mergeDecodes (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:3448:28)
    at Function.decode (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:3405:21)
    at Object.decodeProtobuf (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:3996:77)
    at $149c1bd638913645$var$decodeRequest (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:4076:76)
    at $149c1bd638913645$export$c49bcb71aff21fdc (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:4085:28)
    at Array.map (<anonymous>)
    at $149c1bd638913645$export$e54fe5b0f43758f7 (/home/benni/tmp/noyb-demo/node_modules/trackhar/dist/index.js:4104:87)
    at <anonymous> (/home/benni/tmp/noyb-demo/data.ts:7:24)

Node.js v18.15.0

baltpeter commented 1 year ago

https://stackoverflow.com/a/45722000/3211062 has the answer: The problem is that our string has byte sequences that are not valid UTF-8.

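Buffer.from(string) defaults to UTF-8 as the encoding, so every character above 0x7F is written as a multi-byte sequence and the resulting buffer no longer matches the original bytes.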

Instead of the solution proposed in the answer, we can simply use Buffer.from(input, 'binary'), which maps each character of the string to exactly one byte.
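
To illustrate the difference, here is a minimal sketch. It assumes that the request body arrives as a "binary string" with one character per byte. The helper name binaryStringToBytes and the sample payload are made up for this example and are not taken from TrackHAR's code. In Node.js, 'binary' is an alias for 'latin1'.

```typescript
// Hypothetical helper (not TrackHAR's actual code): convert a "binary string"
// with one character per byte into the bytes a Protobuf decoder expects.
const binaryStringToBytes = (input: string): Uint8Array =>
    // 'binary' is an alias for 'latin1': each code unit 0x00-0xFF becomes one byte.
    new Uint8Array(Buffer.from(input, 'binary'));

// Made-up sample payload: Protobuf field 1 (varint) with value 150.
const original = Uint8Array.from([0x08, 0x96, 0x01]);
const asString = String.fromCharCode(...original);

// Default encoding is 'utf8', which expands 0x96 into the two bytes 0xC2 0x96.
const viaUtf8 = Buffer.from(asString);
// With 'binary', the bytes round-trip unchanged.
const viaBinary = binaryStringToBytes(asString);

console.log(viaUtf8.length); // 4
console.log(viaBinary.length); // 3
```

With the UTF-8 default, the buffer contains extra bytes, so varints and length prefixes no longer line up. That would plausibly explain why the decoder runs past the end of its input and reports "Exhausted Buffer".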