getsentry / rrweb

record and replay the web
https://www.rrweb.io/

Mutations are capturing file uploads #80

Closed billyvg closed 1 year ago

billyvg commented 1 year ago

Mutations are capturing file uploads, which causes the replay to become very large. There's no reason to capture this data.

Example snippet:

{
  "type": 3,
  "data": {
    "source": 5,
    "text": "C:\\fakepath\\foo.zip",
    "isChecked": false,
    "id": 10513
  },
  "timestamp": 1674760126020
},
{
  "type": 3,
  "data": {
    "source": 5,
    "text": "",
    "isChecked": false,
    "id": 10513
  },
  "timestamp": 1674760126047
},
{
  "type": 3,
  "data": {
    "source": 0,
    "texts": [],
    "attributes": [
      {
        "id": 10517,
        "attributes": {
          "src": "data:application/x-zip-compressed;base64,UEsDBBQACAgIAMJYOlYAAAAAAAAAAAAAAAATABwAMjAyMzAxMjZfMTQwNDM4LmpwZ3VwGAAB3n1+lzIwMjMwMTI2XzE0MDQzOC5qcGcAAID/f//Y/+EC3EV4aWYAAE1NACoAAAAIAA0BAAADAAAAAQ/AAAABAQADAAAAAQvQAAABDwACAAAACAAAAKoBEAACAAAACQA...
mydea commented 1 year ago

Hmm, would it be enough to make sure we ignore input[type="file"] by default, via the ignore option? 🤔
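
To illustrate the idea on already-recorded events (rather than at record time, where the ignore option would act), here is a minimal sketch. It assumes the event shape from the snippet above, where type 3 is an incremental snapshot and source 5 is an input change. The helper name `dropFileInputEvents` and the `fileInputIds` set are hypothetical and not part of rrweb.

```javascript
// Event shapes follow the snippet in this issue:
// type 3 = incremental snapshot, source 5 = input change.
const INCREMENTAL_SNAPSHOT = 3;
const INPUT_SOURCE = 5;

// Hypothetical helper: drop input events that target known file-input nodes.
// `fileInputIds` is assumed to be a Set of node ids collected elsewhere.
function dropFileInputEvents(events, fileInputIds) {
  return events.filter((event) => {
    const isInput =
      event.type === INCREMENTAL_SNAPSHOT &&
      Boolean(event.data) &&
      event.data.source === INPUT_SOURCE;
    // Keep everything except input events on the listed nodes.
    return !(isInput && fileInputIds.has(event.data.id));
  });
}
```

Called as `dropFileInputEvents(events, new Set([10513]))`, this would remove both input events from the example snippet while leaving the mutation event untouched, which is why it only covers the first half of the problem.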

mydea commented 1 year ago

Actually, I guess this is two issues:

  1. Do not capture the file input changes themselves
  2. In the screen above, I guess this does something like the following:
const fileInput = document.querySelector('#file-input');
fileInput.addEventListener('change', (event) => {
  const previewImage = document.querySelector('#preview-image');
  // convertEventToBase64Img stands in for whatever the page uses to build a data: URL
  previewImage.src = convertEventToBase64Img(event);
});

Ignoring the file input would not really prevent this, since it basically amounts to inlining the image.

I guess to "fix" this we should replace base64 strings with placeholders? E.g.

// extend this function (signature simplified here for illustration)
function transformAttribute(value) {
  if (/^data:(.*);base64,(.*)$/.test(value)) {
    // replace with placeholder base64?
  }
}

Does that make sense?

billyvg commented 1 year ago

Yeah I think it makes sense to strip inlined base64 content from attributes.
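
As a rough sketch of what such stripping could look like, the code below replaces the payload of a base64 data URL while keeping its media type, and applies that to the attributes of a mutation event shaped like the example at the top of this issue (type 3, source 0). The function names and the placeholder format (an empty payload) are assumptions for illustration, not rrweb behavior.

```javascript
// Matches the prefix of a base64 data URL and captures the media type,
// e.g. "application/x-zip-compressed" or "text/plain;charset=utf-8".
const BASE64_DATA_URL = /^data:([^,]*);base64,/;

// Hypothetical helper: keep the media type, drop the base64 payload.
function stripBase64(value) {
  if (typeof value !== 'string') return value;
  const match = BASE64_DATA_URL.exec(value);
  return match ? `data:${match[1]};base64,` : value;
}

// Hypothetical helper: apply stripBase64 to every attribute value of a
// mutation event (type 3, source 0). Other events are returned unchanged.
function stripBase64FromMutation(event) {
  if (event.type !== 3 || !event.data || event.data.source !== 0) return event;
  const attributes = (event.data.attributes || []).map((entry) => ({
    ...entry,
    attributes: Object.fromEntries(
      Object.entries(entry.attributes || {}).map(([name, value]) => [
        name,
        stripBase64(value),
      ])
    ),
  }));
  // Return a copy so the recorded event is not mutated in place.
  return { ...event, data: { ...event.data, attributes } };
}
```

An empty payload keeps the event small but leaves a broken image in the replay; swapping in a tiny fixed placeholder image would be one alternative.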

billyvg commented 1 year ago

Update: We've decided not to act on any additional inlined base64 content for now and continue monitoring our segment sizes.