Touffy / client-zip

A client-side streaming ZIP generator
MIT License
353 stars 23 forks

[BUG] testing with string input leads to uncompressed zip files #55

Closed. rjwalters closed this issue 1 year ago

rjwalters commented 1 year ago

I am getting valid zip files, but the data is not compressed.

This is my first attempt to use a web worker, and it has been much harder than I expected. I haven't managed to set up CORS on S3 yet so that I can bundle files stored there, but in the meantime I've been trying to get everything else in place.

I'm using SvelteKit, and talking to the worker within the onMount() lifecycle function so that the browser is available.

onMount(async () => {
  await navigator.serviceWorker.register(
    `${$page.url.protocol}//${$page.url.host}/clientZipWorker.js`
  );
});

async function downloadAll(_e: Event) {
  let zipFileName = `bundle.${content.id}.zip`;
  let dateNow = new Date();
  const data = content.files.map((f) => {
    return {
      name: f.name,
      updated_at: f.updated_at,
      file_url: f.file_url,
      size_bytes: f.size_bytes,
    };
  });
  let keepAlive = setInterval(fetch, 4000, 'clientZipWorker/keep-alive', {
    method: 'POST',
  });
  const response = await fetch(`clientZipWorker/${zipFileName}`, {
    method: 'POST',
    body: JSON.stringify(data),
  });
  if (response.ok) {
    let url = URL.createObjectURL(await response.blob());
    downloadURL(url, zipFileName);
    URL.revokeObjectURL(url);
  }
  clearInterval(keepAlive);
}

My worker looks like this:


// I have pasted all content from https://unpkg.com/client-zip@2.3.0/worker.js here
// 
//  var downloadZip=(()=>{"stream"in Blob ... etc
//

function randomString(length) {
  let result = '';
  const characters =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  const charactersLength = characters.length;
  let counter = 0;
  while (counter < length) {
    let r = characters.charAt(Math.floor(Math.random() * charactersLength));
    result = result + r + r + r; // easy to compress
    counter += 3;
  }
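  // The loop appends 3 characters at a time and can overshoot `length`;
  // trim so the data matches the size declared in the metadata.
  return result.slice(0, length);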
}

let DEBUG_FILE_LENGTH = 100000;

async function* mockActivateFiles(files) {
  for (const f of files) {
    console.log('mock input for: ', f.name);
    yield {
      name: f.name,
      lastModified: f.lastModified,
      input: randomString(DEBUG_FILE_LENGTH),
    };
  }
}

// must be sync
function* mockActivateMetadata(metadata) {
  for (const m of metadata) {
    console.log('mock metadata for: ', m.name);
    yield {
      name: m.name,
      size: DEBUG_FILE_LENGTH,
    };
  }
}

self.addEventListener('activate', (event) => {
  console.log('worker activated');
  // keep the worker in the activating state until the claim has completed
  event.waitUntil(clients.claim());
});

// respondWith() must be called synchronously, so the listener itself is not async
self.addEventListener('fetch', (event) => {
  const url = new URL(event.request.url);
  const [, name] = url.pathname.match(/\/clientZipWorker\/(.+)/i) || [,];

  if (url.origin === self.origin && name) {
    console.log('worker matched fetch event:', name);

    event.respondWith(
      event.request
        .json()
        .then((data) => {
          const files = data.map((d) => {
            return {
              name: d.name,
              lastModified: d.updated_at,
              file_url: d.file_url,
            };
          });
          const metadata = data.map((d) => {
            return {
              name: d.name,
              size: d.size_bytes,
            };
          });
          return downloadZip(mockActivateFiles(files), {
            metadata: mockActivateMetadata(metadata),
          });
        })
        .catch((err) => {
          return new Response(err.message, { status: 500 });
        })
    );
  }
});
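
Once CORS on S3 is in place, the mock generator could be swapped for one that fetches the real files. The following is only a sketch under assumptions: `activateFiles` and the injectable `fetchImpl` parameter are hypothetical names, and it relies on client-zip accepting a `Response` as `input`, which is how I read its README. Errors and retries are handled only minimally.

```javascript
// Hypothetical counterpart to mockActivateFiles: fetch each file lazily,
// one at a time, and hand the Response to client-zip as the entry input.
// `fetchImpl` is injectable so the generator can be exercised without a network.
async function* activateFiles(files, fetchImpl = fetch) {
  for (const f of files) {
    const response = await fetchImpl(f.file_url);
    if (!response.ok) {
      throw new Error(`failed to fetch ${f.name}: HTTP ${response.status}`);
    }
    const entry = { name: f.name, input: response };
    // updated_at arrives through JSON as a string, so convert it to a Date
    if (f.lastModified) {
      entry.lastModified = new Date(f.lastModified);
    }
    yield entry;
  }
}
```

In the fetch handler above, this would take the place of `mockActivateFiles(files)`, with the metadata generator yielding the real `size_bytes` values.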

rjwalters commented 1 year ago

I get a valid zip and the file sizes are as I expect. The same files compress about 3:1 with 7zip, as expected.

rjwalters commented 1 year ago

Reading through the docs again...

client-zip does not support compression

So I am closer than I thought, but also farther... I may go back to trying https://github.com/robbederks/downzip again
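
For anyone else checking whether their archive is "stored" rather than compressed, the expected size can be estimated from the fixed record sizes in the ZIP format. This is a rough sketch, not client-zip's own code: `storedZipSize` is a hypothetical helper, and it assumes every entry is written with a data descriptor, no extra fields, no comments, and no ZIP64 records.

```javascript
// Fixed record sizes from the ZIP format, in bytes, excluding file names.
const LOCAL_HEADER = 30;
const DATA_DESCRIPTOR = 16; // with signature and 32-bit sizes
const CENTRAL_HEADER = 46;
const END_OF_CENTRAL_DIRECTORY = 22;

// Estimate the size of an archive whose entries are stored without compression.
function storedZipSize(entries) {
  const encoder = new TextEncoder();
  let total = END_OF_CENTRAL_DIRECTORY;
  for (const { name, size } of entries) {
    // the file name appears twice: in the local header and in the central directory
    const nameBytes = encoder.encode(name).length;
    total += LOCAL_HEADER + nameBytes + size + DATA_DESCRIPTOR;
    total += CENTRAL_HEADER + nameBytes;
  }
  return total;
}

const entries = [
  { name: 'a.txt', size: 100000 },
  { name: 'b.txt', size: 100000 },
];
console.log(storedZipSize(entries)); // 200226
```

If the downloaded file is within a few hundred bytes of the summed input sizes, the entries were almost certainly stored as-is.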

Touffy commented 1 year ago

Hey Robb. Yep, no compression (at least in the near future). But if you actually need compression and you can explain why, that would inform my decision to eventually implement compression in client-zip. The native CompressionStream has already landed in Chromium, so that feature may be relatively cheap to support next year when Firefox and Safari have caught up.
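
To give a feel for the native API mentioned above, here is a small sketch that measures how well some data compresses with CompressionStream. It is not part of client-zip. `gzipSize` and `tripledString` are hypothetical helpers, and the `'gzip'` format is used only to measure the ratio; ZIP entries would need raw DEFLATE, which the Compression Streams spec exposes as `'deflate-raw'` where supported.

```javascript
// Compress a string with the native CompressionStream and
// return the number of compressed bytes.
async function gzipSize(text) {
  const compressed = new Blob([text])
    .stream()
    .pipeThrough(new CompressionStream('gzip'));
  const buffer = await new Response(compressed).arrayBuffer();
  return buffer.byteLength;
}

// Test data in the spirit of randomString from the worker:
// every random character is repeated three times.
function tripledString(length) {
  const characters =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  let result = '';
  while (result.length < length) {
    const c = characters.charAt(Math.floor(Math.random() * characters.length));
    result += c + c + c;
  }
  return result.slice(0, length);
}

async function main() {
  const sample = tripledString(30000);
  const size = await gzipSize(sample);
  console.log(`original: ${sample.length} bytes, gzip: ${size} bytes`);
}

main();
```

The exact ratio depends on the random data, so no expected output is shown here.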

rjwalters commented 1 year ago

I thought about this more overnight and I've concluded that you are probably correct about compression.

It's obvious in retrospect, but I wasn't really thinking clearly about how this works as a client-side library: the data has to move to the client before compression can start, so there is no bandwidth advantage.

The only use case I can construct for compressing while downloading is for a user who wants to archive the files as a bundle without accessing any of them right away. In this case we might save our user the extra step of recompressing the bundle if storage is more valuable than CPU or if running a compression program is a hassle. I can imagine this being the case on mobile devices.

A second use case might be to reduce customer complaints from people who make the same mistake as me and report a bug about my zip compression being broken. ;)

Ultimately, I think the right answer for mobile customers is for me to move file bundling to the server side. I will end up using the full bandwidth to stream the files off S3 through whatever server does the compression work, but the user will get all the benefits: reduced bandwidth, reduced storage size, low CPU usage, and a zip file that "looks right".