greggman / unzipit

Random access unzip library for JavaScript
https://greggman.github.io/unzipit

Read zip inside of a zip #36

Open mmamedel opened 1 month ago

mmamedel commented 1 month ago

Any recommendation on how I could read a zip inside another zip without extracting more files than needed? My situation is something like:

Thank you

mmamedel commented 1 month ago

Oh just saw issue #7. Sorry for the duplication.

mmamedel commented 1 month ago

You know, my case is exactly what is described in #7: everything is uncompressed. Would you have any recommendation for using an uncompressed zip entry as the source for a new unzipit call?

mmamedel commented 1 month ago

With some minor tweaks I was able to write a modification of the HTTPRangeReader:

export class HTTPRangeEntryReader {
  private length: number;
  private entryReader: HTTPRangeReader;
  private url: string;
  constructor(private entry: ZipEntry) {
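    // A stored (uncompressed) entry has equal compressed and uncompressed
    // sizes, so its bytes can be read straight out of the outer zip.
    if (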
      this.entry._rawEntry.compressedSize !==
      this.entry._rawEntry.uncompressedSize
    ) {
      throw new Error(
        `Entry "${this.entry.name}" is compressed, cannot read it directly`
      );
    }
    this.length = this.entry._rawEntry.compressedSize;
    this.entryReader = this.entry._reader;
    this.url = this.entry._reader.url;
  }
  async getLength() {
    return this.length;
  }
  async read(offset: number, size: number) {
    if (size === 0) {
      return new Uint8Array(0);
    }
    // Find where this entry's data begins inside the outer zip.
    const { fileDataStart } = await readEntryDataHeader(
      this.entryReader,
      this.entry._rawEntry
    );
    const req = await fetch(this.url, {
      headers: {
        Range: `bytes=${fileDataStart + offset}-${
          fileDataStart + offset + size - 1
        }`,
      },
    });
    if (!req.ok) {
      throw new Error(
        `failed http request ${this.url}, status: ${req.status} offset: ${offset} size: ${size}: ${req.statusText}`
      );
    }
    const buffer = await req.arrayBuffer();
    return new Uint8Array(buffer);
  }
}
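
For context, `fileDataStart` cannot be taken from the central directory alone, because each entry's data is preceded by a local file header whose name and extra fields have variable length. Below is a minimal sketch of that computation, based on the ZIP local file header layout rather than on unzipit's actual `readEntryDataHeader`. The function name `dataStartFromLocalHeader` is hypothetical.

```typescript
// Fixed-size portion of a ZIP local file header, and its signature ("PK\x03\x04").
const LOCAL_HEADER_SIZE = 30;
const LOCAL_HEADER_SIGNATURE = 0x04034b50;

// Hypothetical helper (not part of unzipit): given the first 30+ bytes of a
// local file header and that header's offset in the archive, return the
// absolute offset at which the entry's data begins.
function dataStartFromLocalHeader(header: Uint8Array, headerOffset: number): number {
  if (header.length < LOCAL_HEADER_SIZE) {
    throw new Error('local file header is truncated');
  }
  const view = new DataView(header.buffer, header.byteOffset, header.byteLength);
  if (view.getUint32(0, true) !== LOCAL_HEADER_SIGNATURE) {
    throw new Error('bad local file header signature');
  }
  // Little-endian 16-bit lengths of the variable-size fields.
  const nameLength = view.getUint16(26, true);
  const extraLength = view.getUint16(28, true);
  return headerOffset + LOCAL_HEADER_SIZE + nameLength + extraLength;
}
```

Since this value never changes for a given entry, a reader like the one above could compute it once and reuse it rather than re-reading the header on every `read` call.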

With this, I could read my nested zip entries:

  const reader = new HTTPRangeReader(url);
  const zipInfo = await unzip(reader);

  const reader2 = new HTTPRangeEntryReader(zipInfo.entries['innerFile.zip']);
  const zipInfo2 = await unzip(reader2);
  console.log(zipInfo2.entries);

Is this something that could be added to the library? Thank you

mmamedel commented 1 month ago

With a bit more work, HTTPRangeReader could support both use cases.
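
An alternative that leaves HTTPRangeReader untouched might be a small wrapper that exposes a byte window of any reader. This is only a sketch: it assumes the custom reader shape from the unzipit README (an object with async `getLength()` and `read(offset, size)`), and `SubrangeReader` and `BytesReader` are hypothetical names, not library classes.

```typescript
// Reader shape assumed from unzipit's custom reader documentation.
interface Reader {
  getLength(): Promise<number>;
  read(offset: number, size: number): Promise<Uint8Array>;
}

// Hypothetical wrapper: presents bytes [start, start + length) of `inner`
// as if they were a standalone file.
class SubrangeReader implements Reader {
  private inner: Reader;
  private start: number;
  private length: number;

  constructor(inner: Reader, start: number, length: number) {
    this.inner = inner;
    this.start = start;
    this.length = length;
  }

  async getLength(): Promise<number> {
    return this.length;
  }

  async read(offset: number, size: number): Promise<Uint8Array> {
    if (offset < 0 || size < 0 || offset + size > this.length) {
      throw new RangeError(`read of ${size} bytes at ${offset} is outside the window`);
    }
    if (size === 0) {
      return new Uint8Array(0);
    }
    // Translate the window-relative offset into the outer reader's space.
    return this.inner.read(this.start + offset, size);
  }
}

// Tiny in-memory reader, used here only to demonstrate the wrapper.
class BytesReader implements Reader {
  private bytes: Uint8Array;

  constructor(bytes: Uint8Array) {
    this.bytes = bytes;
  }

  async getLength(): Promise<number> {
    return this.bytes.length;
  }

  async read(offset: number, size: number): Promise<Uint8Array> {
    return this.bytes.slice(offset, offset + size);
  }
}
```

With something like this, the nested case could presumably be written as `new SubrangeReader(new HTTPRangeReader(url), fileDataStart, entry.size)`, and the same wrapper would work for Blob or ArrayBuffer sources. Obtaining `fileDataStart` for a stored entry would still need support from the library.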