ipld / js-car

Content Addressable aRchive format reader and writer for JavaScript

Proposal: New API to create a writer with unknown root #68

Closed Gozala closed 2 years ago

Gozala commented 2 years ago

In many cases we want to frame sets of blocks into CAR files and, once a CAR reaches a certain size, write a root block and end the frame. This use case is not supported by the current interface, because you need to know the root ahead of time, which we do not.
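To make the use case concrete, here is a minimal sketch of size-based framing. It is only an illustration under stated assumptions: `Block`, `Frame`, `makeRoot` and `frameBlocks` are hypothetical names, a plain string stands in for a CID, and the root is a placeholder rather than a real encoded block. The point it shows is that the root can only be computed after the frame's members are known.

```typescript
// Hypothetical types: a string stands in for a CID.
interface Block { cid: string, bytes: Uint8Array }
interface Frame { root: string, blocks: Block[] }

// Hypothetical root builder. A real one would encode links to the members;
// this one only derives a label from them.
const makeRoot = (members: Block[]): Block => ({
  cid: `root(${members.map((b) => b.cid).join('+')})`,
  bytes: Uint8Array.of(members.length)
})

// Groups blocks into frames, ending a frame once it holds at least `maxBytes`.
function frameBlocks (blocks: Block[], maxBytes: number): Frame[] {
  const frames: Frame[] = []
  let current: Block[] = []
  let size = 0

  const endFrame = (): void => {
    if (current.length === 0) return
    // The root depends on every member, so it is known only at this point.
    const root = makeRoot(current)
    frames.push({ root: root.cid, blocks: [...current, root] })
    current = []
    size = 0
  }

  for (const block of blocks) {
    current.push(block)
    size += block.bytes.byteLength
    if (size >= maxBytes) endFrame()
  }
  endFrame() // flush the final, possibly smaller, frame
  return frames
}

// Three 4-byte blocks with an 8-byte threshold.
const demoBlocks: Block[] = ['a', 'b', 'c'].map((cid) => ({ cid, bytes: new Uint8Array(4) }))
const frames = frameBlocks(demoBlocks, 8)
```

Because each root is produced last, a writer that requires the roots up front cannot stream such a frame without a workaround.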

It is still possible to create a CAR writer with a fake root, buffer the output in memory, and then use updateRootsInBytes, but that is a really awkward interface.

I would like to propose an alternative CarWriter interface that would better support the use case outlined above.

import { CarWriter } from '@ipld/car'
import { CID } from 'multiformats/cid'

export declare class CarWriter2 extends CarWriter {
  /**
   * Creates a CAR writer with the given root capacity. No blocks are emitted into `out`
   * until enough roots to meet that capacity have been added.
   */
  static createWithRootCapacity(byteLength: number): { writer: CarWriter2, out: AsyncIterable<Uint8Array> }

  /**
   * Throws an error if the total root count exceeds the root capacity specified at creation.
   */
  addRoots(roots: CID[]): void

  /**
   * The promise rejects if the root capacity has not been met (not enough roots have been added).
   */
  close(): Promise<void>
}

In practice I expect we could just amend the current implementation rather than add a separate class as in the sketch above.

rvagg commented 2 years ago

So this would just buffer all the Uint8Array chunks in memory until it has the roots it needs? Seems like a foot-gun prone interface so it would need some solid documentation around it - DO NOT USE THIS IF YOU HAVE A LOT OF DATA. Maybe the interface could be more suggestive too: createBufferingUntilRootsAdded() or something explicit so the user has to type out their intention before they opt in to this.

It wouldn't really even need a byteLength argument because it's going to be holding on to a list of Uint8Arrays anyway and it just needs to create the header when it gets the roots and spit that out first. Nothing about the bytes that it buffers is impacted by the length of the header. You just get a one-shot addRoots() call to open the floodgates.

But otherwise, this seems reasonable.
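The buffering behaviour described in this comment could be sketched roughly as follows. This is an assumption-laden illustration, not the library's implementation: `BufferingUntilRootsWriter`, `Sink` and `encodeFakeHeader` are hypothetical names, roots are plain strings instead of CIDs, the header encoding is a stand-in, and output goes to a synchronous callback rather than an `AsyncIterable`.

```typescript
// Hypothetical sketch: none of these names exist in @ipld/car.
type Sink = (chunk: Uint8Array) => void

// Stand-in for real CAR header encoding; it only needs to depend on the roots.
const encodeFakeHeader = (roots: string[]): Uint8Array => {
  const text = `roots:${roots.join(',')}`
  return Uint8Array.from(Array.from(text).map((ch) => ch.charCodeAt(0)))
}

class BufferingUntilRootsWriter {
  // Holds every block until the roots arrive; null once the buffer is flushed.
  private pending: Uint8Array[] | null = []
  private readonly sink: Sink

  constructor (sink: Sink) {
    this.sink = sink
  }

  put (block: Uint8Array): void {
    if (this.pending !== null) {
      this.pending.push(block) // memory grows with every block put before addRoots()
    } else {
      this.sink(block)
    }
  }

  // One-shot: emits the header first, then everything buffered so far.
  addRoots (roots: string[]): void {
    if (this.pending === null) throw new Error('roots were already added')
    const buffered = this.pending
    this.pending = null
    this.sink(encodeFakeHeader(roots))
    for (const block of buffered) this.sink(block)
  }

  close (): void {
    if (this.pending !== null) throw new Error('cannot close before roots are added')
  }
}

const emitted: Uint8Array[] = []
const bufferingWriter = new BufferingUntilRootsWriter((chunk) => { emitted.push(chunk) })
bufferingWriter.put(Uint8Array.of(1, 2))
bufferingWriter.put(Uint8Array.of(3))
bufferingWriter.addRoots(['root-cid']) // header goes out first, then both buffered blocks
bufferingWriter.put(Uint8Array.of(4)) // written straight through from now on
bufferingWriter.close()
```

Note that the sketch needs no capacity argument: the header is built only when the roots arrive, so its length never affects the buffered bytes. The cost is that everything put before `addRoots()` stays in memory.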

Gozala commented 2 years ago

Closing this in favor of #69