web3-storage / web3.storage

DEPRECATED ⁂ The simple file storage service for IPFS & Filecoin
https://web3.storage

Can I change the default chunk size? #954

Closed akazwz closed 2 years ago

akazwz commented 2 years ago

The default chunk size is about 10MB, so the progress doesn't change for a long time. Is there any way I can change the chunk size so that the progress updates more frequently? Thanks.
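For context, progress here is driven by the onStoredChunk callback, which fires once per uploaded chunk with that chunk's size in bytes. A minimal sketch of turning those callbacks into a percentage is shown below. `makeProgressTracker` is a hypothetical helper written for illustration and is not part of the web3.storage client; the byte counts are made up.

```typescript
// Hypothetical helper (not part of the web3.storage client): builds an
// onStoredChunk-style callback that reports cumulative progress in percent.
function makeProgressTracker(
  totalBytes: number,
  report: (percent: number) => void
): (chunkSize: number) => void {
  let uploaded = 0
  return (chunkSize: number): void => {
    uploaded += chunkSize
    // CAR chunks include header and block framing overhead, so the sum of
    // chunk sizes can exceed the raw file size; clamp the result to 100.
    const percent =
      totalBytes > 0 ? Math.min(100, (uploaded / totalBytes) * 100) : 100
    report(percent)
  }
}

// Simulate three stored chunks for a 25,000,000-byte upload.
const seen: number[] = []
const onStoredChunk = makeProgressTracker(25 * 1000 * 1000, p => {
  seen.push(Math.round(p))
})
onStoredChunk(10 * 1000 * 1000)
onStoredChunk(10 * 1000 * 1000)
onStoredChunk(5500 * 1000)
console.log(seen)
```

With roughly 10MB chunks the callback only fires a handful of times for a file of this size, which is why the progress appears to stall between updates.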

olizilla commented 2 years ago

It's not possible to configure the CAR chunk size with the js library today.

https://github.com/web3-storage/web3.storage/blob/736610f395dbf6302e9ae0b9b38df2c94e35aa32/packages/client/src/lib.js#L135

A PR to make this configurable would be rad!

akazwz commented 2 years ago

How about adding maxChunkSize to PutCarOptions and PutOptions, like:

export type PutCarOptions = {
  /**
   * Human readable name for this upload, for use in file listings.
   */
  name?: string
  /**
   * Callback called after each chunk of data has been uploaded. By default,
   * data is split into chunks of around 10MB. It is passed the actual chunk
   * size in bytes.
   */
  onStoredChunk?: (size: number) => void
  /**
   * Maximum times to retry a failed upload. Default: 5
   */
  maxRetries?: number
+ /**
+ * Maximum chunk-size to upload. Default: 1024 * 1024 * 10
+ */
+ maxChunkSize?: number
  /**
   * Additional IPLD block decoders. Used to interpret the data in the CAR file
   * and split it into multiple chunks. Note these are only required if the CAR
   * file was not encoded using the default encoders: `dag-pb`, `dag-cbor` and
   * `raw`.
   */
  decoders?: BlockDecoder<any, any>[]
}

Then in static async putCar, make some changes like:

static async putCar ({ endpoint, token }, car, {
    name,
    onStoredChunk,
    maxRetries = MAX_PUT_RETRIES,
+   maxChunkSize,
    decoders,
  } = {}) {
-   const targetSize = MAX_CHUNK_SIZE
+   const targetSize = maxChunkSize ?? MAX_CHUNK_SIZE
    const url = new URL('car', endpoint)
    let headers = Web3Storage.headers(token)

    if (name) {
      headers = { ...headers, 'X-Name': encodeURIComponent(name) }
    }

    const roots = await car.getRoots()
    if (roots[0] == null) {
      throw new Error('missing root CID')
    }
    if (roots.length > 1) {
      throw new Error('too many roots')
    }

    const carRoot = roots[0].toString()
    const splitter = new TreewalkCarSplitter(car, targetSize, { decoders })

    /**
     * @param {AsyncIterable<Uint8Array>} car
     * @returns {Promise<CIDString>}
     */
    const onCarChunk = async car => {
      const carParts = []
      for await (const part of car) {
        carParts.push(part)
      }

      const carFile = new Blob(carParts, { type: 'application/car' })
      const res = await pRetry(
        async () => {
          const request = await fetch(url.toString(), {
            method: 'POST',
            headers,
            body: carFile
          })
          const res = await request.json()
          if (!request.ok) {
            throw new Error(res.message)
          }

          if (res.cid !== carRoot) {
            throw new Error(`root CID mismatch, expected: ${carRoot}, received: ${res.cid}`)
          }
          return res.cid
        },
        { retries: maxRetries }
      )

      onStoredChunk && onStoredChunk(carFile.size)
      return res
    }

    const upload = transform(MAX_CONCURRENT_UPLOADS, onCarChunk)
    for await (const _ of upload(splitter.cars())) {} // eslint-disable-line
    return carRoot
  }

And then make similar changes in other functions like static async put. Will this work?
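To illustrate the effect of the proposed fallback, here is a rough sketch that counts how often onStoredChunk would fire for a given target size. `simulateUpload` is a hypothetical stand-in for the splitter and upload loop, not the real client code: it cuts at exact byte offsets, whereas the real splitter cuts on block boundaries, so actual chunk sizes are only approximately the target.

```typescript
// Default from the thread: about 10MB per chunk.
const MAX_CHUNK_SIZE = 1024 * 1024 * 10

// Hypothetical stand-in for the splitter + upload loop. It performs no
// network I/O and only mirrors the `maxChunkSize ?? MAX_CHUNK_SIZE` fallback.
function simulateUpload(
  totalBytes: number,
  options: { maxChunkSize?: number; onStoredChunk?: (size: number) => void } = {}
): number {
  const targetSize = options.maxChunkSize ?? MAX_CHUNK_SIZE
  // `??` only falls back on null/undefined, so 0 or NaN would be used as-is;
  // reject them here to avoid a loop that never advances.
  if (!(targetSize > 0)) {
    throw new RangeError('maxChunkSize must be a positive number of bytes')
  }
  let chunks = 0
  for (let offset = 0; offset < totalBytes; offset += targetSize) {
    const size = Math.min(targetSize, totalBytes - offset)
    options.onStoredChunk?.(size)
    chunks++
  }
  return chunks
}

const total = 1024 * 1024 * 25
// Default target: chunks of 10 + 10 + 5 MiB.
const defaultChunks = simulateUpload(total)
// 1 MiB target: one callback per MiB.
const smallChunks = simulateUpload(total, { maxChunkSize: 1024 * 1024 })
console.log(defaultChunks, smallChunks)
```

A smaller target means more callbacks and smoother progress, at the cost of more HTTP requests per upload.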

olizilla commented 2 years ago

@akazwz yes that's a good start! It's a pretty safe change, there shouldn't be any surprises. Do you fancy opening a PR?

olizilla commented 2 years ago

Worth mentioning in the comment for maxChunkSize that it's in bytes, and that it's used as the targetSize passed to the carbites TreewalkCarSplitter https://github.com/nftstorage/carbites
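One possible wording for that comment is sketched below. `ChunkSizeOption` and `MiB` are hypothetical names used only for this illustration; in the proposal the field would live on PutCarOptions and PutOptions.

```typescript
/** Bytes in one mebibyte, to keep byte counts readable. */
const MiB = 1024 * 1024

// Hypothetical excerpt: only the proposed field is shown.
type ChunkSizeOption = {
  /**
   * Maximum chunk size to upload, in bytes. Used as the `targetSize` passed
   * to the carbites `TreewalkCarSplitter`. Default: 1024 * 1024 * 10
   */
  maxChunkSize?: number
}

// Example: ask for chunks of roughly 1 MiB.
const options: ChunkSizeOption = { maxChunkSize: 1 * MiB }
console.log(options.maxChunkSize)
```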

akazwz commented 2 years ago

> @akazwz yes that's a good start! It's a pretty safe change, there shouldn't be any surprises. Do you fancy opening a PR?

Sure, I've made some changes and have already opened a PR. Can you have a look when you are free?