web3-storage / web3.storage

DEPRECATED ⁂ The simple file storage service for IPFS & Filecoin
https://web3.storage

Error: missing block when running put-files.js example from getting started docs #302

Closed by jimpick 3 years ago

jimpick commented 3 years ago

I'm using the unmodified script from here:

https://docs.web3.storage/#create-the-upload-script

$ node put-files.js --token=$WEB3_STORAGE_API_TOKEN ../estuary-archive/ips-baidu
Uploading 46 files
/Users/jim/projects-jpimac/spark/miner-power/node_modules/carbites/cjs/lib/treewalk/splitter.js:68
      throw new Error(`missing block for ${ cid }`);
            ^

Error: missing block for bafybeige76t5nrfr5x46yjs4j37nibtkgxfe6v23qmd57ytwjgjrcyd32e
    at TreewalkCarSplitter._get (/Users/jim/projects-jpimac/spark/miner-power/node_modules/carbites/cjs/lib/treewalk/splitter.js:68:13)
    at async TreewalkCarSplitter._cars (/Users/jim/projects-jpimac/spark/miner-power/node_modules/carbites/cjs/lib/treewalk/splitter.js:80:19)
    at async TreewalkCarSplitter.cars (/Users/jim/projects-jpimac/spark/miner-power/node_modules/carbites/cjs/lib/treewalk/splitter.js:54:22)
    at async fillQueue (/Users/jim/projects-jpimac/spark/miner-power/node_modules/streaming-iterables/dist/index.js:706:41)

The same script did work on some other directories with files. I'll do a bit more debugging to see if I can isolate it further.

jimpick commented 3 years ago

I did some debugging. It appears that the iterable CAR file stream output by ipfs-car/pack hasn't finished writing blocks by the time TreewalkCarSplitter.fromIterable consumes it, and @ipld/car stops reading the stream.

A quick fix is to just add an await to the close() function in the ipfs-car/pack writer.

https://github.com/web3-storage/ipfs-car/pull/73

jimpick commented 3 years ago

Update: My "quick fix" didn't actually fix the problem.

I think the issue is that my data generates an IPLD block with zero bytes in a buffer. That block has a valid CID, but if it is placed in the middle of a CAR file, the @ipld/car code that reads from an iterator interprets it as the end of the stream and stops reading, even though there are subsequent blocks in the stream. I'm not sure why it's generating an IPLD block with zero bytes. I'll try to make a smaller test case. /cc @rvagg

jimpick commented 3 years ago

Here's my data. I see I have some zero-length files. I bet that's triggering it.

https://bafybeiatccvommze36x5cy44m22qz4al63riz4p6bnxw4pns6g3lbw3aky.ipfs.dweb.link/

jimpick commented 3 years ago

I created a test program and left an issue over at:

https://github.com/ipld/js-car/issues/46

It appears that when consuming a streaming iterable, @ipld/car will sometimes stop reading prematurely if the latest block has a zero-length payload.
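To illustrate the general class of bug described here, the following is a simplified sketch and not the actual @ipld/car code. It assumes a hypothetical reader that pulls chunks from an iterator and mistakes an empty chunk for end-of-stream; the names `chunks`, `readUntilEmpty`, and `readUntilDone` are invented for this example, and it uses a synchronous iterator for brevity where the real stream is asynchronous.

```javascript
// Hypothetical chunk source standing in for the CAR byte stream.
function* chunks(list) {
  for (const c of list) yield c;
}

// Buggy pattern: a zero-length chunk is treated as end-of-stream,
// so everything after it is silently dropped.
function readUntilEmpty(source) {
  const out = [];
  const it = source[Symbol.iterator]();
  while (true) {
    const { value, done } = it.next();
    if (done || value.length === 0) break; // bug: empty does not mean finished
    out.push(value);
  }
  return out;
}

// Corrected pattern: only the iterator's own `done` signal ends the loop.
function readUntilDone(source) {
  const out = [];
  for (const value of source) out.push(value);
  return out;
}

const input = [Uint8Array.of(1, 2), new Uint8Array(0), Uint8Array.of(3)];
console.log(readUntilEmpty(chunks(input)).length); // prints 1
console.log(readUntilDone(chunks(input)).length);  // prints 3
```

A later consumer that expects the dropped data would then fail in a way similar to the "missing block" error above, because the block after the empty chunk never arrives.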

rvagg commented 3 years ago

Can confirm, this will be fixed by https://github.com/ipld/js-car/pull/48

vasco-santos commented 3 years ago

This should now be fixed. @jimpick, let us know if not.

jimpick commented 3 years ago

Tested. I was able to upload my data!