web3-storage / web3.storage

DEPRECATED ⁂ The simple file storage service for IPFS & Filecoin
https://web3.storage

throw new Error('Unexpected end of data'); #840

Closed AllanOricil closed 1 year ago

AllanOricil commented 2 years ago

I'm getting this type of error quite often. What could be causing it?

image

These are the files I'm downloading. To secure my file, I split it into multiple parts and encrypted each part with AES, each using its own IV. To retrieve it, I have to download all the parts, decrypt them, and join them back together. But for some reason I keep getting this error while downloading. Sometimes everything works without a problem.

image

AllanOricil commented 2 years ago

To show it is random, here are other outputs

image

image

AllanOricil commented 2 years ago

And without changing anything in the code, it now worked:

image

AllanOricil commented 2 years ago

It was probably instability on your servers/providers, but I would like that part of the code reviewed. I'm using the lib like this:

const { Web3Storage } = require('web3.storage');

const web3StorageClient = new Web3Storage({
  token: process.env.WEB3STORAGE_TOKEN,
});

web3StorageClient
  .get(cid)
  .then((response) => {
    //Im doing stuff here
  })
  .catch((error) => {
    // handle the error here
  });

Even with a catch, my server stops when I have this type of exception.

const { Web3Storage } = require('web3.storage');

const web3StorageClient = new Web3Storage({
  token: process.env.WEB3STORAGE_TOKEN,
});

try {
  web3StorageClient
    .get(cid)
    .then((response) => {
      //Im doing stuff here
    });
} catch (error) {
  // I expected the try/catch to keep this from killing my server, but it still does
}

alanshaw commented 2 years ago

It depends what you're doing in //Im doing stuff here - it could be throwing asynchronously...

Want to share the code?
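
To illustrate the point about throwing asynchronously: a synchronous try/catch only sees errors thrown before the promise is returned, so a later rejection passes straight through it. The sketch below is a minimal illustration, not the library's code; `failingGet` is a hypothetical stand-in for a client call that rejects, and the function names are made up for the example.

```javascript
// Hypothetical stand-in for a client call that fails asynchronously.
const failingGet = () => Promise.reject(new Error('Unexpected end of data'));

// The try block finishes as soon as the promise chain is created,
// so the catch clause never runs for a rejection that happens later.
function withoutAwait() {
  try {
    return failingGet().then(() => 'ok');
  } catch (error) {
    return Promise.resolve('caught synchronously'); // not reached for rejections
  }
}

// With await, the rejection is rethrown inside the try block and is caught.
async function withAwait() {
  try {
    await failingGet();
    return 'ok';
  } catch (error) {
    return `caught: ${error.message}`;
  }
}
```

In the first form the caller still has to attach `.catch()` (or await the result) to handle the failure. If nobody does, recent Node.js versions treat the unhandled rejection as fatal by default, which would match a server that stops despite the try/catch.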

AllanOricil commented 2 years ago

@alanshaw I don't think it is related to my code, because of what is printed in the stack trace. As I explained and showed in the screenshots, the exception happens randomly, and it comes from the @ipld module that Web3Storage uses. Please take a look at the stack trace shown in one of the images.

But here is the code:

This is the lib I used to convert a Web Streams API stream into a Node.js stream, specifically the Readable stream: https://www.npmjs.com/package/readable-web-to-node-stream

const fs = require('fs');
const path = require('path');
const { ReadableWebToNodeStream } = require('readable-web-to-node-stream');
const { Web3Storage } = require('web3.storage');

const web3StorageClient = new Web3Storage({
  token: process.env.WEB3STORAGE_TOKEN,
});

const writeFilePartToDisk = (readStream, cid, name, number) => {
  return new Promise((resolve, reject) => {
    const writeStream = fs.createWriteStream(
      path.resolve(__dirname, `${name}`)
    );
    readStream.pipe(writeStream);
    let error = null;

    writeStream.on('error', (err) => {
      error = err;
      writeStream.close();
      console.log(
        `ERROR - File ${name} CID: ${cid} Part: ${number} could not be saved on disk`
      );
      reject(err);
    });

    writeStream.on('close', () => {
      if (!error) {
        console.log(
          `File ${name} CID: ${cid} Part: ${number} saved on disk successfully`
        );
        resolve();
      }
    });
  });
};

const downloadFilePart = ({ cid, name, number }) => {
  return new Promise((resolve, reject) => {
    console.log(`Downloading CID: ${cid} PART: ${number}`);
    web3StorageClient
      .get(cid)
      .then((response) => {
        if (!response.ok) {
          reject(new Error(`ERROR - Failed to get ${cid}`));
        }
        response.files().then((files) => {
          const readStream = new ReadableWebToNodeStream(files[0].stream());
          writeFilePartToDisk(readStream, cid, name, number)
            .then(() => {
              console.log(`Writing ${name} to disk`);
              resolve();
            })
            .catch(() =>
              reject(
                new Error(
                  `ERROR - Failed to write ${name}  CID: ${cid} Part: ${number} on disk`
                )
              )
            );
        });
      })
      .catch(() =>
        reject(new Error(`ERROR - File ${name} CID: ${cid} Part: ${number}`))
      );
  });
};

Hope you can help :D

This is just a POC. I'm going to clean up this code later. But if you need anything else, I'm happy to help.
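
One possible reason the outer `.catch()` in the code above never fires: the promise returned by `response.files()` is neither returned nor given its own `.catch()`, so a rejection there is unhandled. Errors emitted by the read stream are not listened for either. The sketch below is one way to restructure it, under these assumptions: Node.js 18+ (for `Readable.fromWeb` and `stream/promises`), and a `client` parameter that is any object with a web3.storage-style `get(cid)` method. The `targetPath` parameter is introduced for the example.

```javascript
const fs = require('fs');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Sketch only: every async step is awaited, so any failure rejects the
// promise returned by this function and can be caught by the caller.
async function downloadFilePart(client, { cid, name, number }, targetPath) {
  const response = await client.get(cid);
  if (!response.ok) {
    throw new Error(`ERROR - Failed to get ${name} CID: ${cid} Part: ${number}`);
  }
  // A failure while decoding the CAR rejects here instead of going unhandled.
  const files = await response.files();
  // Readable.fromWeb converts the Web ReadableStream into a Node.js stream.
  const readStream = Readable.fromWeb(files[0].stream());
  // pipeline() rejects if either the read side or the write side errors,
  // and destroys both streams on failure.
  await pipeline(readStream, fs.createWriteStream(targetPath));
  return targetPath;
}
```

A caller can then wrap `await downloadFilePart(...)` in try/catch, or attach `.catch()`, and retry the part without the process going down.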

insanity54 commented 2 years ago

I'm also seeing this issue on large MP4 files. Smaller files, such as a 181MB MP4, do not seem to be affected. However, a 3.4GB MP4 is affected.

For me, it's not the upload that is affected. The problem lies in getting Web3File instances of a content archive that was previously uploaded to web3.storage.

web3.storage hangs for a long time (30+ minutes) when calling Web3Response.files(), then I get Error: Unexpected end of data.

const storage = new Web3Storage({ token });
const res = await storage.get(rootCid);
const ipfsFiles = await res.files(); // --> Error: Unexpected end of data

It's not just a problem with Web3Response.files(). I also get this error when using Web3Response.unixFsIterator(). I used the example code from https://docs.web3.storage/reference/js-client-library#return-value-1

const res = await client.get(cid)
for await (const entry of res.unixFsIterator()) {
  console.log(`got unixfs of type ${entry.type}. cid: ${entry.cid} path: ${entry.path}`)  // <-- I never see this console log... `Error: Unexpected end of data` is thrown after a very long hang.
  // entry.content() returns another async iterator for the chunked file contents
  for await (const chunk of entry.content()) {
    console.log(`got a chunk of ${chunk.length} bytes of data`)
  }
}

AllanOricil commented 2 years ago

@insanity54 I fixed it by using a gateway link to download the file. The bug is coming from a module that Web3Storage uses, so there isn't much they can do.

vasco-santos commented 2 years ago

@AllanOricil can you provide us more information on the module failing here?

AllanOricil commented 2 years ago

@vasco-santos it is @ipld/car/cjs/lib/decoder.js. One of the screenshots I posted here shows the line where the exception happens.

olizilla commented 2 years ago

It's not super obvious but in the current implementation, calling client.get(<cid>) on the web3.storage client hits GET /cid/:cid on https://api.web3.storage which in turn is a proxy for fetching the CAR for that CID from https://ipfs.io. The client fetches all the blocks as a single CAR, verifies each one matches its CID and re-assembles the files. But requests for large files can end up rate limited by the public gateway. We'll have a better solution for this soon.

In the meantime, as a workaround, you could fetch your file by CID directly from https://nftstorage.link – it races requests between multiple gateways, and caches everything it can. A regular fetch would work here.

const cid = "<your cid here>"
const res = await fetch(`https://nftstorage.link/ipfs/${cid}`)

The trade-off for now is that your client-side code is no longer verifying the blocks. But under the hood, nftstorage.link only uses trusted gateways provided by ipfs.io, Pinata and Cloudflare to do the work of fetching the content from IPFS.
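
For the multi-part download case earlier in this thread, the gateway workaround could be streamed straight to disk along these lines. This is a sketch under stated assumptions, not an official snippet: it assumes Node.js 18+ (global `fetch`, `Readable.fromWeb`), and the `downloadViaGateway` name, the `targetPath` parameter and the injectable `fetchImpl` parameter are introduced here for illustration.

```javascript
const fs = require('fs');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Sketch only: fetch a CID from the gateway and stream the body to disk.
// fetchImpl defaults to the global fetch and can be swapped out in tests.
async function downloadViaGateway(cid, targetPath, fetchImpl = fetch) {
  const res = await fetchImpl(`https://nftstorage.link/ipfs/${cid}`);
  if (!res.ok) {
    // Covers rate limiting and gateway errors (any non-2xx status).
    throw new Error(`Gateway responded with HTTP ${res.status} for ${cid}`);
  }
  // Stream the body rather than buffering it, which matters for multi-GB files.
  await pipeline(Readable.fromWeb(res.body), fs.createWriteStream(targetPath));
  return targetPath;
}
```

Because the blocks are no longer verified client-side, comparing a checksum of the downloaded file against one recorded at upload time is one way to regain some assurance.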