ipfs / js-ipfs-unixfs

JavaScript implementation of IPFS' unixfs (a Unix FileSystem representation on top of a MerkleDAG)
Other
87 stars 35 forks source link

Strange issues in ipfs-unixfs-exporter 8.0.x #269

Closed ikreymer closed 1 year ago

ikreymer commented 2 years ago

I'm seeing some strange issues in ipfs-unixfs-exporter, from 8.0.6, and then also from 8.0.2.

Because of the second issue, I couldn't create a clean repro the first one, so noting both here. Updating to the currently latest ipfs-unixfs-exporter@9.0.1 has the same issue in both cases.

This was tested on Node 18.2 and Node 19.0.1 on Mac.

achingbrain commented 1 year ago

I've tried checking out the bug-report-unixfs-exporter repo, used yarn to install the deps and run the test case, but it passes for me. I also tried updating all the deps to the latest versions and it still passes:

% npm test

> export-test@1.0.0 test
> ava

orig 63504
exported 63504
  ✔ export matches original file
  ─

  1 test passed

I'm not sure what I'm supposed to be looking for here.

If you are still seeing a problem please re-open this issue and let me know what you're seeing.

ikreymer commented 1 year ago

You're right, the test as is was passing, as forgot to add options setting chunk size to 32K!

Now updated the test with:

opts = {...opts, chunker: "fixed", maxChunkSize: "32768"};

which appears to be working correctly with the versions of importer/exporter in the repo.

To repro the issue: 1) Update the dependencies to latest version 2) Uncomment:

 import { fixedSize } from 'ipfs-unixfs-importer/chunker';

https://github.com/ikreymer/bug-report-unixfs-exporter-issues/blob/main/test-export.js#L10

and uncomment:

 opts = {...opts, chunker: fixedSize({"chunkSize": 32768})};

https://github.com/ikreymer/bug-report-unixfs-exporter-issues/blob/main/test-export.js#L32

while commenting out the old opts (https://github.com/ikreymer/bug-report-unixfs-exporter-issues/blob/main/test-export.js#L29)

Now, the tests will not pass and the export will just hang. This only happens with the smaller chunk size, using the default chunk size still works, but previously it also worked with smaller chunk sizes..

achingbrain commented 1 year ago

There are a few things going on here.

This means the UnixFS DAG created in the concat method reports it's size incorrectly - this is why it hangs, because it thinks there's more file to be read when in fact there isn't.

300 should catch this in future

If you try to use BigInts for Tsize you get different bytes back and as such different CIDs.

This is something that needs fixing in @ipld/dag-pb so until then you need to downcast to Number.

There was a bug that would cause this sort of DAG to be read out of order - it has been fixed by #299

If you update the concat method to create a valid UnixFS DAG the test starts to pass:

export async function concat(cids) {
  const node = new UnixFS({ type: "file" });

  const Links = await Promise.all(
    cids.map(async (cid) => {
      const Tsize = sizes[cid];

      const entry = await exporter(cid, store);  //  <-- get the size of the child
      node.addBlockSize(entry.size)

      return {
        Name: "",
        Hash: cid,
        Tsize: Number(Tsize),   //  <-- convert the BigInt to a Number
      };
    })
  );

  const Data = node.marshal();

  return await putBlock({Data, Links});
}
ikreymer commented 1 year ago

Thanks for getting to the bottom of this!

This means the UnixFS DAG created in the concat method reports it's size incorrectly - this is why it hangs, because it thinks there's more file to be read when in fact there isn't.

300 should catch this in future

Ah, yes, I think that was the secondary issue that I ran into in making the test repo, in the original was using the correct size:

I tried making an isolated test case for this issue so its easier to test, and set up: https://github.com/ikreymer/bug-report-unixfs-exporter-issues/blob/main/test-export.js However, in this setup, from 8.0.2, the exporter seems to be abort node and never actually complete, both with ava and just running normally.

The original issue I was having for:

From 8.0.6, exporting of a concatenated file created from files of blocksize of 32K seems to generate wrong results. This happens when running the test suite on https://github.com/webrecorder/ipfs-composite-files/tree/store-ops after updating to ipfs-unixfs-exporter 8.0.6

was this:

  • The file created in the test from the chunks is a bit weird in that it's not a balanced DAG - one of the child nodes of the root has children and it's not the first child.

There was a bug that would cause this sort of DAG to be read out of order - it has been fixed by #299

Glad to see this fixed now!

  • The latest UnixFS version reports file sizes as BigInts as in the protobuf they are serialized as uint64 which can overflow the Number type, but the @ipld/dag-pb module expects Tsize as a number

If you try to use BigInts for Tsize you get different bytes back and as such different CIDs.

This is something that needs fixing in @ipld/dag-pb so until then you need to downcast to Number.

Thanks for the clarification. In trying to do the porting yesterday, I also ran into the BigInt vs Number errors, and wasn't sure what should be BigInt vs Number. Yes, also noticed the different CIDs when using BigInts in Tsize. This isn't great, but hopefully can be fixed in dag-pb eventually..

In the meantime, was able to update to latest via: https://github.com/webrecorder/ipfs-composite-files/pull/9/files

Looks like somewhere in there the cid version default changed from 0 to 1 as well and rawLeaves to true, but easy set explicitly for testing.