Closed: ikreymer closed this issue 1 year ago
I've tried checking out the bug-report-unixfs-exporter repo, used yarn to install the deps and run the test case, but it passes for me. I also tried updating all the deps to the latest versions and it still passes:
```
% npm test

> export-test@1.0.0 test
> ava

orig 63504
exported 63504
✔ export matches original file
─
1 test passed
```
I'm not sure what I'm supposed to be looking for here.
If you are still seeing a problem please re-open this issue and let me know what you're seeing.
You're right, the test as-is was passing, as I forgot to add options setting the chunk size to 32K!
I've now updated the test with:

```js
opts = {...opts, chunker: "fixed", maxChunkSize: "32768"};
```

which appears to work correctly with the versions of importer/exporter in the repo.
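For context, a minimal standalone sketch of how these string-based options are passed to the importer; the MemoryBlockstore and dummy bytes below are assumptions for illustration, not the repo's actual setup:

```js
import { importer } from 'ipfs-unixfs-importer'
import { MemoryBlockstore } from 'blockstore-core/memory'

const blockstore = new MemoryBlockstore()
const bytes = new Uint8Array(63504) // same length as the test file

// the older API took string-based chunker options
const opts = { chunker: 'fixed', maxChunkSize: 32768 }

for await (const entry of importer([{ content: bytes }], blockstore, opts)) {
  console.log(entry.cid.toString(), entry.size)
}
```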
To repro the issue:

1) Update the dependencies to the latest version
2) Uncomment:

```js
import { fixedSize } from 'ipfs-unixfs-importer/chunker';
```

(https://github.com/ikreymer/bug-report-unixfs-exporter-issues/blob/main/test-export.js#L10)

and uncomment:

```js
opts = {...opts, chunker: fixedSize({"chunkSize": 32768})};
```

(https://github.com/ikreymer/bug-report-unixfs-exporter-issues/blob/main/test-export.js#L32)

while commenting out the old opts (https://github.com/ikreymer/bug-report-unixfs-exporter-issues/blob/main/test-export.js#L29).

Now the tests will not pass and the export will just hang (see the sketch below for the new chunker API in context). This only happens with the smaller chunk size; using the default chunk size still works, but previously it also worked with smaller chunk sizes.
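For reference, the same kind of harness with the new chunker API in place (again, the blockstore and bytes are stand-ins, not the repo's setup); per the steps above, it's the subsequent export of a concatenated file that hangs, not this import:

```js
import { importer } from 'ipfs-unixfs-importer'
import { fixedSize } from 'ipfs-unixfs-importer/chunker'
import { MemoryBlockstore } from 'blockstore-core/memory'

const blockstore = new MemoryBlockstore()
const bytes = new Uint8Array(63504)

// the chunker is now a factory function rather than a string option
const opts = { chunker: fixedSize({ chunkSize: 32768 }) }

for await (const entry of importer([{ content: bytes }], blockstore, opts)) {
  console.log(entry.cid.toString(), entry.size)
}
```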
There are a few things going on here.
The concat method doesn't record the size of each child on the UnixFS node (the addBlockSize calls in the fix below). This means the UnixFS DAG created in the concat method reports its size incorrectly - this is why it hangs, because it thinks there's more file to be read when in fact there isn't.
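For background on that bookkeeping, the per-child sizes live on the UnixFS node itself and the exporter relies on them to decide when the file ends; a minimal sketch with ipfs-unixfs, using made-up sizes that sum to the test file's length:

```js
import { UnixFS } from 'ipfs-unixfs'

const node = new UnixFS({ type: 'file' })

// register the length of each child block; fileSize() is derived from
// these entries, so skipping them makes the DAG mis-report its length
node.addBlockSize(32768n)
node.addBlockSize(30736n)

console.log(node.fileSize()) // 63504n, the sum of the block sizes
```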
The latest UnixFS version reports file sizes as BigInts, as in the protobuf they are serialized as uint64, which can overflow the Number type, but the @ipld/dag-pb module expects Tsize as a number. If you try to use BigInts for Tsize you get different bytes back and as such different CIDs. This is something that needs fixing in @ipld/dag-pb, so until then you need to downcast to Number.
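A small self-contained sketch of that downcast; the multiformats harness here is an assumption for illustration, not code from the thread:

```js
import * as dagPB from '@ipld/dag-pb'
import { CID } from 'multiformats/cid'
import { sha256 } from 'multiformats/hashes/sha2'

// build a child block so there is a real CID to link to
const childBytes = dagPB.encode({ Links: [] })
const childCid = CID.createV1(dagPB.code, await sha256.digest(childBytes))

// UnixFS reports the child's size as a BigInt (uint64 on the wire)
const size = BigInt(childBytes.length)

const bytes = dagPB.encode({
  Links: [{
    Name: '',
    Hash: childCid,
    // @ipld/dag-pb expects Tsize as a Number, so downcast here; note
    // values above Number.MAX_SAFE_INTEGER would lose precision
    Tsize: Number(size)
  }]
})
console.log(bytes.length)
```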
The file created in the test from the chunks is a bit weird in that it's not a balanced DAG - one of the child nodes of the root has children and it's not the first child. There was a bug that would cause this sort of DAG to be read out of order - it has been fixed by #299.
If you update the concat method to create a valid UnixFS DAG the test starts to pass:

```js
import { UnixFS } from 'ipfs-unixfs'
import { exporter } from 'ipfs-unixfs-exporter'

// `sizes`, `store` and `putBlock` are helpers defined elsewhere in the test repo
export async function concat(cids) {
  const node = new UnixFS({ type: "file" });
  const Links = await Promise.all(
    cids.map(async (cid) => {
      const Tsize = sizes[cid];
      const entry = await exporter(cid, store); // <-- get the size of the child
      node.addBlockSize(entry.size); // <-- record it so the node reports the correct file size
      return {
        Name: "",
        Hash: cid,
        Tsize: Number(Tsize), // <-- convert the BigInt to a Number
      };
    })
  );
  const Data = node.marshal();
  return await putBlock({ Data, Links });
}
```
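A quick way to sanity-check the fix, assuming the same store and cids as the test repo (this harness itself is not from the thread):

```js
import { exporter } from 'ipfs-unixfs-exporter'

// export the concatenated file and count the bytes that come back
const rootCid = await concat(cids)
const entry = await exporter(rootCid, store)

let exported = 0
for await (const chunk of entry.content()) {
  exported += chunk.length
}

console.log('exported', exported) // should match the original, e.g. 63504
```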
Thanks for getting to the bottom of this!
> This means the UnixFS DAG created in the concat method reports its size incorrectly - this is why it hangs, because it thinks there's more file to be read when in fact there isn't.

#300 should catch this in future.
Ah, yes, I think that was the secondary issue that I ran into in making the test repo; in the original I was using the correct size:

> I tried making an isolated test case for this issue so it's easier to test, and set up: https://github.com/ikreymer/bug-report-unixfs-exporter-issues/blob/main/test-export.js However, in this setup, from 8.0.2, the exporter seems to abort Node and never actually completes, both with ava and when just running normally.

The original issue I was having:

> From 8.0.6, exporting of a concatenated file created from files with a block size of 32K seems to generate wrong results. This happens when running the test suite on https://github.com/webrecorder/ipfs-composite-files/tree/store-ops after updating to ipfs-unixfs-exporter 8.0.6

was this:

> The file created in the test from the chunks is a bit weird in that it's not a balanced DAG - one of the child nodes of the root has children and it's not the first child.
>
> There was a bug that would cause this sort of DAG to be read out of order - it has been fixed by #299
Glad to see this fixed now!
> The latest UnixFS version reports file sizes as BigInts, as in the protobuf they are serialized as uint64, which can overflow the Number type, but the @ipld/dag-pb module expects Tsize as a number. If you try to use BigInts for Tsize you get different bytes back and as such different CIDs. This is something that needs fixing in @ipld/dag-pb, so until then you need to downcast to Number.
Thanks for the clarification. In trying to do the porting yesterday, I also ran into the BigInt vs Number errors and wasn't sure what should be a BigInt vs a Number. Yes, I also noticed the different CIDs when using BigInts in Tsize. This isn't great, but hopefully it can be fixed in dag-pb eventually.
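One way to keep that boundary explicit while the dag-pb fix is pending is a guard around the downcast; this helper is hypothetical, not part of either library:

```js
// hypothetical helper: downcast a BigInt Tsize to a Number, failing
// loudly instead of silently losing precision on very large files
function toTsize (size) {
  if (size > BigInt(Number.MAX_SAFE_INTEGER)) {
    throw new RangeError(`Tsize ${size} exceeds Number.MAX_SAFE_INTEGER`)
  }
  return Number(size)
}

console.log(toTsize(63504n)) // 63504
```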
In the meantime, I was able to update to the latest via: https://github.com/webrecorder/ipfs-composite-files/pull/9/files
Looks like somewhere in there the CID version default changed from 0 to 1 as well, and rawLeaves to true, but they're easy to set explicitly for testing.
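For anyone chasing the same CID differences, a hedged sketch of pinning the old defaults explicitly; cidVersion and rawLeaves are standard importer options, while the blockstore and bytes are stand-ins:

```js
import { importer } from 'ipfs-unixfs-importer'
import { MemoryBlockstore } from 'blockstore-core/memory'

const blockstore = new MemoryBlockstore()
const bytes = new Uint8Array(63504)

// pin the old defaults so CIDs stay comparable across versions
const opts = {
  cidVersion: 0,   // newer releases default to 1
  rawLeaves: false // newer releases default to true
}

for await (const entry of importer([{ content: bytes }], blockstore, opts)) {
  console.log(entry.cid.toString())
}
```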
I'm seeing some strange issues in ipfs-unixfs-exporter, first from 8.0.6 and then also from 8.0.2.

From 8.0.6, exporting of a concatenated file created from files with a block size of 32K seems to generate wrong results. This happens when running the test suite on https://github.com/webrecorder/ipfs-composite-files/tree/store-ops after updating to ipfs-unixfs-exporter 8.0.6.

I tried making an isolated test case for this issue so it's easier to test, and set up: https://github.com/ikreymer/bug-report-unixfs-exporter-issues/blob/main/test-export.js However, in this setup, from 8.0.2, the exporter seems to abort Node and never actually completes, both with ava and when just running normally. To repro, yarn install and yarn test in: https://github.com/ikreymer/bug-report-unixfs-exporter then upgrade to ipfs-unixfs-exporter@8.0.2 or higher, and run yarn test again.

Because of the second issue, I couldn't create a clean repro of the first one, so I'm noting both here. Updating to the currently latest ipfs-unixfs-exporter@9.0.1 has the same issue in both cases.
This was tested on Node 18.2 and Node 19.0.1 on Mac.