ipfs-inactive / js-ipfs-unixfs-importer

[ARCHIVED] JavaScript implementation of the UnixFs importer used by IPFS
MIT License

feat: support storing metadata in unixfs nodes #39

Closed achingbrain closed 4 years ago

achingbrain commented 4 years ago

Adds mtime and mode properties to {path, content} import entries.
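For illustration, a pair of import entries might look like the sketch below. Only `path`, `content`, `mtime` and `mode` come from the description above; the value formats (a `Date` for `mtime`, POSIX permission bits for `mode`) and the `hasMetadata` helper are assumptions made for this example, not the importer's confirmed API.

```javascript
// Hypothetical import entries. The field names come from the PR description;
// the value formats chosen for `mtime` and `mode` are assumptions.
const entries = [
  {
    path: 'docs/readme.txt',
    content: new TextEncoder().encode('hello world'),
    mtime: new Date('2019-11-01T00:00:00Z'), // assumed: a Date
    mode: 0o644 // assumed: POSIX permission bits
  },
  {
    path: 'docs/plain.txt',
    content: new TextEncoder().encode('no metadata')
    // mtime and mode omitted, as in entries created before this change
  }
]

// Hypothetical helper (not part of the importer): reports whether an
// entry carries either of the new metadata properties.
function hasMetadata (entry) {
  return entry.mtime !== undefined || entry.mode !== undefined
}
```

Entries without the new properties keep the plain {path, content} shape.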

achingbrain commented 4 years ago

File data will end up in a leaf node if the reduceSingleLeafToSelf option is false, same as before.

alanshaw commented 4 years ago

Do you think we should have a default like, do not reduce to self if mtime or mode are not the defaults and file size is beyond some threshold?
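The default being floated here could be sketched roughly as follows. This is not code from the PR: the function name, the default mode of `0o644` and the 1024-byte threshold are all assumptions picked for illustration.

```javascript
// Assumed values, for illustration only.
const DEFAULT_FILE_MODE = 0o644
const SIZE_THRESHOLD = 1024 // bytes, arbitrary

// Hypothetical decision function: returns true when a single-leaf file
// may be collapsed into its own root node, false when the data should
// stay in a separate leaf.
function shouldReduceSingleLeafToSelf ({ mtime, mode, size }) {
  const hasCustomMetadata =
    mtime !== undefined ||
    (mode !== undefined && mode !== DEFAULT_FILE_MODE)

  // Do not reduce only when metadata is non-default AND the file is
  // larger than the threshold.
  return !(hasCustomMetadata && size > SIZE_THRESHOLD)
}
```

Under these assumptions a small file is always reduced, and a large file is kept as a separate leaf only when it carries non-default metadata.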

alanshaw commented 4 years ago

This is AWESOME btw!

achingbrain commented 4 years ago

> Do you think we should have a default like, do not reduce to self if mtime or mode are not the defaults and file size is beyond some threshold?

Maybe, I'm not sure we should make decisions like that for the user, though I could be talked round.

One thing is that, to work out the size of the node, we'd need to serialize it first, and then we'd serialize it again inside IPLD when writing it to disk. We already do this to get the size of DAGNodes in order to create DAGLinks, so doing it in more places would make everything slower.

Something I'd like to try is removing the .dag invocations and using the repo directly for dag-pb operations. At that point we'd control serialization and hashing in one place and could optimise for the UnixFS use case, which doesn't fit well with other IPLD types, mostly because of the DAGLink size property requirement above.

Anyway, I'd like to resolve that in a separate PR to this one, but one that gets resolved before this goes out of the door in a js-IPFS release.

alanshaw commented 4 years ago

> Maybe, I'm not sure we should make decisions like that for the user, though I could be talked round.

I think we should provide good defaults. The majority of users are not going to know or care about this, so we should make an effort to ensure their data gets good deduplication out of the box.

We don't have to propose it in this PR, but it would be good to get a follow-up proposal PR so we can start considering it.

Also on the table, just changing the default chunker to rabin or the new buzhash chunker.
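For context, rabin and buzhash are both content-defined chunkers: a rolling hash over the last few bytes decides where chunks end, so boundaries follow the content rather than fixed offsets. The sketch below is a minimal buzhash-style chunker and not the implementation used by any IPFS release; the window size, mask, size limits and table seed are arbitrary assumptions.

```javascript
// Deterministic table of 256 pseudo-random 32-bit values (xorshift32).
function makeTable (seed) {
  const table = new Uint32Array(256)
  let x = seed >>> 0
  for (let i = 0; i < 256; i++) {
    x ^= x << 13; x >>>= 0
    x ^= x >>> 17
    x ^= x << 5; x >>>= 0
    table[i] = x
  }
  return table
}

// Rotate a 32-bit value left by n bits.
const rotl = (v, n) => ((v << (n & 31)) | (v >>> ((32 - n) & 31))) >>> 0

// Split `bytes` (a Uint8Array) into chunks. A chunk ends where the low
// bits of the rolling hash are all zero, or when maxSize is reached.
function chunk (bytes, { window = 16, mask = 0x3f, minSize = 16, maxSize = 1024 } = {}) {
  const table = makeTable(0x9e3779b9) // arbitrary non-zero seed
  const chunks = []
  let start = 0
  let hash = 0

  for (let i = 0; i < bytes.length; i++) {
    // Roll the new byte into the hash.
    hash = rotl(hash, 1) ^ table[bytes[i]]
    const len = i - start + 1
    // Roll out the byte that just left the window.
    if (len > window) {
      hash ^= rotl(table[bytes[i - window]], window)
    }
    hash >>>= 0

    if ((len >= minSize && (hash & mask) === 0) || len >= maxSize) {
      chunks.push(bytes.subarray(start, i + 1))
      start = i + 1
      hash = 0
    }
  }

  if (start < bytes.length) chunks.push(bytes.subarray(start))
  return chunks
}
```

Because a boundary depends only on the bytes inside the window (plus the size limits), inserting data near the start of a file tends to leave later chunk boundaries unchanged, which is what helps deduplication.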