In Stage() we currently do the following things before adding the file to IPFS, all of which require a seekable stream:
- We need the size of the stream. I built this in a generic way that relies on streams rather than on files. Streams don't have a size, and the only way to find it out is to either read everything or seek to the end.
- The hash of the file is generated. It is used either as the encryption key or to check whether we need to do more work (if the content hash did not change, there is no need to stage the file).
- Part of the header is used to guess the compression algorithm we should use.
We can probably live without the compression guessing. Getting rid of the hash is harder, because then we need another way to pick the encryption key. The current scheme is: if a file is added for the first time, compute the content hash and size and derive the key from them. If the file is modified, the existing key is taken over. The idea was to enable de-duplication for identical or similar files. This is however defeated by the way our encryption works: each block influences the next block, so a change results in an avalanche of changed ciphertext. A single bit changed at the start of the file will therefore result in a completely different ciphertext. My plan to fix this is to either wait for encryption support in IPFS or build some block layer in brig ourselves.
Until that happens (far future in any case) we should switch back to random key generation. That would let us drop the seekable stream requirement and would also improve performance, since we would need to read the stream only once.
See this function as reference.