philipl / pifs

πfs - the data-free filesystem!
GNU General Public License v3.0

Shorthand encoding for positions #68

Open Heath123 opened 2 years ago

Heath123 commented 2 years ago

Say we wanted to encode "123". The first occurrence of this in Pi is at position 1924. However, a shorter way to encode it would be to store "123", which is shorthand for "the byte at the position in Pi where the byte 123 can be found". This stores the same data as storing "1924", but in a shorter form. It also skips the costly Pi lookup step, drastically improving performance.

For example, the byte sequence:

FA 01 7A D7 12 0B

would be encoded as:

FA 01 7A D7 12 0B

A function to convert between plain bytes and shorthand Pi offsets could look like this pseudocode:

char encode(char original) {
  return byteAtPiPosition(findByteInPi(original));
}

char decode(char encoded) {
  return byteAtPiPosition(findByteInPi(encoded));
}
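
To make the pseudocode above concrete, here is a minimal runnable sketch in C. It does not ship real Pi data: `stream_byte` is a hypothetical stand-in stream in which every byte value appears exactly once, and `find_byte_in_stream`, `encode_via_lookup` and `lookup_is_identity` are names made up for this illustration. It also uses `unsigned char` rather than `char`, so that byte values above 127 behave predictably.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a byte stream derived from Pi (an assumption for this sketch:
   real Pi data is not bundled here). Multiplying by an odd constant modulo
   256 visits every byte value exactly once in the first 256 positions, so
   every lookup below terminates. */
unsigned char stream_byte(size_t pos) {
    return (unsigned char)((pos * 167u + 13u) & 0xFFu);
}

/* Position of the first occurrence of `b` in the stand-in stream. */
size_t find_byte_in_stream(unsigned char b) {
    size_t pos = 0;
    while (stream_byte(pos) != b) {
        pos++;
    }
    return pos;
}

/* Shorthand encoding: find the byte, then read the byte at that position. */
unsigned char encode_via_lookup(unsigned char original) {
    return stream_byte(find_byte_in_stream(original));
}

/* Returns 1 if the lookup-based encoding maps every byte value to itself. */
int lookup_is_identity(void) {
    for (int b = 0; b < 256; b++) {
        if (encode_via_lookup((unsigned char)b) != (unsigned char)b) {
            return 0;
        }
    }
    return 1;
}
```

Under these assumptions `lookup_is_identity()` returns 1: reading back the byte found at the position where that byte was found always yields the same byte, whatever stream is being searched.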

However, we can skip some steps here, for an optimised version:

char encode(char original) {
  return original;
}

char decode(char encoded) {
  return encoded;
}
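
As a quick sanity check on the optimised version, the following sketch runs a buffer through both functions and compares the result with the input. `roundtrip_matches` and the 64-byte scratch buffers are assumptions made for this example, and the functions take `unsigned char` instead of `char` to keep values such as `0xFA` unsigned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Optimised shorthand encoding: the stored metadata is the byte itself. */
unsigned char encode(unsigned char original) { return original; }
unsigned char decode(unsigned char encoded) { return encoded; }

/* Encodes `data`, decodes the result, and returns 1 if both the metadata
   and the decoded bytes equal the input. Buffers longer than 64 bytes are
   rejected (returns 0) to keep the sketch free of dynamic allocation. */
int roundtrip_matches(const unsigned char *data, size_t len) {
    unsigned char metadata[64];
    unsigned char restored[64];
    if (len > sizeof metadata) {
        return 0;
    }
    for (size_t i = 0; i < len; i++) {
        metadata[i] = encode(data[i]);
        restored[i] = decode(metadata[i]);
    }
    return memcmp(metadata, data, len) == 0 &&
           memcmp(restored, data, len) == 0;
}
```

Calling `roundtrip_matches` on the example sequence `FA 01 7A D7 12 0B` returns 1, which matches the claim that the encoded form is byte-for-byte the same as the original.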

This would bring many of the advantages of traditional filesystems to PiFS, such as high performance, and would reduce the size of the metadata.

As a bonus, this encoding is fully compatible with traditional filesystem drivers, due to the output metadata being readable as if it were the original data. Therefore, you don't even have to reformat your disk to use this new implementation of PiFS!

But wait, it gets even better! All we have to do to add support for PiFS to existing drivers, such as EXT4 and NTFS, is to inject the encode and decode functions wherever the drivers write to and read from the disk. So, a read like this:

int var = read_from_disk(position);

will have to be changed to this:

int var = decode(read_from_disk(position));

If we mark the functions with always_inline, or allow the compiler to automatically inline the functions, then it will get converted to this:

int var = read_from_disk(position);

You may notice that this is completely identical to the original code! This means that we can skip the step of modifying, recompiling and replacing the code entirely!
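
The inlining argument can be sketched as follows. Everything here is hypothetical scaffolding: `fake_disk` stands in for a real block device, `read_plain` and `read_with_pifs` are names invented for the comparison, and `__attribute__((always_inline))` is the GCC/Clang spelling (other compilers use a different one).

```c
#include <assert.h>

/* Hypothetical stand-in for a disk: a small in-memory array. */
static const int fake_disk[4] = {0xFA, 0x01, 0x7A, 0xD7};

/* Stand-in for the driver's read routine; no bounds checking in this sketch. */
int read_from_disk(int position) {
    return fake_disk[position];
}

/* PiFS decode, forced inline with the GCC/Clang attribute. */
static inline __attribute__((always_inline)) int decode(int encoded) {
    return encoded;
}

/* The original driver read. */
int read_plain(int position) {
    return read_from_disk(position);
}

/* The same read with PiFS support injected. */
int read_with_pifs(int position) {
    return decode(read_from_disk(position));
}
```

Both functions return the same value for every position, and once `decode` is inlined there is nothing left of it for the compiler to emit, so the two bodies would be expected to compile to the same instructions.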

Here is a simple 0-step tutorial to switch from a traditional filesystem to this new version of PiFS:

-

And you're done!

Also, since you do not need to modify the code, this even works on proprietary drivers like the Windows NTFS one. In fact, you have already been using it for as long as you've been using a computer, without even knowing it. Amazing!