distr1 / distri

a Linux distribution to research fast package management
https://distr1.org

Use vacfs and venti for append-only package store #90

Closed TheOrangeCat closed 2 years ago

TheOrangeCat commented 2 years ago

This is more of an interesting idea I had than a practical suggestion.

vacfs would replace SquashFS for mounting packages. plan9port's vacfs uses FUSE for mounting, so that shouldn't be a problem.

Venti's nature allows for easy and seamless deduplication, which would be great for having multiple versions of a package.
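To make the deduplication point concrete, here is a minimal sketch of content-addressed storage in the style of Venti, where a block's address (its "score") is the SHA-1 hash of its contents. The `Store` type and its methods are hypothetical and in-memory only; this is not the Venti protocol or any real server.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// Score is the SHA-1 fingerprint of a block; it doubles as the block's address.
type Score [sha1.Size]byte

// Store is a toy in-memory block store (hypothetical, for illustration only).
type Store struct {
	blocks map[Score][]byte
}

func NewStore() *Store { return &Store{blocks: make(map[Score][]byte)} }

// Write stores a block and returns its score.
// Writing identical data twice stores it only once.
func (s *Store) Write(data []byte) Score {
	score := Score(sha1.Sum(data))
	if _, ok := s.blocks[score]; !ok {
		s.blocks[score] = append([]byte(nil), data...)
	}
	return score
}

// Read returns the block stored under the given score, if any.
func (s *Store) Read(score Score) ([]byte, bool) {
	b, ok := s.blocks[score]
	return b, ok
}

// Len reports how many distinct blocks are stored.
func (s *Store) Len() int { return len(s.blocks) }

func main() {
	s := NewStore()
	// Two "package versions" that share one block.
	v1 := [][]byte{[]byte("shared library code"), []byte("version 1 binary")}
	v2 := [][]byte{[]byte("shared library code"), []byte("version 2 binary")}
	for _, b := range append(v1, v2...) {
		s.Write(b)
	}
	fmt.Printf("wrote 4 blocks, stored %d\n", s.Len()) // wrote 4 blocks, stored 3
}
```

Because the address is derived from the content, the shared block is detected without comparing package versions against each other.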

However, for this to actually be usable, it would probably be a good idea to write a new venti server (using govt or a similar Go module). Instead of using (relatively) big fixed-size disk devices for the index and data log, it would use small files on the file system that grow slowly as data gets written.

Again, this is more of an interesting concept than an actual idea to implement: writing a new venti server and adapting the existing package system to use it would probably be a huge undertaking. Also, package installation would probably slow down a lot unless the index runs on an SSD and/or a bloom filter is used.
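As a rough illustration of why a bloom filter helps here: before writing a block, the server has to check whether its score is already in the index, and a bloom filter can answer "definitely not stored" from memory without touching the disk. The sketch below is a generic bloom filter with assumed parameters (size, hash count, FNV-based hashing); it is not how any particular venti server implements it.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a simple bloom filter: it may report false positives,
// but never false negatives.
type Bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // number of probes per key
}

func NewBloom(m, k uint64) *Bloom {
	return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two base hashes; probe i uses h1 + i*h2 (double hashing).
func hashes(key []byte) (uint64, uint64) {
	h1 := fnv.New64a()
	h1.Write(key)
	h2 := fnv.New64()
	h2.Write(key)
	return h1.Sum64(), h2.Sum64() | 1 // force h2 odd so probes differ
}

// Add records a key in the filter.
func (b *Bloom) Add(key []byte) {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= uint64(1) << (pos % 64)
	}
}

// MayContain returns false only if the key was definitely never added.
func (b *Bloom) MayContain(key []byte) bool {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(uint64(1)<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloom(1<<16, 4)
	f.Add([]byte("score-of-an-existing-block"))
	for _, key := range []string{"score-of-an-existing-block", "score-of-a-new-block"} {
		if f.MayContain([]byte(key)) {
			fmt.Println(key, "-> possibly stored, consult the on-disk index")
		} else {
			fmt.Println(key, "-> definitely new, skip the index lookup")
		}
	}
}
```

Only the "possibly stored" answers need the slow index lookup, which is where an SSD would otherwise matter most.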

stapelberg commented 2 years ago

Thanks for sharing your idea.

De-duplication might be nice, but in practice I don’t think it’s worth it except in extreme cases. Having a few similar program versions installed concurrently is not extreme enough, I think.

> vacfs would replace SquashFS for mounting packages. plan9port's vacfs uses FUSE for mounting, so that shouldn't be a problem.

Note that distri doesn’t use one SquashFS mount per package. Instead, distri uses a single FUSE mount (/ro), backed by a distri-specific FUSE daemon that transparently provides the contents of the SquashFS images (one subdirectory maps to one SquashFS image).

So, what you’d need to do is teach the distri FUSE daemon how to read vac files instead of SquashFS files. But then you’d also need to update the rest of the tooling that deals with distri SquashFS images.
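The "one subdirectory maps to one SquashFS image" idea can be sketched as a path lookup. This is only an illustration of the mapping, not the actual distri daemon code: the `resolve` function, the `/roimg` image directory, and the `.squashfs` file naming are assumptions made for the example.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

const (
	mountPoint = "/ro"    // the single FUSE mount
	imageDir   = "/roimg" // assumed location of the image files
)

// resolve maps a path under the mount point to the image that backs its
// first path component, plus the remaining path inside that image.
func resolve(p string) (image, inner string, ok bool) {
	p = path.Clean(p)
	prefix := mountPoint + "/"
	if !strings.HasPrefix(p, prefix) {
		return "", "", false
	}
	parts := strings.SplitN(p[len(prefix):], "/", 2)
	image = path.Join(imageDir, parts[0]+".squashfs")
	inner = "/"
	if len(parts) == 2 {
		inner += parts[1]
	}
	return image, inner, true
}

func main() {
	image, inner, ok := resolve("/ro/zsh-amd64-5.6.2-3/bin/zsh")
	fmt.Println(image, inner, ok)
}
```

A vac-backed variant would keep the same lookup and only change what sits behind the first path component, which is why the daemon and the surrounding tooling are the parts that would need to change.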

Considering that de-duplication is the only advantage of vac over SquashFS, I don’t think that’d be a good trade-off :)

TheOrangeCat commented 2 years ago

Yeah, that was more of just an interesting possibility. I think it's fine to keep distri in its current form. vac files just store a fingerprint of the block storing the directory, so most of the logic deals with venti. The only real reason for using this approach would be storing many versions of one package and/or many packages sharing a lot of data, which is quite specific and is probably not useful to most users.
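To show what "a fingerprint of the block storing the directory" means in practice, here is a small sketch: files are stored as blocks, a directory block lists each name with its score, and the directory block's own score identifies the whole tree. The `store` type and the textual directory encoding are made up for the example and do not match the real vac on-disk format.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"sort"
)

// store is a toy content-addressed block store keyed by SHA-1 score.
type store map[[sha1.Size]byte][]byte

// put stores a block under its score and returns the score.
func (s store) put(data []byte) [sha1.Size]byte {
	score := sha1.Sum(data)
	s[score] = data
	return score
}

// putDir stores every file as a block, then stores a directory block that
// lists "name score" pairs, and returns the score of that directory block.
func (s store) putDir(files map[string]string) [sha1.Size]byte {
	names := make([]string, 0, len(files))
	for n := range files {
		names = append(names, n)
	}
	sort.Strings(names) // deterministic order, so equal trees get equal scores
	var dir []byte
	for _, n := range names {
		score := s.put([]byte(files[n]))
		dir = append(dir, []byte(n+" "+hex.EncodeToString(score[:])+"\n")...)
	}
	return s.put(dir)
}

func main() {
	s := store{}
	root1 := s.putDir(map[string]string{"bin/tool": "v1", "share/doc": "manual"})
	root2 := s.putDir(map[string]string{"bin/tool": "v2", "share/doc": "manual"})
	// Each root score is all a client needs to keep to find the whole tree.
	fmt.Printf("version 1 root: %x\n", root1)
	fmt.Printf("version 2 root: %x\n", root2)
	fmt.Println("blocks stored:", len(s)) // blocks stored: 5
}
```

The two versions need five blocks rather than six because the unchanged `share/doc` block is shared, which is the many-versions case described above.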