Open-source Infrastructure-as-Code management solution for multiple systems, designed to be reliable for mission-critical tasks in paranoid, high-security environments.
Megaissue to figure out the storage management requirements of the infrastructure and its users
REQUIREMENTS: A failure-tolerant, expandable capacity of 10 TB (ideally 20 TB) at around 1 GB/s (ideally 10 GB/s) throughput
Current design constraints
Impermanence
We are working with the scenario that all systems in the infrastructure run an impermanent setup: NixOS manages a bit-by-bit reproducible boot that sets up all services and secrets, and any changes not explicitly declared in NixOS are removed on reboot or loss of power.
It's possible to declare files/directories to be persistent across reboots.
To limit drive wear from OS operations, the OS runs in an encrypted filesystem in RAM that is loaded from a small encrypted block device/flash disk.
Local Caching
To limit block device wear, all client systems are expected to have a sacrificial high-speed encrypted hot-storage device (e.g. an SSD) that caches frequently accessed data from the storage solution. This is not a hard constraint if a better solution is found.
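The hot-cache idea above can be sketched as a tiny read-through LRU cache. This is only an illustration of the concept; the `read_cold` callback, block IDs, and capacity are placeholders, not part of any real implementation:

```python
from collections import OrderedDict

# Tiny read-through LRU cache sketch: `cache` plays the role of the SSD,
# `read_cold` the role of the storage server. Names and sizes are
# illustrative placeholders only.
class HotCache:
    def __init__(self, read_cold, capacity=2):
        self.read_cold = read_cold      # fallback to the cold storage
        self.capacity = capacity        # how many blocks the "SSD" holds
        self.cache = OrderedDict()      # block_id -> data, in LRU order

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)   # mark as recently used
            return self.cache[block_id]
        data = self.read_cold(block_id)        # miss: hit the cold storage
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)     # evict least recently used
        return data

# Usage sketch: cold reads are counted to show the cache absorbing repeats.
cold_reads = []
cache = HotCache(read_cold=lambda b: cold_reads.append(b) or f"data-{b}")
cache.read("a"); cache.read("a"); cache.read("b")
print(cold_reads)  # ["a", "b"]: the repeated read of "a" never hit cold storage
```

The repeated read of block "a" is served from the hot cache, which is exactly the stress reduction on the cold storage that the constraint above is after.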
Economy
This is for personal use in an environment that is not designed for profit -> the more economical the solution, the better.
Considered High Security Environment + Paranoid setup
The infrastructure management is already designed for the highest possible security and privacy -> it's important that the solution doesn't disturb or limit that.
Energy Efficiency requirement
There is only so much power my house can generate without drawing from the grid (e.g. via solar panels), so the solution is expected to be very energy efficient.
Solution 1: RAID 5 + Encrypted archiving of parity data on a secure backup + Caching
A RAID-5 setup consisting of 3x5 TB drives, where the equivalent of one 5 TB drive is used for parity; either mechanical drives or SSDs, depending on which is more economical.
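As a quick sanity check of this layout against the REQUIREMENTS line above, a back-of-the-envelope sketch; the ~200 MB/s per-drive sequential speed is an assumed typical HDD figure, not a measurement:

```python
# Back-of-the-envelope check of the 3x5 TB RAID-5 layout against the
# 10 TB / 1 GB/s requirements. The per-drive sequential speed is an
# assumed typical HDD figure, not a measured one.
N_DRIVES = 3
DRIVE_TB = 5
DRIVE_SEQ_MBPS = 200  # assumption: sequential MB/s per mechanical drive

usable_tb = (N_DRIVES - 1) * DRIVE_TB             # RAID-5 keeps one drive's worth for parity
seq_read_gbps = N_DRIVES * DRIVE_SEQ_MBPS / 1000  # reads stripe across all drives

print(f"usable capacity: {usable_tb} TB")                    # 10 TB -> meets the minimum
print(f"approx. sequential read: {seq_read_gbps:.1f} GB/s")  # ~0.6 GB/s -> short of 1 GB/s
```

Under these assumptions the layout meets the 10 TB minimum but falls short of 1 GB/s on mechanical drives alone, which is where the SSD hot cache (or wider striping) has to make up the difference.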
IF: it's possible to put only the parity data into an encrypted archive uploaded to a 3rd-party service, so that even if the storage server is destroyed the data can always be recovered by recalculating from the parity, while a sacrificial hot-storage device (e.g. an SSD) caches frequently accessed files to reduce stress on the cold storage.
THEN: Is this a good solution?
Since we would be uploading the parity data to a secure backup that can always be recovered -> it doesn't make sense to do RAID-6, which would only increase the economic cost.
UPDATE: This doesn't seem possible, nor does it have any advantage over simply backing up the whole storage and sending an encrypted archive to a secure backup,
because HDDs are actually OK at the bulk-data part, especially if you stripe them together; it's the random seeking for small blocks like metadata that makes HDDs atrocious (and keeping the metadata on an SSD manages that problem).
ZFS's own encryption feature should be avoided, but a few HDDs encrypted with LUKS plus a special allocation vdev works (note: vdev is a generic term that applies to all drives in a ZFS pool; a special allocation vdev is a particular type).
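The UPDATE above can be illustrated with a minimal sketch of XOR parity, the per-stripe mechanism behind RAID-5: rebuilding a lost block needs the parity AND every surviving data block, so an off-site copy of only the parity cannot resurrect a destroyed server. The byte values here are arbitrary illustrative data:

```python
# Minimal XOR-parity sketch (the per-stripe mechanism behind RAID-5).
# parity = d0 XOR d1; losing ONE block is recoverable from the other
# block plus parity, but parity alone carries no usable data.
d0 = bytes([0x12, 0x34, 0x56])
d1 = bytes([0xAB, 0xCD, 0xEF])
parity = bytes(a ^ b for a, b in zip(d0, d1))

# Recover d0 after "losing" it: this needs BOTH d1 and the parity.
recovered_d0 = bytes(a ^ b for a, b in zip(d1, parity))
assert recovered_d0 == d0

# With only the parity (server destroyed, both data blocks gone),
# there is nothing left to solve: any pair (d0', d1') with
# d0' XOR d1' == parity is equally consistent, so the original
# data is unrecoverable from the parity archive alone.
```

This is why the parity-only off-site archive collapses into "just back up the whole storage as an encrypted archive", as the UPDATE concludes.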
ELSE: Not an option?
Solution 2: bcachefs?
TBD https://bcachefs.org
Solution 3: ZFS + VDEV
https://github.com/ElvishJerricco/stage1-tpm-tailscale
Relevant links