NiXium-org / NiXium

Open-Source Infrastructure as Code Management Solution for Multiple Systems, designed to be reliable in mission-critical tasks in paranoid and high-security environments.
European Union Public License 1.2

Storage Management #16

Open Kreyren opened 5 months ago

Kreyren commented 5 months ago

Megaissue to figure out the storage management requirements of the infrastructure and its users.


REQUIREMENTS: Failure-tolerant, expandable storage with a total capacity of 10 TB (ideally 20 TB) at around 1 GB/s (ideally 10 GB/s).

Current design constraints

Impermanence

We are working with the scenario that all systems in the infrastructure run an Impermanent setup: NixOS manages a bit-by-bit reproducible boot that sets up all services and secrets, and any change that is not explicitly declared in NixOS is removed on reboot or loss of power.

It's possible to declare files/directories to be persistent across reboots.

To limit the wear that OS operations put on the drives, the OS runs from an encrypted filesystem in RAM that is loaded from a small encrypted block device or flash disk.

Local Caching

To manage block device wear, all client systems are expected to have a sacrificial high-speed encrypted hot storage device (e.g. an SSD) that caches frequently accessed data from the storage solution. This is not a hard constraint in case a better solution is found.
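The caching idea above can be sketched as a small read-through cache. This is only an illustration under assumptions: the class name `HotCache`, the `fetch_from_cold` callback, and the least-recently-used eviction policy are all hypothetical and not something the issue specifies, and encryption of the hot device is not modeled.

```python
from collections import OrderedDict
from typing import Callable


class HotCache:
    """Hypothetical read-through cache standing in for the sacrificial hot device."""

    def __init__(self, capacity: int, fetch_from_cold: Callable[[str], bytes]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity                # max number of cached entries (assumption)
        self.fetch_from_cold = fetch_from_cold  # stand-in for a read from the storage solution
        self.entries = OrderedDict()            # path -> data, least recently used first
        self.cold_reads = 0                     # how often the cold storage was touched

    def read(self, path: str) -> bytes:
        if path in self.entries:
            # Cache hit: mark as most recently used, cold storage is not touched.
            self.entries.move_to_end(path)
            return self.entries[path]
        # Cache miss: read from cold storage and keep a copy on the hot device.
        data = self.fetch_from_cold(path)
        self.cold_reads += 1
        self.entries[path] = data
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry (eviction policy is an assumption).
            self.entries.popitem(last=False)
        return data


# Toy cold storage with three files and a hot cache that holds two of them.
cold_storage = {"a": b"1", "b": b"2", "c": b"3"}
cache = HotCache(capacity=2, fetch_from_cold=cold_storage.__getitem__)
for path in ["a", "b", "a", "c", "a", "b"]:
    cache.read(path)
print(cache.cold_reads, list(cache.entries))
```

In this toy run, the two repeated reads of `a` are served from the cache, so only four of the six reads reach the cold storage.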

Economy

This is for personal use in an environment that is not designed for profit -> the more economical the solution, the better.

Considered High Security Environment + Paranoid setup

The infrastructure management is already designed with as much security and privacy as possible -> it's important that the solution doesn't disturb or limit that.

Energy Efficiency requirement

There is only so much power that my house can generate itself (e.g. through solar panels) without drawing from the grid, so the solution is expected to be very energy efficient.

Solution 1: RAID 5 + Encrypted archiving of parity data on a secure backup + Caching

RAID-5 setup consisting of 3x5TB drives, where the equivalent of one 5TB drive is used for parity. The drives are either mechanical or SSDs, depending on what's more economical.
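As a quick check of the capacity arithmetic, here is a minimal sketch. The helper `raid_usable_tb` is a hypothetical name, and the calculation assumes equal-sized drives and ignores filesystem overhead and the TB/TiB difference.

```python
def raid_usable_tb(drive_count: int, drive_tb: float, parity_drives: int) -> float:
    """Usable capacity of a parity RAID array built from equal-sized drives."""
    if drive_count <= parity_drives:
        raise ValueError("need more drives than parity drives")
    # One drive's worth of space is consumed per parity drive (1 for RAID-5, 2 for RAID-6).
    return (drive_count - parity_drives) * drive_tb


raid5_3x5 = raid_usable_tb(3, 5.0, parity_drives=1)  # the setup described above
raid5_5x5 = raid_usable_tb(5, 5.0, parity_drives=1)  # expanded towards the 20 TB goal
raid6_4x5 = raid_usable_tb(4, 5.0, parity_drives=2)  # RAID-6 needs a 4th drive for the same space

print(raid5_3x5, raid5_5x5, raid6_4x5)
```

Under these assumptions the 3x5TB RAID-5 array yields 10 TB usable, five such drives reach 20 TB, and RAID-6 needs four drives to offer the same 10 TB.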

IF: it's possible to put only the parity data into an encrypted archive and upload it to a 3rd party service, so that the data can always be recovered by recalculating from the parity even if the storage server is destroyed, and a sacrificial hot storage device (e.g. an SSD) caches frequently accessed files to reduce the stress on the cold storage.

THEN: Is this a good solution?

Since we are uploading the parity data to a secure backup that can always be recovered -> it doesn't make sense to do RAID-6, which would only increase the cost.

UPDATE: This doesn't seem to be possible, or to have any advantage over just backing up the whole storage as an encrypted archive sent to a secure backup.
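A simplified sketch of why parity alone is not enough: it models a single RAID-5 stripe with two data blocks and one XOR parity block. Real implementations rotate parity across drives and work on much larger stripes. The helper name `xor_blocks` and the block contents are made up for illustration.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# One stripe: two data blocks plus their parity block.
d1 = b"NiXium01"
d2 = b"storage!"
parity = xor_blocks(d1, d2)

# One drive lost: the missing block is rebuilt from the parity and the surviving block.
recovered_d1 = xor_blocks(parity, d2)

# Storage server destroyed, only the parity survives: a completely different pair
# of data blocks produces exactly the same parity, so the original data cannot be
# told apart from it.
other_d1 = b"AAAAAAAA"
other_d2 = xor_blocks(parity, other_d1)
same_parity = xor_blocks(other_d1, other_d2) == parity

print(recovered_d1 == d1, same_parity)
```

The parity only helps together with all but one of the data blocks, which matches the conclusion above that backing up the whole storage is the simpler route.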

ELSE: Not an option?

Solution 2: bcachefs?

TBD https://bcachefs.org

Solution 3: ZFS + VDEV

because hdds are actually ok at the bulk data part; especially if you stripe them together. It's the random seeking for small blocks like metadata that makes hdds atrocious [and by keeping the metadata on an SSD that manages that problem]

zfs's own encryption feature should be avoided. But sure, a few hdds encrypted with luks and a special allocation vdev (note, vdev is a generic term that applies to all drives in a zfs pool; special allocation vdev is a particular type).

https://github.com/ElvishJerricco/stage1-tpm-tailscale

Relevant links

  1. Level1Tech Video "Hardware Raid is Dead and is a Bad Idea in 2022" -- https://www.youtube.com/watch?v=l55GfAwa8RI
  2. Level1Tech Video "So if Hardware RAID is dead.. then what?" -- https://youtu.be/Q_JOtEBFHDs
  3. Matrix Discussion on the topic in NixOS-Offtopic -- https://matrix.to/#/!sgkZKRutwatDMkYBHU:nixos.org/$6R7L6CpQXg3aWM-iCuNHpD3oOnVTmErxbquUwviGxbg?via=nixos.org&via=matrix.org&via=nixos.dev
Kreyren commented 3 months ago

Might be relevant: https://github.com/spacedriveapp/spacedrive#what-is-a-vdfs