Open Kreyren opened 3 months ago
Hey @flokli, can you please elaborate on the state of tvix? Thanks! <3
Hey @Kreyren, you're welcome to follow the state of Tvix on the various blog posts, changes in the repository, and other public information - I assume you understand I cannot provide individual status updates in various issues elsewhere. Thanks!
@flokli will read through them, thanks for info!
@TanvirOnGH Anything relevant on the state of tvix and its functional implementation is welcome. If it's just a drop-in replacement for the current Nix daemon, with at worst a few unimportant features missing, then it's actionable.
Note that the backend has higher priority rn as I am working on a solution to replace the Amazon-hosted cache used by nix with decentralized distribution via BitTorrent or alike, so that when a user requests the cache it would be fetched via what is basically a mesh network that distributes the files bit by bit.
relevant: https://github.com/NixOS/nix/issues/859
ideally BitTorrent over I2P or some similar solution for security and privacy?
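A rough sketch of the piece-wise idea, assuming a BitTorrent-like scheme where a cached file is split into fixed-size pieces and every piece is checked against a known hash before it is accepted from a peer. The names `piece_hashes` and `verify_piece`, the piece size and the choice of SHA-256 are illustrative assumptions, not something nix or any existing cache implements this way:

```python
import hashlib


def piece_hashes(data: bytes, piece_size: int = 256 * 1024) -> list[str]:
    """Split data into fixed-size pieces and return the SHA-256 hex digest of each."""
    if piece_size <= 0:
        raise ValueError("piece_size must be positive")
    return [
        hashlib.sha256(data[offset:offset + piece_size]).hexdigest()
        for offset in range(0, len(data), piece_size)
    ]


def verify_piece(piece: bytes, expected_hex: str) -> bool:
    """Check a piece received from an untrusted peer against its expected digest."""
    return hashlib.sha256(piece).hexdigest() == expected_hex
```

The list of digests would have to come from a trusted source (for example signed metadata), so that pieces themselves can be fetched from any peer on the mesh.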
This issue is a work in progress.
This issue ignores the content relevant to https://github.com/Kreyren/kreyren/issues/111, as that is still ongoing and currently can't be used for an objective review.
The idea behind this meta-issue is to permanently address the unfixable issues in both GNU Guix and NixOS by implementing a nix-based distribution that addresses the observed issues:
GNU Guix GNU/Linux:
Organization-wise the project is very mismanaged, resulting in a very buggy and unreliable operating system that in production requires more babysitting than is feasible.
The Lisp implementation used is GNU Guile, which as observed is very over-complicated and limited in functionality in comparison to NixOS.
Their community management is insufficient as they expect everyone to use IRC and submit patches via e-mail, while being very toxic against any mention of proprietary software, including those that can't be managed in a reasonable time, e.g. Intel microcode, wifi drivers, etc.
Their source code management is outdated as they lack any kind of CI/CD implementation and use GNU Savannah as the git forge, which is very minimal and not friendly to new developers.
NixOS
The Nix language is a terrible implementation for its designed workflow: it constantly runs into infinite recursion issues when the code gets more complex, and these take an unreasonable amount of human resources to manage, e.g. https://discourse.nixos.org/t/how-to-correctly-implement-release-flexible-nixos-modules/49869 encountered at https://github.com/NiXium-org/NiXium/pull/124#issuecomment-2258539202 and as a result the current source code is still affected by it https://github.com/NiXium-org/NiXium/blob/35dc1a258134234f1601c6124bd4881ef1ba7567/src/nixos/machines/tupac/config/disks.nix#L29-L30.
Additionally the Nix language is nowhere near the flexibility and functionality of scheme-based languages, resulting in code that is more complex than it has to be and that is often very difficult to make work for the desired workload, e.g. trying to get a different package version often requires either making a custom package definition, as the derivation providing the package is inherently not compatible with different versions of the package, or pinning a nixpkgs commit that provides the version.
Their community management forces cultures that hate each other into collaborative work without acting as the required impartial judge to manage it, which is prone to constantly cause conflicts. These conflicts are managed insufficiently, as evident from the Anduril crisis, with the current NixOS management refusing to learn from these mistakes and causing an exodus of maintainers and developers.
That said, there are things done right which this issue aims to learn from and take inspiration from, namely:
Deep integration of GNU Guile with GNU Guix
The use of a proper frameworking language enables the distribution to function as the de facto borg that can be infinitely expanded with built-in functionality instead of trying to make two components from two different worlds work with each other.
Namely the integration of the init/service manager, GNU Shepherd on Guix and systemd on NixOS:
As observed, GNU Shepherd can integrate without having to duplicate functions and in a way that can directly use the features from either Guix or Shepherd seamlessly, resulting in a significantly smaller footprint and a more functional integration that doesn't limit innovation like e.g. https://github.com/NixOS/nixpkgs/pull/324911#issuecomment-2274487337.
In comparison, the systemd used in NixOS is bound through an API and is very overcomplicated, with features that NixOS will likely never use and that only increase the risk of security vulnerabilities and make it harder to adapt to the workload.
Guix's implementation also extends to 3rd-party services such as gitile, which is a gitea-inspired forge, de facto giving the package manager a built-in git forge, or even an authoritative DNS server.
NixOS has better management
NixOS is funded by the community through an Open Collective, which is done in a transparent way, and thus does not have major issues with funding and resources like Guix.
Nix also has CI/CD to catch common issues and is overall in another league in terms of reliability as an acceptable solution for production and mission-critical environments, and is thus the objectively better solution despite making worse design decisions in comparison to Guix.
Considered lisp implementations
Common-Lisp ("CL")
This was already attempted by community member infinisil: https://github.com/infinisil/nixlisp, but CL is not a good fit for a functional language as its expressions are aimed more at object-oriented use and have to be integrated in a very complex way in comparison to scheme.
GNU Guile
Overall not a bad option, but it's lacking in integration with IDEs and emacs in terms of documentation, making the language painfully difficult to learn and use correctly. Additionally it lacks functionality needed for this task, e.g. it's not possible to attach a docstring with a default value to a variable, and upstream devs do not seem to be interested in trying to make the documentation better either.
So for our use case it would have to be forked, and it would be a lot of work to re-integrate the documentation to be at least on elisp levels.
Steel
TBD https://github.com/mattwparas/steel/issues/259
Tvix is a Nix rewrite in Rust - https://github.com/tvlfyi/tvix
The state and usability of the project are unknown
Current major problem is that they use GPLv3 license, which might limit our use.
Infrastructure Management
Required roles
This is a writeup of the required roles; some systems will be able to fill multiple roles at the same time.
Compute Server
NiXium is currently oriented around thin clients that are focused on battery life at the cost of performance, with a minimal set of features required to be usable as thin clients that rely on a remotely accessible server for compute such as compilation and other related tasks (blender rendering, etc.).
This server is expected to be very power hungry so we need something that can suspend itself when it's not in use.
Current Research Device: Morph.
Control Server
Power-efficient always-on server that is used to send commands to the other devices on its local network, e.g. wake-up commands to the compute server.
Additionally the control server can be used to handle power-efficient tasks like Home Assistant.
Current Research Device: Mracek
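One common way to send such a wake-up command is Wake-on-LAN; this issue does not say which mechanism will be used, so treat the following as a minimal sketch under that assumption. `build_magic_packet`, `wake`, the broadcast address and the port are illustrative choices, and the compute server's network card and firmware would need Wake-on-LAN enabled:

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 bytes of 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP on the local network."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

The control server would call e.g. `wake("00:11:22:33:44:55")` (placeholder MAC) when a thin client requests compute, and the compute server would suspend itself again once idle.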
Storage Server
System with expandable storage connected to the local network to provide storage access to all relevant systems, and remotely if need be.
Kreyren's Personal Hardened Thin Client
Super Administrator's device that is hardened and is used to control the infrastructure.
Current Research Device: Tsvetan (Not yet submitted in central branch)
Tsvetan, aka the OLIMEX Teres-I, was selected as it's an Open-Source Hardware device that runs Open-Source Software and Firmware, making it very flexible for various implementations. The issue is that it has very slow storage and a low amount of RAM, making it sub-optimal for this use.
The current plan is to implement the System on Module standard that was finished on 8th November 2024 to make it economical and efficient to fabricate in a hackerspace environment, with the ability to change the BGA chips.
OLIMEX open-sourced the iMX8MP SOM (https://www.olimex.com/Products/SOM/NXP-iMX8/iMX8MP-SOM-4GB-IND/open-source-hardware), which seems to be sufficient for our use, but consult with manufacturers for options.
The ideal solution would be a Getac-like rugged system (https://youtu.be/7-ikjUWJ4Vs) with hot-swappable batteries of 2x99Wh, two MXM slots for dedicated GPUs and an ARM CPU (RISC-V is considered too much of a liability rn).
Ideally dual RTX 4090M that are underclocked unless the device is connected to an external water cooler, or maybe dual Intel A380M configured as multi-GPU.
To be moved into separate tracking.
AI Server
TBD