Make `.narinfo` files JSON

Ericson2314 commented 4 days ago

I don't like the line-oriented format. It is hard to extend, and narinfo / "path info" in particular is a bad format that combines too many separate concerns into one hodgepodge.

If we upload a *.narinfo.json file too alongside the old one, we can support old and new Nix and eventually drop the old format. And we can also evolve the JSON format over time.

This is, strictly speaking, orthogonal to CA derivations. But the main motivation for reworking the path info format is a better division of labor between "what is the data" and "where did this data come from", and that division is largely enabled by CA derivations.

edolstra commented 3 days ago

> If we upload a *.narinfo.json file too alongside the old one

No, we shouldn't do that. Then JSON-capable clients would have to query .narinfo.json first and then .narinfo if it doesn't exist, requiring two HTTP requests.

A simpler approach: if the first character of the .narinfo file is {, then it shall be interpreted as a JSON object.
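
A minimal sketch of that first-character rule, for illustration only. The function name `parse_narinfo` is hypothetical, and the legacy branch is simplified (repeated keys such as multiple `Sig` lines keep only the last value); this is not how Nix itself implements parsing.

```python
import json


def parse_narinfo(text: str) -> dict:
    """Parse a narinfo body, choosing the format from its first character."""
    if text.startswith("{"):
        # Proposed rule: a leading "{" means the whole file is one JSON object.
        return json.loads(text)

    # Otherwise treat it as the existing line-oriented "Key: value" format.
    fields = {}
    for line in text.splitlines():
        if not line:
            continue
        # Split on the first colon only, since values may contain colons
        # themselves (e.g. "sha256:...").
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed narinfo line: {line!r}")
        fields[key] = value.strip()
    return fields
```

Since no key in the line-oriented format starts with `{`, the check cannot misclassify an existing file, and it needs no extra request.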

But realistically, it's about 10 years too late to propose this change. We have to keep support for the old format anyway, so adding a new format doesn't actually provide any benefits.

Ericson2314 commented 3 days ago

> We have to keep support for the old format anyway, so adding a new format doesn't actually provide any benefits.

We do, but other software doesn't. Lots of people are writing other binary caches, other Nixes, etc., and I want to make that as easy as possible.

For the rest of the comment, yes there is a tradeoff between always avoiding 2 HTTP gets, and having the smoothest migration. But I say we simply decide which of those we care about more.

I am not so worried about the 2 HTTP gets as we do a lot of requests already (and doesn't HTTP 2+ help with this?), and over time most people will be downloading new stuff.

roberth commented 3 days ago

"Make it JSON" could be done in a couple of steps. I like Eelco's suggestion of only changing the file contents, and not adding extra "files" to the binary cache interface.

1. Add a `JSON` field, such that `Foo: Bar\nJSON: {"a":"b"}` is equivalent to e.g. `{"foo":"Bar", "a":"b"}`. Extensions to the .narinfo format can now use semi-structured data :tada:
2. Make Nix capable of reading nice, fully-json .narinfo files that start with `{` (no `JSON:` field)
3. Add a flag to always generate fully-json narinfos. Users of Nix can now migrate/deprecate/partially-implement/etc.
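
The first step could look roughly like the sketch below. It rests on assumptions: the function name `parse_hybrid_narinfo` is hypothetical, and the key mapping (lower-casing only the first letter, so `Foo` becomes `foo`) is inferred from the single example above rather than from any agreed specification.

```python
import json


def parse_hybrid_narinfo(text: str) -> dict:
    """Parse a line-oriented narinfo whose optional JSON field is merged in."""
    fields = {}
    for line in text.splitlines():
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed narinfo line: {line!r}")
        value = value.strip()
        if key == "JSON":
            # The JSON field carries an object whose members are merged
            # into the result alongside the ordinary fields.
            extra = json.loads(value)
            if not isinstance(extra, dict):
                raise ValueError("JSON field must contain an object")
            fields.update(extra)
        else:
            # Assumed mapping: "Foo" -> "foo" (first letter lower-cased).
            fields[key[:1].lower() + key[1:]] = value
    return fields
```

For the example given above, `parse_hybrid_narinfo('Foo: Bar\nJSON: {"a":"b"}')` yields the same dictionary as parsing `{"foo":"Bar", "a":"b"}` directly.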

> I am not so worried about the 2 HTTP gets

I'd rather spend the inevitable rate limiting budgets on speculative queries that actually benefit latency at the closure scale, instead of hoping for an insignificant regression in performance.

Ericson2314 commented 3 days ago

> Make Nix capable of reading nice, fully-json .narinfo files that start with { (no JSON: field)

Can we agree on starting with that? That way, even if it is a while before we do anything in the write direction, by then many already-released versions of Nix will be ready in the read direction.

roberth commented 3 days ago

cc @domenkozar

flokli commented 15 hours ago

There already is text/x-nix-narinfo as Content-Type being sent, so figuring out the type could be done using that, rather than having to peek at the first byte.
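
A rough sketch of what header-based detection might look like. Only `text/x-nix-narinfo` comes from the existing protocol; using `application/json` for a JSON narinfo is an assumption made for illustration, as is the function name `narinfo_format`.

```python
from typing import Optional


def narinfo_format(content_type: Optional[str]) -> str:
    """Pick a narinfo parser from the HTTP Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        # Assumed media type for a JSON narinfo; nothing has been agreed.
        return "json"
    if media_type in ("text/x-nix-narinfo", ""):
        # Existing type, or no header at all: assume the current format.
        return "legacy"
    raise ValueError(f"unexpected narinfo Content-Type: {content_type!r}")
```

This only helps for caches served over HTTP by a server that sets the header; a plain directory of files has no Content-Type to inspect.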

I'm however really not convinced adding more variability buys us much here.

- Other implementations also still need to support reading NARInfo in the current format to be able to interpret existing store paths. The old ones are not going to magically rewrite themselves, and some store paths (like FODs for sources of rarely updated software) stay around for a long time.
- Binary caches cannot start sending JSON NARInfo without making sure clients support it. This means either waiting a few years after this change is in, or doing some content negotiation, taking away the simplicity of the current approach.
- This format has been out in the wild for 10 years; there are parsers and writers in Rust, Golang, Haskell and probably a few more. Parsing and writing these files is a bit inconvenient, but the community ended up making these formats accessible, and that's not the problem anymore.

If we were to change this stuff, I'd rather introduce a new version of the binary cache protocol (we have nix-cache-info to signal that). And then change a few more things, including fields, signatures and probably a few more things, so it's a more breaking change for sure.

Ericson2314 commented 11 hours ago

@flokli Well not every project is going to care about existing binary caches / existing stuff. I want to make it so you can just support the latest nix / stuff, and implement a lot less. It might not matter for Tvix which has already paid the sunk cost, but it could matter for other things. I want to blur the line between Nix-compatible things and Nix-like things.

> If we were to change this stuff, I'd rather introduce a new version of the binary cache protocol (we have nix-cache-info to signal that). And then change a few more things, including fields, signatures and probably a few more things, so it's a more breaking change for sure.

I would be more than happy to do that too.

NixOS / nix

Make `.narinfo` files JSON #11898