haskell-nix / hnix-store

Haskell implementation of the Nix store
Apache License 2.0
83 stars 23 forks source link

Implement AddStoreToNar operation #265

Closed squalus closed 7 months ago

squalus commented 8 months ago

My use case is that I have a bunch of NARs and narinfos and I'd like to use hnix-store-remote to add the NARs to the Nix Store. To do this, hnix-store-remote needs to support the addToStoreNar operation. It also must not read the entire NAR into memory, since the NARs can be quite large.

I have some code that does this with conduit. I imagine using ByteString.Lazy would be a better fit for this project though.

addToStoreNar support was removed in #75

sorki commented 8 months ago

The current addToStore is actually addToStoreNar. There was only some renaming I think as the old addToStore for adding without NARs is no longer supported by Nix.

Regarding conduit - I think this can be abstracted enough so that anything goes. I'll get to this sometimes this week when I'm done with the protocol stuff.

squalus commented 8 months ago

The current addToStore in Remote.hs uses the AddToStore (opnum 7) operation, not AddToStoreNar (opnum 39). Unless I'm reading something wrong.

https://github.com/haskell-nix/hnix-store/blob/d8e337513f66735dae3048f1ca2f0cf0af7eeb67/hnix-store-remote/src/System/Nix/Store/Remote.hs#L89 https://github.com/haskell-nix/hnix-store/blob/d8e337513f66735dae3048f1ca2f0cf0af7eeb67/hnix-store-remote/src/System/Nix/Store/Remote/Protocol.hs#L95

squalus commented 8 months ago

I have some partial work and a failing test here: https://github.com/haskell-nix/hnix-store/compare/master...squalus:hnix-store:addtostorenar?expand=1

The initial command part is mostly there. The streaming isn't there yet. There's a loop that needs to happen where nix-daemon returns Read wantedBytes, and the client is supposed to send up to wantedBytes at a time, until the nar is fully sent. Right now it fails when Read is first returned.

sorki commented 8 months ago

The current addToStore in Remote.hs uses the AddToStore (opnum 7) operation, not AddToStoreNar (opnum 39). Unless I'm reading something wrong.

https://github.com/haskell-nix/hnix-store/blob/d8e337513f66735dae3048f1ca2f0cf0af7eeb67/hnix-store-remote/src/System/Nix/Store/Remote.hs#L89

https://github.com/haskell-nix/hnix-store/blob/d8e337513f66735dae3048f1ca2f0cf0af7eeb67/hnix-store-remote/src/System/Nix/Store/Remote/Protocol.hs#L95

You're right! I've confused this with AddToStore vs AddTextToStore

I have some partial work and a failing test here: https://github.com/haskell-nix/hnix-store/compare/master...squalus:hnix-store:addtostorenar?expand=1

The initial command part is mostly there. The streaming isn't there yet. There's a loop that needs to happen where nix-daemon returns Read wantedBytes, and the client is supposed to send up to wantedBytes at a time, until the nar is fully sent. Right now it fails when Read is first returned.

Neat! Read is there but wasn't ever used - I'll keep this in mind, it's another thing that should be abstracted away to allow e.g. conduit or pipes or whatnot. Kind-of related to #52 and #63

sorki commented 7 months ago

I think I've unblocked this, see https://github.com/haskell-nix/hnix-store/commit/57cc9e360934301a4f9ac9f0ff7d6d920f3d2cce

It is implemented in similar manner as NarSource for add to store - https://github.com/haskell-nix/hnix-store/blob/master/hnix-store-remote/src/System/Nix/Store/Remote/Client/Core.hs#L66-L77 as these are just a couple of special cases.

Implementing the ops is a bit different now (but not too different!) - you need to implement both sides so QuickCheck prop can check it (also don't forget to add it to the Arbitrary StoreRequest instance).

Feel free to ask if anything is not clear, I can write a longer "guide" iff needed.

sorki commented 7 months ago

It also must not read the entire NAR into memory, since the NARs can be quite large.

The problem here seems to be that the op requires FramedSink (as of protocol version >= 1.23), in remote we claim 24 now. The FramedSink simply sends the size of the contents ahead of time so the FramedSource on the other side knows how many bytes are left to read. But that means fully dumping the NAR somewhere to get its size, preventing streaming it like we do now. Same problem with addToStore for >= 1.25. The newer addToStore is the only thing blocking the supported version bump so I'll take a look soon^tm.

We could possibly be smart about this and do in-memory if dumping small stuff and go thru files for big NARs.

squalus commented 7 months ago

For AddToStoreNar, we already need the full data size in the form of the Metadata's narBytes field.

FramedSource itself doesn't need the full data size up front. It looks like it repeatedly reads (chunk size, bytes) pairs. The chunk size is chosen by the client. I made a PR with a passing test that uses FramedSource.

sorki commented 7 months ago

For AddToStoreNar, we already need the full data size in the form of the Metadata's narBytes field.

FramedSource itself doesn't need the full data size up front. It looks like it repeatedly reads (chunk size, bytes) pairs. The chunk size is chosen by the client. I made a PR with a passing test that uses FramedSource.

Cool! I've read the C++ framing thing wrong and @Ericson2314 corrected me - it would be a poor framing otherwise!