NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Support for Git LFS in private repositories #4623

Open FPtje opened 3 years ago

FPtje commented 3 years ago

Nixpkgs PRs https://github.com/NixOS/nixpkgs/pull/105998 and https://github.com/NixOS/nixpkgs/pull/113580 add Git LFS support to the Nixpkgs fetchgit function. The problem with fetchgit, however, is that it does not properly support private repositories. Nix's builtins.fetchGit does support private repositories, but it does not seem to support Git LFS.

Currently, when trying to builtins.fetchGit a repository with LFS, the following happens:

nix-repl> builtins.fetchGit {url = "git@github.com:my_company/private-lfs-repo.git"; rev = "some_rev";}
Downloading some/lfs/file (123 KB)
Error downloading object: some/lfs/file (a123456): Smudge error: Error downloading some/lfs/file (some_rev): batch request: missing protocol: ""

Errors logged to /home/my-user/nix/gitv2/xxx/lfs/logs/20210309T095658.11111111.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
error: program 'git' failed with exit code 128

Ideally, it should be possible to builtins.fetchGit the repo either with or without downloading the LFS files. In one use case, the LFS files are used for non-vital things, like tests or documentation. The nix derivations do not depend on those files. Not downloading the LFS files would save space. In another use case, the LFS files are needed to build the derivations, and should therefore be downloaded.

It is possible to export GIT_LFS_SKIP_SMUDGE=1 to accomplish the first use case (i.e. fetch a private LFS repository without actually downloading the LFS files), but it would be much nicer to have it as an option of the builtins.fetchGit function.

roberth commented 3 years ago

#4635 has the potential to fix the first use case by default.

> the LFS files are used for non-vital things, like tests or documentation.

Did you configure LFS globally in your git user config? I now realize git global user config may affect more places than what I've found with my testing.

FPtje commented 3 years ago

> Did you configure LFS globally in your git user config?

Yes, the following section is present in ~/.gitconfig:

[filter "lfs"]
        clean = git-lfs clean -- %f
        smudge = git-lfs smudge -- %f
        process = git-lfs filter-process
        required = true

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

arximboldi commented 3 years ago

For some projects I am working on, LFS is crucial. I hope this gets solved soon.

stale[bot] commented 2 years ago

I marked this as stale due to inactivity.

arximboldi commented 2 years ago

Still relevant.

reivilibre commented 2 years ago

This error also pops up when using a git repository that uses LFS as a flake input, or seemingly even just by having a flake in a repository with LFS (cf. https://github.com/NixOS/nixpkgs/issues/137998). I didn't expect it, but export GIT_LFS_SKIP_SMUDGE=1 also seems to work around the problem with flakes, as long as you don't care about the LFS files.

silky commented 1 year ago

but .... what if you do care about the LFS files .... 🥲 🥲 🥲 🥲 🥲 🥲 🥲

roberth commented 1 year ago

I think the plan for this would be

janvogt commented 11 months ago

GitLab now forces free users to use LFS in many cases, so I guess this will become a lot more relevant.

SomeoneSerge commented 9 months ago

AFAIU, this is not specific to private repos:

builtins.fetchGit {
  url = "https://huggingface.co/openlm-research/open_llama_3b";
  rev = "141067009124b9c0aea62c76b3eb952174864057";
};

...fails in the same way:

...
Downloading pytorch_model.bin (6.9 GB)
Error downloading object: pytorch_model.bin (9ffd42d): Smudge error: Error downloading pytorch_model.bin (9ffd42dc58c4f49154e98bc7796306fde40febef278e99636a240a731d626a4a): batch request: missing protocol: ""

Errors logged to '/home/.../.cache/nix/gitv3/14avjqj1kcsaj6025lqgbr5r4yz680zmj1xzppc13cgxx12i8dj3/lfs/logs/20231227T021723.995860432.log'.
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: pytorch_model.bin: smudge filter lfs failed
error:
       … while calling the 'fetchGit' builtin
...

newAM commented 9 months ago

@SomeoneSerge for huggingface this worked for me:

fetchgit {  # from `pkgs`, not `builtins`, may not matter?
  url = "https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2";
  rev = "b70aa86578567ba3301b21c8a27bea4e8f6d6d61";
  hash = "sha256-IAe/tHFB7yqFRF5aRojkNCD8TbKj8XQMt6eEyPmr4HU=";
  fetchLFS = true;
}

nixos-discourse commented 7 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/flake-lfs-input/40184/2

bratorange commented 2 months ago

Is there currently a workaround for fetching Nix flake inputs with LFS?

tengkuizdihar commented 3 weeks ago

@bratorange I think the only way is to create a tar.gz that includes all of the LFS files

lriesebos commented 2 weeks ago

> @bratorange I think the only way is to create a tar.gz that includes all of the LFS files

@tengkuizdihar do you know if there is a way to do that directly through a GitHub/GitLab link? Also, I assume that does not allow things like SSH authentication in the case of private repos.

tengkuizdihar commented 2 weeks ago

nope no idea

roberth commented 2 weeks ago

If you are fetching another repo, you could use fetchgit from Nixpkgs, which produces fixed-output derivations. (This means adding a hash attribute, and accepting Import From Derivation if you need expressions from there, builtins.readFile, etc.)
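
A minimal sketch of that fetchgit route, for illustration only. The URL, rev, and README.md file name are hypothetical placeholders, and lib.fakeHash is a stand-in that has to be replaced with the hash Nix reports on the first (failing) build. fetchLFS = true is the Nixpkgs fetchgit option already shown earlier in this thread.

```nix
# Sketch under stated assumptions: url, rev and the file name are placeholders.
{ pkgs ? import <nixpkgs> { } }:

let
  src = pkgs.fetchgit {
    url = "https://example.com/some-org/some-lfs-repo.git"; # hypothetical repo
    rev = "0000000000000000000000000000000000000000";       # placeholder commit
    hash = pkgs.lib.fakeHash; # replace with the hash printed by the failed build
    fetchLFS = true;          # also download the LFS objects
  };
in
{
  # The fetched tree as a fixed-output derivation.
  inherit src;

  # Reading from src during evaluation forces it to be built first,
  # i.e. this attribute relies on Import From Derivation.
  readme = builtins.readFile "${src}/README.md";
}
```

Building only the src attribute avoids Import From Derivation; evaluating readme does not, because Nix has to realise src before it can read the file.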

For local LFS files in flakes, the only option is for libfetchers to support it. @b-camacho has a WIP PR; maybe he could use some help: