purescript / registry-dev

Development work related to the PureScript Registry
https://github.com/purescript/registry
95 stars 80 forks source link

Add git-lfs dependency #651

Closed thomashoneyman closed 1 year ago

thomashoneyman commented 1 year ago

Some invocations of git clone will call out to git-lfs as seen in this failed workflow: https://github.com/purescript/registry/issues/341#issuecomment-1664683011

Adding git-lfs to our Nix environment resolves the issue: https://github.com/purescript/registry/issues/341#issuecomment-1665707661

f-f commented 1 year ago

I'm a little hesitant to add this blindly without understanding why that happens?

Would be unfortunate to accidentally lfs-pull gigabytes of content..

thomashoneyman commented 1 year ago

It's happening in dotenv's case because that repository has explicitly tracked images with lfs instead of including them in the normal repository data. They're not big files — just 3.5kb and 5.4kb:

https://github.com/nsaunders/purescript-dotenv/blob/master/meta/img/readme.png https://github.com/nsaunders/purescript-dotenv/blob/master/meta/img/social-media-preview.png

That said, since they're hosted on GitHub's LFS service, they've got to have git-lfs available to be pulled down. We ultimately will remove these files from the package tarball.

So I suppose the question is: do we allow you to use LFS in your repository? If not, how do we disable it but still fetch your code? Personally I think it's fine that we support it.

Note that even if you did try to store gigabytes of content in your source directory then your package will be rejected: https://github.com/purescript/registry-dev/blob/3166e0982d37d96cfc586fcd0791de937822d2c1/app/src/App/API.purs#L607

f-f commented 1 year ago

Note that even if you did try to store gigabytes of content in your source directory then your package will be rejected

I am not worried at all about the size of the uploads - we have indeed checks in place for that. What I am worried about is the registry server ending up with an accidental DoS because git-lfs automatically pulled down a ton of data and filled the disk. I'd be ok with including lfs in the build if we can make sure that it does not automatically pull down data.

thomashoneyman commented 1 year ago

Well, we can skip downloading any logs files with GIT_LFS_SKIP_SMUDGE=1 applied to the clone command. (Still need git-lfs installed.)

Once we know the include/exclude globs we could pull again with those globs —

git lfs pull --include "<globs>”

…which would bring in any lfs files in the upstream included by the globs.

For now I’m sure we could get away with just skipping lfs files, though.

f-f commented 1 year ago

Well, we can skip downloading any logs files with GIT_LFS_SKIP_SMUDGE=1 applied to the clone command

All good then, I'd say we can apply this envvar, and state somewhere that we do not support files that come with lfs (which seems entirely fair to me)

thomashoneyman commented 1 year ago

See: https://stackoverflow.com/questions/42019529/how-to-clone-pull-a-git-repository-ignoring-lfs

This is implemented in the latest commit.