terascope / teraslice

Scalable data processing pipelines in JavaScript
https://terascope.github.io/teraslice/
Apache License 2.0

[Proposal] Migrate Binary Test Files to Git LFS #3827

Open sotojn opened 1 week ago

sotojn commented 1 week ago

Currently, teraslice stores a lot of binary test fixtures directly in the repo. This inflates our repository size, and Git's native handling of binaries is inefficient, adding a bunch of overhead to each binary file update. The solution would be to move these binary files to Git Large File Storage (Git LFS).

We would all need to install Git LFS, and we would add docs explaining how to add binary files to Git LFS rather than directly to the repository, including the steps for using Git LFS.

We would also need to modify something like our yarn setup script to run git lfs install and git lfs pull, so we always get the binaries we need for testing. This would be easy to introduce because we already run yarn setup before each test in CI.
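As a rough sketch of what that setup step could look like, the helper below runs the two Git LFS commands in order. The name `setupLfs`, the `LFS_STEPS` list, and the injectable `run` parameter are hypothetical and not existing teraslice code; the `--local` flag is an assumption that we would want the LFS hooks configured per repository rather than globally.

```javascript
const { execFileSync } = require('node:child_process');

// Hypothetical list of Git LFS steps for a setup script (not existing teraslice code).
// `git lfs install --local` configures LFS for this repository only;
// `git lfs pull` downloads the LFS objects for the current checkout.
const LFS_STEPS = [
    ['lfs', 'install', '--local'],
    ['lfs', 'pull'],
];

// Runs each step through `run`, which defaults to invoking the real git binary.
// Passing a custom `run` makes it possible to dry-run or test the sequence.
function setupLfs(run = (args) => execFileSync('git', args, { stdio: 'inherit' })) {
    for (const args of LFS_STEPS) {
        run(args);
    }
    // Return the commands that were issued, mainly for logging.
    return LFS_STEPS.map((args) => ['git', ...args].join(' '));
}
```

Calling `setupLfs()` from whatever script backs yarn setup would run both commands against the real repository and throw if Git LFS is not installed, which would surface a missing install early in CI.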

Lastly, it would be a good idea to set up checks or pre-commit hooks to make sure future binary files are automatically added to Git LFS to prevent accidental addition to the main repo.
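One possible shape for such a check, sketched under assumptions: compare the staged files against the output of `git check-attr filter`, which prints one `path: filter: value` line per file, and flag binary-looking files whose filter is not `lfs`. The extension list and the function names below are hypothetical placeholders, not a list of the actual fixture types in the repo.

```javascript
const path = require('node:path');

// Assumed set of binary fixture extensions; placeholder values, adjust to the real fixtures.
const BINARY_EXTENSIONS = new Set(['.gz', '.zip', '.parquet', '.bin']);

// Parses `git check-attr filter -- <files>` output into the set of paths
// whose filter attribute is "lfs". Lines look like "some/file.zip: filter: lfs".
// Note: unusual paths may be quoted by git; this sketch does not unquote them.
function parseLfsTracked(checkAttrOutput) {
    const tracked = new Set();
    for (const line of checkAttrOutput.split('\n')) {
        const match = /^(.*): filter: (.*)$/.exec(line);
        if (match && match[2] === 'lfs') {
            tracked.add(match[1]);
        }
    }
    return tracked;
}

// Returns the staged files that look binary but are not tracked by Git LFS.
function findUntrackedBinaries(stagedFiles, checkAttrOutput) {
    const tracked = parseLfsTracked(checkAttrOutput);
    return stagedFiles.filter((file) => {
        const isBinary = BINARY_EXTENSIONS.has(path.extname(file).toLowerCase());
        return isBinary && !tracked.has(file);
    });
}
```

A pre-commit hook or CI check could collect the inputs with `git diff --cached --name-only --diff-filter=ACM` and `git check-attr filter -- <files>`, then exit non-zero whenever `findUntrackedBinaries` returns a non-empty list.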

godber commented 1 week ago

> Lastly, it would be a good idea to set up checks or pre-commit hooks to make sure future binary files are automatically added to Git LFS to prevent accidental addition to the main repo.

I think this makes this doable ... I hadn't pushed on using LFS because I didn't have a plan to make everyone comply.