insight-lane / crash-model

Build a crash prediction modeling application that leverages multiple data sources to generate a set of dynamic predictions we can use to identify potential trouble spots and direct timely safety interventions.
https://insightlane.org
MIT License

Resolve GitLFS breakages #209

terryf82 closed 5 years ago

terryf82 commented 5 years ago

@j-t-t @shreyapandit @alicefeng

The symptom of this problem is that git reports the city-specific archives in data_zips as changed when switching between branches, even though those files have not been modified.

This thread is a couple of years old now, but it seems to pretty closely describe the issues we're having. Suggested remediation when you hit this issue is:

  1. Confirm both branches recognise the files as LFS-tracked using git ls-tree.

  2. Diff the files to see if they're recognised as binary.

  3. If they are binary, run git lfs install (which looks to be a separate step from actually "installing" git lfs; I'm pretty sure I fell into this trap on my work computer). Confirm it's installed by checking the git config as described.
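
For anyone who wants the three steps as commands, here is a rough sketch rather than the exact procedure from the linked thread. The function names, branch names and archive path are placeholders I made up. It assumes that a blob stored through LFS shows up in git ls-tree -l as a small text pointer (roughly 130 bytes), and that git lfs install writes the filter.lfs.clean and filter.lfs.smudge settings into the git config.

```shell
# Sketch only: function names, branch names and the archive path are placeholders.

# Steps 1 and 2: show how a branch stores a path and whether git diffs it as binary.
inspect_path() {
    branch=$1
    file=$2
    # -l adds the blob size; an LFS pointer is a text blob of roughly 130 bytes,
    # while a zip committed directly is usually far larger.
    git ls-tree -l "$branch" -- "$file"
    # For a file git treats as binary, --numstat prints "-" instead of line counts.
    git diff --numstat -- "$file"
}

# Step 3: succeed only if the clean and smudge filters that
# `git lfs install` writes are present in the git config.
lfs_filters_configured() {
    git config --get filter.lfs.clean >/dev/null &&
        git config --get filter.lfs.smudge >/dev/null
}

# Example, run from the repository root:
#   inspect_path master data_zips/example.zip
#   inspect_path my-branch data_zips/example.zip
#   lfs_filters_configured || git lfs install
```

If lfs_filters_configured fails even though the git-lfs binary is on the machine, that would match the trap described in step 3.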

If you get through all that and are still experiencing issues, let me know. Thanks.

terryf82 commented 5 years ago

It's possible that we may need to resort to using git lfs migrate to fix commits that have already been pushed, but I'd like to hear if the above ideas help others first (they haven't worked fully for me).

j-t-t commented 5 years ago

Following those steps enables me to switch branches without encountering the error I had before. However, git still shows those files as modified (regardless of what branch I'm on), and when I try to do a git checkout on them, I get "Encountered 3 file(s) that should have been pointers, but weren't", which is a message I have seen before...
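
One way to see which committed archives are real pointers is to look at the first line of each blob. This is a sketch under assumptions, not something confirmed in this thread. It relies on the Git LFS pointer format starting with a "version https://git-lfs.github.com/spec/v1" line, and the function names and the data_zips default are only illustrative.

```shell
# Succeeds if the data on stdin begins with the Git LFS pointer header line.
is_lfs_pointer() {
    first_line=
    # Accept a final line without a trailing newline; fail on empty input.
    IFS= read -r first_line || [ -n "$first_line" ] || return 1
    case $first_line in
        "version https://git-lfs.github.com/spec/v1"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print every path committed under a directory, marked as "pointer" or "content".
# The default directory is an assumption based on this thread.
report_lfs_state() {
    dir=${1:-data_zips}
    git ls-tree -r --name-only HEAD -- "$dir" | while IFS= read -r file; do
        if git cat-file -p "HEAD:$file" | is_lfs_pointer; then
            echo "pointer  $file"
        else
            echo "content  $file"
        fi
    done
}
```

Running report_lfs_state from the repository root on each branch should list any zip marked "content". Those would be candidates for the files the error message is complaining about.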

terryf82 commented 5 years ago

I can't say for certain, but it sounds like at some point the zips have been committed without LFS being used. The way to fix this looks to be the migrate subcommand I linked to above. Here's the gist of it:

git lfs migrate import will repair your repository by migrating any blobs that should be stored with LFS to be so. You can push this to a remote via git push --all origin, or git push --all --force origin if you migrated commits that are already present on the remote.
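
Since this rewrites history, a dry run first seems sensible. The wrapper below is only a sketch. The run and migrate_zips_to_lfs helpers, the DRY_RUN variable and the data_zips/*.zip pattern are my own assumptions. The --everything and --include options are taken from git lfs migrate import as I understand it, so check them against the installed version before relying on this.

```shell
# Print commands instead of running them unless DRY_RUN is set to 0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

# Rewrite matching blobs on all refs into LFS pointers, then force push.
# The default pattern is an assumption based on the data_zips directory.
migrate_zips_to_lfs() {
    pattern=${1:-data_zips/*.zip}
    run git lfs migrate import --everything --include="$pattern"
    run git push --all --force origin
}

# Preview:  migrate_zips_to_lfs
# Execute:  DRY_RUN=0; migrate_zips_to_lfs
```

Anyone with an existing clone would need to re-clone or reset after the force push, because the migrated commits get new hashes.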

Force push has always made me a little nervous, but I expect it's what we need to do in this case. Unless we decide now that git LFS isn't working for this project and we want to look at other options. What do you think @j-t-t ?

terryf82 commented 5 years ago

The files managed by Git LFS have been removed from the repo, and Git LFS will no longer be used to manage data. For now we're reverting to using data.world.

@j-t-t @bpben @alicefeng