@mark-english I'm curious what kind of files those would be. Moving this to a separate issue for discussion.
@willbreitkreutz BLUF - I was able to handle the files during the migration, usually by splitting them into multiple smaller files and then removing the original large file from history with `git filter-branch`.
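For anyone hitting the same thing, a minimal sketch of that split-then-purge approach (file names and paths here are illustrative, and note that git now recommends `git filter-repo` over `filter-branch` for history rewrites):

```sh
# Split a large file into ~50 MB chunks before committing.
split -b 50m survey_data.csv survey_data.csv.part-

# Commit the chunks, then purge the original from every commit.
# This is the classic (now deprecated) filter-branch invocation.
git filter-branch --index-filter \
  'git rm --cached --ignore-unmatch data/survey_data.csv' \
  --prune-empty -- --all
```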
The large files are not application code. They are files related to user data.
The ORM database manages a large amount of spatial data. One service the ORM team provides is integrating spatial data gathered outside of the ORM application into the database. Users provide data as spreadsheets, CSVs, zip files, etc., and as a matter of routine we keep track of what was provided, the database process used to enter or modify the data, and the end result of that process. In some cases the spatial data provided can be rather large, and/or the log files generated by the database process can be rather large. During the migration I came across 5 such cases.
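In case it helps others doing a similar migration, one common way to find oversized blobs anywhere in a repo's history is to walk the object list and filter by size (the 50 MB threshold below matches GitHub's per-file warning limit; GitHub hard-rejects files over 100 MB):

```sh
# List every blob in history over 50 MB, largest last.
git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  awk '$1 == "blob" && $3 > 50 * 1024 * 1024 {print $3, $4}' |
  sort -n
```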
The git-lfs plugin takes care of this (and GitHub supports it). There is a `git lfs migrate` command that can rewrite history as well.
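Roughly, assuming the offenders are zips and CSVs (adjust the patterns to the actual files):

```sh
# Rewrite all refs so matching files become LFS pointers.
git lfs migrate import --include="*.zip,*.csv" --everything

# History was rewritten, so pushing requires force.
git push --force-with-lease --all
```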
@rthadr so, git-lfs does work, but we should talk; I didn't realize you all were already using it. The org has a fairly small cap for large file storage, above which we get billed on top of the included costs, and right now we're already at 3x that cap...
I just want to make sure we're not using GitHub for archiving or backing up data; it should be focused on code. For data we should be using S3 / Glacier for long-term archival.
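For reference, pushing an archive straight into a cold storage class is a one-liner with the AWS CLI (bucket and key below are hypothetical placeholders):

```sh
# Upload directly to a Glacier-tier storage class.
aws s3 cp survey_data_2019.zip \
  s3://orm-data-archive/spatial/survey_data_2019.zip \
  --storage-class DEEP_ARCHIVE
```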
If we have larger binary files that do make sense to include in repositories, we just need to make sure we identify those needs and track them, since there are cost implications.
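One way to keep that usage explicit and auditable: declare LFS patterns per repo (the `*.gdb` geodatabase pattern here is just an example), and check what LFS is actually storing:

```sh
# Writes the patterns to .gitattributes; commit that file so the
# policy is visible in code review.
git lfs track "*.gdb" "*.zip"

# Audit what's stored in LFS for this repo, with sizes.
git lfs ls-files --size
```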
@mark-english meant to reply sooner, thanks for the update, glad you were able to get around the issue.
Originally posted by @mark-english in https://github.com/cwbi-apps/access-request/issues/22#issuecomment-2088463329