ropensci / git2rdata

An R package for storing and retrieving data.frames in git repositories.
https://ropensci.github.io/git2rdata/
GNU General Public License v3.0

Git LFS #46

Closed (jmarandet closed this issue 4 years ago)

jmarandet commented 5 years ago

Hi,

This is a great project, but I must admit that I haven't given it a try because of the 100 MiB size limitation.

Do you think it could work with Git Large File Storage (https://git-lfs.github.com/)?

ThierryO commented 5 years ago

Technically yes, write_vc() writes files and thus they can be stored using Git Large File Storage.
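An untested sketch of what that could look like from R. The repository location, the `measurements` data frame and the choice to track only `*.tsv` files are assumptions for illustration. It also assumes the git-lfs client is installed and that write_vc() writes its data file with a `.tsv` extension, which may differ between package versions or settings.

```r
# Untested sketch: assumes the git-lfs client is installed and that `root`
# is (or will become) a git repository with LFS enabled via `git lfs install`.
root <- file.path(tempdir(), "lfs_demo")  # hypothetical location
dir.create(root, showWarnings = FALSE, recursive = TRUE)

# Same rule that `git lfs track "*.tsv"` appends to .gitattributes.
lfs_rule <- "*.tsv filter=lfs diff=lfs merge=lfs -text"
writeLines(lfs_rule, file.path(root, ".gitattributes"))

# Hypothetical example data with a unique key to sort on.
measurements <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.7))

if (requireNamespace("git2rdata", quietly = TRUE)) {
  # Writes the data and its metadata as plain text files under `root`.
  git2rdata::write_vc(
    measurements, file = "measurements", root = root, sorting = "id"
  )
}
```

After this, adding and committing `.gitattributes` together with the written files should hand the matching data files to Git LFS, while the small metadata file stays in plain git.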

I'm not sure whether the efficiency in terms of git history will hold for Git LFS. Much will depend on how Git LFS stores the versions of plain text files. If they are stored as row-based diffs (as done in plain git), then the efficiency should hold. If each version is stored as a separate blob, then it will be less efficient. In that case, storing the data frame as an rds file might be more efficient.
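A rough way to explore the second scenario is to compare file sizes. If every version is kept as a whole file, the cost per version is roughly the size of that file. The sketch below uses base R only, with `write.table()` as a stand-in for the plain text output (it is not what write_vc() produces byte for byte) and made-up data. Treat the result as an illustration, not a benchmark.

```r
# Illustrative data only; real data sets may behave very differently.
set.seed(1)
n <- 10000
df <- data.frame(
  id = seq_len(n),
  group = sample(letters, n, replace = TRUE),
  value = round(rnorm(n), 3)
)

tsv_file <- tempfile(fileext = ".tsv")
rds_file <- tempfile(fileext = ".rds")

# Plain text, tab separated (stand-in for a version-controlled text file).
write.table(df, tsv_file, sep = "\t", row.names = FALSE, quote = FALSE)
# Binary R serialisation, gzip-compressed by default.
saveRDS(df, rds_file)

# Size in bytes of one stored version of each format.
sizes <- c(tsv = file.size(tsv_file), rds = file.size(rds_file))
print(sizes)
```

Multiplying each size by the number of committed versions gives a crude upper bound on storage under the "one blob per version" assumption.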

I'm not a Git LFS user myself, so this is not tested yet. If you are willing to test this, please do and let us know whether git2rdata is efficient or not in combination with Git LFS.