Closed gzaripov closed 2 years ago
Hello @gzaripov,
I'm not sure I fully understand your use case: If you have already checked all the files into git, then why don't you use git to get all necessary information?
Git already stores everything in a tree structure, and parsing the results from there is likely a better approach than reading the directory contents recursively in Node.js and then retrieving each hash from git separately.
According to the documentation, the starting point would be:
$ git cat-file -p main^{tree}
[...]
100644 blob d630eaf136b09ad2f0acb23c74acde2866433fba CHANGELOG.md
100644 blob ac123679c1a6a1b984df89c0b163155c0733a678 README.md
040000 tree 16511c817968e5e04dda7aa9d6e8cef9b3d07323 bin
040000 tree dd350d245efb452615e2463bf6e1ae18cb8640f8 docs
[...]
Then parse the lines and recursively get the information from the other trees.
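The recursion described above could look roughly like the following. This is only a sketch, written in Python for brevity: the names `read_tree` and `collect_blob_hashes` are made up for illustration, it assumes `git` is on the PATH and the script runs inside the repository, and it does not handle file names that git prints in quoted form.

```python
import re
import subprocess
from typing import Callable, Dict, List

# One line of `git cat-file -p <tree>` output: "<mode> <type> <hash><TAB><name>".
# A single space is also accepted as the separator, as in the listing above.
ENTRY_RE = re.compile(r"^(\d{6}) (blob|tree|commit) ([0-9a-f]{40,64})[\t ](.+)$")


def read_tree(tree_ish: str) -> List[str]:
    """Return the raw entry lines of a tree object by calling git."""
    result = subprocess.run(
        ["git", "cat-file", "-p", tree_ish],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


def collect_blob_hashes(
    tree_ish: str,
    reader: Callable[[str], List[str]] = read_tree,
    prefix: str = "",
) -> Dict[str, str]:
    """Map every file path below `tree_ish` to the blob hash git stores for it."""
    hashes: Dict[str, str] = {}
    for line in reader(tree_ish):
        match = ENTRY_RE.match(line)
        if match is None:
            continue  # skip anything that is not a tree entry
        _mode, kind, object_id, name = match.groups()
        path = prefix + name
        if kind == "blob":
            hashes[path] = object_id
        elif kind == "tree":
            # Descend into the subdirectory using its tree hash.
            hashes.update(collect_blob_hashes(object_id, reader, path + "/"))
        # "commit" entries are submodules and are ignored in this sketch.
    return hashes
```

Calling `collect_blob_hashes("main^{tree}")` would then return a dictionary such as `{"CHANGELOG.md": "d630eaf1...", ...}`. Note that this starts one git process per directory; `git ls-tree -r main` prints all blobs of a commit in one call and may be the simpler route if that output is enough.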
If you are not limited to git, you might want to try Fossil SCM (the tool written for developing SQLite), which is supposed to have better APIs and might make querying easier.
I'm closing this issue because I still think it would be best to avoid duplicate work: rather than retrieving individual hashes from git, query the whole tree structure.
I found that hashing all my repository files takes approx. two minutes. I think I can make it much faster if I use the hash that git has already computed, using `git hash-object`. But right now I can't do that because the lib doesn't provide an option for it. Please add an option for custom hash functions.
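For context on what such a custom hash function would have to produce: git's blob hash is not a plain hash of the file contents but a hash of a short header followed by the contents. A minimal sketch in Python, assuming a repository that uses the default SHA-1 object format and no line-ending or filter conversion (the name `git_blob_hash` is made up for illustration):

```python
import hashlib


def git_blob_hash(data: bytes) -> str:
    """Compute the object id git assigns to a blob with these contents."""
    # Git hashes "blob <size in bytes>\0" followed by the raw contents.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()
```

For unmodified, checked-in files this should match what `git hash-object <file>` prints, so the values can be compared directly with the hashes stored in the tree.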