ajnavarro closed 5 years ago
Because of the way the commit graph needs to be traversed, we cannot calculate the generation of a commit lazily as we compute the ref_name table rows: a commit's generation depends on the generations of all its ancestors, so we need to go all the way to the roots and back to calculate it.
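To make the traversal cost concrete, here is a minimal sketch of the computation, assuming the rule from the commit-graph format documentation: a root commit has generation 1, and any other commit has one more than the highest generation among its parents. The `generations` function and the in-memory `parents` map are hypothetical names for illustration only; they are not part of gitbase or go-git.

```go
package main

import "fmt"

// generations computes the generation number of every commit in parents,
// a hypothetical in-memory map from commit hash to parent hashes.
// A root commit gets 1; any other commit gets 1 + the highest generation
// among its parents. The history is assumed to be acyclic.
func generations(parents map[string][]string) map[string]int {
	gen := make(map[string]int, len(parents))
	for start := range parents {
		if _, done := gen[start]; done {
			continue
		}
		// Explicit stack instead of recursion, since histories can be deep.
		stack := []string{start}
		for len(stack) > 0 {
			c := stack[len(stack)-1]
			if _, done := gen[c]; done {
				stack = stack[:len(stack)-1]
				continue
			}
			maxGen, ready := 0, true
			for _, p := range parents[c] {
				g, ok := gen[p]
				if !ok {
					// Parent not resolved yet: visit it first, then retry c.
					stack = append(stack, p)
					ready = false
					continue
				}
				if g > maxGen {
					maxGen = g
				}
			}
			if ready {
				gen[c] = maxGen + 1
				stack = stack[:len(stack)-1]
			}
		}
	}
	return gen
}

func main() {
	g := generations(map[string][]string{
		"root":  nil,
		"b":     {"root"},
		"c":     {"root"},
		"c2":    {"c"},
		"merge": {"b", "c2"},
	})
	fmt.Println(g["merge"]) // prints 4
}
```

Every ancestor has to be resolved before a commit gets its value, which is why the number cannot be produced row by row without walking the whole history first.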
That leaves us with no option but to read this information from the commit graph files in .git/objects/info.
This, however, has another issue: the commit graph file may not be there if it has not been generated.
During initialisation of gitbase, we would need to make sure every repository added to the repository pool has a commit graph file, and generate one for repositories where it is not present. However, this may be tricky with siva files, since it implies we have to actually write data (the commit graph file) into the siva file. This is a very important consideration to take into account: to add this feature, we need to ensure that a commit graph file can be generated for every kind of git repository we can add.
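A rough sketch of the presence check for a plain on-disk repository follows. It assumes the standard locations git uses under the git directory (`objects/info/commit-graph`, or a `commit-graphs/commit-graph-chain` file when split graphs are in use). The helper names `commitGraphCandidates` and `hasCommitGraph` are made up for this example, and siva files would need a different mechanism entirely.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// commitGraphCandidates lists the paths where git may store commit graph
// data inside gitDir: a single file, or a chain file for split graphs.
func commitGraphCandidates(gitDir string) []string {
	info := filepath.Join(gitDir, "objects", "info")
	return []string{
		filepath.Join(info, "commit-graph"),
		filepath.Join(info, "commit-graphs", "commit-graph-chain"),
	}
}

// hasCommitGraph reports whether any candidate exists as a regular file.
// It does not validate the contents or check that they are up to date.
func hasCommitGraph(gitDir string) bool {
	for _, p := range commitGraphCandidates(gitDir) {
		if st, err := os.Stat(p); err == nil && st.Mode().IsRegular() {
			return true
		}
	}
	return false
}

func main() {
	gitDir := ".git"
	if len(os.Args) > 1 {
		gitDir = os.Args[1]
	}
	if hasCommitGraph(gitDir) {
		fmt.Println("commit graph found")
		return
	}
	// One way to generate the file with the git CLI would be:
	//   git commit-graph write --reachable
	fmt.Println("commit graph missing; it would have to be generated")
}
```

A check like this could run once per repository when it is added to the pool, with generation triggered only for the repositories that fail it.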
Another very important consideration is that repositories may change and the commit graph file may become outdated. I'm not sure git updates that file (I haven't been able to find it in the docs), but go-git and potentially other git clients may not update the file once the repository changes. Generating a new commit graph file each time gitbase is started may be fine?
Assuming a commit graph file is accessible and we are able to provide this data, there is nothing very difficult here. Rows are generated per partition, so we load the commit graph when the first row of the partition is requested and dispose of it once we're finished.
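The load-on-first-row and dispose-on-close behaviour could look roughly like the sketch below. All types here (`commitGraph`, `row`, `partitionIter`) and the `load` callback are hypothetical stand-ins, not the actual gitbase row iterator or a go-git API; the only assumption is an iterator with `Next` and `Close` methods.

```go
package main

import (
	"fmt"
	"io"
)

// commitGraph is a hypothetical stand-in for parsed commit graph data.
type commitGraph struct {
	generation map[string]uint64
}

// row is a hypothetical output row: a commit hash plus its generation.
type row struct {
	Hash       string
	Generation uint64
}

// partitionIter yields rows for one partition (one repository).
type partitionIter struct {
	commits []string
	pos     int
	load    func() (*commitGraph, error) // reads the commit graph file
	graph   *commitGraph                 // nil until the first row is requested
}

// Next loads the commit graph on the first call that has a row to return.
func (it *partitionIter) Next() (row, error) {
	if it.pos >= len(it.commits) {
		return row{}, io.EOF
	}
	if it.graph == nil {
		g, err := it.load()
		if err != nil {
			return row{}, err
		}
		it.graph = g
	}
	h := it.commits[it.pos]
	it.pos++
	return row{Hash: h, Generation: it.graph.generation[h]}, nil
}

// Close drops the reference to the graph so it can be garbage collected.
func (it *partitionIter) Close() error {
	it.graph = nil
	return nil
}

func main() {
	it := &partitionIter{
		commits: []string{"a", "b"},
		load: func() (*commitGraph, error) {
			return &commitGraph{generation: map[string]uint64{"a": 1, "b": 2}}, nil
		},
	}
	defer it.Close()
	for {
		r, err := it.Next()
		if err != nil {
			break // io.EOF once the partition is exhausted
		}
		fmt.Println(r.Hash, r.Generation)
	}
}
```

With this shape an empty partition never touches the commit graph file, and at most one graph is held in memory per open partition.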
@ajnavarro Shall I close this issue?
Right now, ref_commits has the following schema:
Check how feasible it is to add a generation column that gives the position of the commit relative to the root commit, as the commit graph index does:
https://github.com/git/git/blob/master/Documentation/technical/commit-graph-format.txt
The idea is to use that generation value to make it possible to get only the new commits, given previously queried data.