OpenGrok / docker

WARNING: this repository is archived !

How to index the git data? #22

Closed kkumarma closed 5 years ago

kkumarma commented 5 years ago

I know a few ways to get git repo data:

1) Write a script to clone all the repos to /src.
2) Use the repo tool to repo sync the git repos.

We can't point OpenGrok at the git folder on the git server for indexing, as it will only contain bare data.

Can you please guide me on how to index the git repos' data in OpenGrok (for a large amount of data)?

vladak commented 5 years ago

What do you mean by "bare data"?

kkumarma commented 5 years ago

Bare data is what is stored on the server: not the actual files of the git repo, but data stored in packages on the server.

kkumarma commented 5 years ago

We have 200GB of data in git, and we don't want to clone all the repos every 30 minutes. We want OpenGrok to talk directly to git for the data.

vladak commented 5 years ago

That's not possible. OpenGrok itself is not capable of mirroring+indexing, let alone browsing the contents. For efficient syncing there are opengrok-tools, see https://github.com/oracle/opengrok/wiki/Repository-synchronization

With the tools, updating a git clone is incremental, as is the indexing; you can mirror+index repositories in parallel, etc. There are shops with much larger code bases than yours that run just fine with clones. Eventually, the indexer has to get the data from somewhere, so any local space savings might be outweighed by overall indexer slowness, since it would have to pull the contents of files over the network. There is always some trade-off, I think.
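
To illustrate why repeated full clones are unnecessary, here is a minimal sketch of updating existing clones incrementally and in parallel. It is not opengrok-tools and does not use its API; it only shells out to plain `git pull --ff-only`. The function names (`find_git_clones`, `pull_clone`, `update_all`), the worker count, and the assumption that every project is a direct subdirectory of the source root are all hypothetical choices made for the example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def find_git_clones(src_root):
    """Return the sorted direct subdirectories of src_root that contain a .git entry."""
    root = Path(src_root)
    return sorted(p for p in root.iterdir() if p.is_dir() and (p / ".git").exists())


def pull_clone(clone_dir):
    """Fast-forward one existing clone; return True if git exited with status 0."""
    result = subprocess.run(
        ["git", "-C", str(clone_dir), "pull", "--ff-only"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def update_all(src_root, workers=4, updater=pull_clone):
    """Update every clone under src_root in parallel.

    Returns a dict mapping each clone's directory name to a success flag.
    The updater argument is injectable (an assumption of this sketch) so the
    logic can be exercised without network access.
    """
    clones = find_git_clones(src_root)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(updater, clones))
    return {clone.name: ok for clone, ok in zip(clones, results)}
```

Calling `update_all("/src")` from a periodic job would only fetch the objects that changed since the previous run, so the 200GB is transferred once rather than every 30 minutes. Reindexing after the update is a separate step and is left out here.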