jetm closed 5 years ago
Firstly, it's weird that the indexer did not detect the newly added files. Without more detail it is hard to tell what happened.
As for repository synchronization, it depends on how diverse the repositories are. For the basic use case it is sufficient to run opengrok-mirror and then a plain reindex.
The opengrok-reindex-project script is meant to be run from opengrok-sync.
Sorry for the lack of detail.
In the setup I am testing, I have mounted the indexed data as a volume (--volume /opengrok-data:/data) because I don't want to lose that data if the PC is rebooted, and it's difficult to maintain a container with GBs of indexed data.
I have disabled the internal indexer cycle (REINDEX=0), and indexing is triggered from outside by a cron job, as the documentation says: docker exec <CONTAINER_ID> /scripts/index.sh
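For reference, the cron-triggered step can be wrapped in a small script. This is only a sketch of the setup described above, not something shipped with the image: the container name `opengrok`, the `DOCKER` override and the `run_index` function are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of a wrapper that cron could call to trigger the indexer.
# CONTAINER is a hypothetical default; use the real container ID or name.
CONTAINER="${CONTAINER:-opengrok}"
# DOCKER can be overridden (for example with "echo") to do a dry run.
DOCKER="${DOCKER:-docker}"

run_index() {
    # Same command as quoted from the documentation:
    #   docker exec <CONTAINER_ID> /scripts/index.sh
    $DOCKER exec "$CONTAINER" /scripts/index.sh
}

# Only index when the script is invoked with the argument "run",
# so that sourcing the file has no side effects.
if [ "${1:-}" = "run" ]; then
    run_index
fi
```

Setting DOCKER=echo prints the command instead of executing it, which is handy for checking the cron entry before letting a multi-hour index run start.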
In the generated logs I don't see any errors, and the indexer always finishes successfully. In the xref I can see the newly added files, but I am unable to find them by searching in the path field.
Let me know if I can provide more information.
After some testing, I found that removing the symlinks inside the container fixed this issue. The change looks like this:
diff --git a/Dockerfile b/Dockerfile
index 903da1c..1ff37a7 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -22,12 +22,9 @@ MAINTAINER OpenGrok developers "opengrok-dev@yahoogroups.com"
#PREPARING OPENGROK BINARIES AND FOLDERS
COPY --from=fetcher opengrok.tar.gz /opengrok.tar.gz
-RUN mkdir /opengrok && tar -zxvf /opengrok.tar.gz -C /opengrok --strip-components 1 && rm -f /opengrok.tar.gz && \
- mkdir /src && \
- mkdir /data && \
- mkdir -p /var/opengrok/etc/ && \
- ln -s /data /var/opengrok && \
- ln -s /src /var/opengrok/src
+RUN mkdir -p /opengrok /var/opengrok/etc /data /src && \
+ tar -zxvf /opengrok.tar.gz -C /opengrok --strip-components 1 && \
+ rm -f /opengrok.tar.gz
It keeps /data and /src as plain directories and drops the symlinks under /var/opengrok. Are you interested in this change? I can make a PR.
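To see which layout a given image actually has, each path can be tested for being a symlink. The sketch below is only a diagnostic, assuming a POSIX shell with `readlink` is available inside the container; `describe_path` is a name made up for this example and is not part of OpenGrok.

```shell
#!/bin/sh
# describe_path is a hypothetical helper, not an OpenGrok script.
describe_path() {
    if [ -L "$1" ]; then
        # readlink prints the target the symlink points to
        echo "$1: symlink -> $(readlink "$1")"
    elif [ -d "$1" ]; then
        echo "$1: directory"
    else
        echo "$1: missing or not a directory"
    fi
}

# Paths touched by the Dockerfile change above.
for p in /data /src /var/opengrok/data /var/opengrok/src; do
    describe_path "$p"
done
```

It can be run through an interactive shell in the container, for example one started with docker exec.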
In essence, this is how I run the container for the code:
docker run \
--detach \
--env REINDEX=0 \
--volume /project:/src \
--volume /indexed_opengrok_data:/data \
--publish 8080:8080
It is surprising that the symlinks were the cause.
Personally, I'd like everything under /opengrok in the container, including the configuration. Feel free to go ahead with the PR; it will be a step in the right direction.
Given that the repository synchronization scripts are part of the OpenGrok distribution, it should not be hard to run opengrok-mirror (or convert the indexer() shell function to use opengrok-sync, which will run opengrok-mirror) as part of the process.
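A rough sketch of that idea: gate the reindex on a successful synchronization step. This is not the actual change to the image. `mirror_then_index` is a hypothetical helper, both commands are passed in as strings because their exact options depend on the setup, and skipping the reindex when mirroring fails is a design choice assumed here.

```shell
#!/bin/sh
# mirror_then_index is a hypothetical helper, not an OpenGrok script.
# $1: command that synchronizes the repositories
#     (for example an opengrok-mirror invocation)
# $2: command that runs the indexer (for example /scripts/index.sh)
mirror_then_index() {
    mirror_cmd="$1"
    index_cmd="$2"
    # Run the sync step first; bail out if it fails.
    if ! sh -c "$mirror_cmd"; then
        echo "mirror step failed, skipping reindex" >&2
        return 1
    fi
    # Sync succeeded, so the index is rebuilt from up-to-date sources.
    sh -c "$index_cmd"
}
```

The first argument would be whatever opengrok-mirror command line suits the installation, and the second the indexer script the container already uses.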
Fixed in d9af82a77dc44ea446223f99947d154941a5c0f3
Hi,
Reading "Repository synchronization" on the OpenGrok wiki, I wonder whether the OpenGrok Docker image supports using opengrok-sync to sync and index a Git code repository, or running opengrok-reindex-project after a Git pull.
I did some tests: after I updated a Git repository and ran the indexer script (index.sh), the newly added files could not be found by searching in the path field. I had to remove all the indexed data and run the indexer again (this is very slow because the codebase weighs 10 GB and the indexer takes 3 hours to finish).
Should I run opengrok-reindex-project instead of index.sh?