Open mkboudreau opened 4 years ago
When you do the search in the UI, do you have any projects selected ? (assuming you are running with projects enabled)
The screenshot I supplied shows the entire UI; there is nothing else on the screen. I do not see anything relating to "projects".
How do you run the indexer ?
The indexer runs in a container and, upon completion, bounces the web app container. Both containers point to the same data and source directories.
opengrok-indexer \
-j /usr/bin/java \
-J=-Djava.util.logging.config.file=/var/opengrok/logging.properties \
-J=-Xms6g -J=-Xmx6g \
${JMX_OPTIONS} ${HEAPDUMP_OPTIONS} \
-a /opt/opengrok/lib/opengrok.jar \
-- \
--verbose \
--progress \
--assignTags \
--source /opengrok/sources \
--dataRoot /opengrok/data \
--renamedHistory on \
--memory 256 \
-i node_modules -i vendor -i *.dll -i *.so -i *.exe -i *.jar -i *.gz
Why do you think the way the indexer is run would cause the REST API and the web UI to return different results?
I was asking because of the projects. You're running the indexer with projects disabled and it might be relevant for root causing this issue.
ok, thank you for the clarification. please let me know if there is anything else you need from me.
@vladak any luck duplicating this issue?
Tried with simple project-less setup, could not reproduce it. Is there something special about those 125 search hits ?
I just reran the queries and I cannot see anything special. It appears to be a subset, but I see a variety of file types, directory structures, etc. They all seem valid, except for there being 125 instead of ~140,000. The indexer indexes a local directory of all our organization's git repos in the format /org/repo1, /org/repo2, and so on.
Other examples I've executed against the REST endpoint (e.g. /api/v1/search?full=somesearch) are also being limited to 125.
I was asking because of the projects. You're running the indexer with projects disabled and it might be relevant for root causing this issue.
@vladak, you are correct: when projects are disabled there seems to be an erroneous "double paging" going on. SearchEngine in a project-less configuration filters records early, even though /api/v1/search will also try to do so later. The number of results in the SearchEngine paging is by default numHitsPerPage * cachePages, or 25 * 5 = 125.
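To make the "double paging" concrete, here is a minimal Python sketch of the behaviour described above. It is not OpenGrok code: engine_results and api_page are hypothetical stand-ins for the early cap in SearchEngine and the later paging in /api/v1/search, and the defaults (25, 5, 1000) are the ones quoted in this thread.

```python
def engine_results(total_hits, hits_per_page=25, cache_pages=5):
    """Hypothetical stand-in for the project-less SearchEngine path.

    It keeps only the first hits_per_page * cache_pages hits,
    no matter how many documents actually matched.
    """
    cap = hits_per_page * cache_pages
    return list(range(min(total_hits, cap)))


def api_page(results, start=0, max_results=1000):
    """Hypothetical stand-in for the paging done later by the API layer."""
    return results[start:start + max_results]


# ~140,000 real matches, but the early cap leaves only 125 for the API to page.
page = api_page(engine_results(140_000))
print(len(page))  # 125
```

The second paging step never sees more than 125 records, so a page size of 1000 has no effect, which would match the hard limit of 125 reported here.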
The projects-enabled search by SearchEngine however also seems undesirably expensive in that it manifests every document found even though /api/v1/search will later filter to a page of (by default) 1000 results.
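As a rough illustration of that cost, the Python sketch below counts how many documents get built when everything is collected before paging versus when only the requested page is pulled. All names are hypothetical and the numbers are arbitrary; it says nothing about how OpenGrok or Lucene would actually implement this.

```python
from itertools import islice


def documents(total, counter):
    """Yield fake documents, counting how many were actually built."""
    for doc_id in range(total):
        counter["materialized"] += 1
        yield {"id": doc_id}


def eager_page(total, start, size, counter):
    # Build every matching document first, then cut out one page.
    all_docs = list(documents(total, counter))
    return all_docs[start:start + size]


def lazy_page(total, start, size, counter):
    # Stop pulling documents as soon as the page is full.
    return list(islice(documents(total, counter), start, start + size))


eager_count = {"materialized": 0}
lazy_count = {"materialized": 0}
eager = eager_page(10_000, 0, 1000, eager_count)
lazy = lazy_page(10_000, 0, 1000, lazy_count)
print(eager_count["materialized"], lazy_count["materialized"])  # 10000 1000
```

Both calls return the same page; only the amount of work differs.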
@vladak given the investigation @idodeclare has done, this feels like a legitimate issue. Do you agree? What are the next steps?
@vladak any update on this issue?
sorry, no bandwidth to work on this currently.
The projects-enabled search by SearchEngine however also seems undesirably expensive in that it manifests every document found even though /api/v1/search will later filter to a page of (by default) 1000 results.
This could lead to #1806 I think.
Describe the bug
It does not seem like the API and the UI are returning the same results. Maybe I'm missing something, but when I search for the exact same thing from the UI and the API I get 125 results from the API and 9,708 from the UI. Something seems not right.
To Reproduce
Try /search?... and /api/v1/search?... with all the same query parameters. In my test only full= was set.
Expected behavior
I expect the UI and the REST API to return the same results.
Screenshots
URL: http://local-opengrok/search?full=test&defs=&refs=&path=&type=
URL: http://local-opengrok/api/v1/search?full=test&defs=&refs=&path=&type=
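For anyone trying to reproduce this, a small Python sketch along these lines can build both URLs from one parameter set, so the two requests are guaranteed to carry identical query strings. The host and parameters are the ones from this report. The api_result_count helper assumes the JSON response has a top-level "resultCount" field; treat that as an assumption and check it against your OpenGrok version.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://local-opengrok"  # host taken from this report
PARAMS = {"full": "test", "defs": "", "refs": "", "path": "", "type": ""}


def build_urls(base, params):
    """Return (ui_url, api_url) sharing exactly the same query string."""
    query = urlencode(params)
    return f"{base}/search?{query}", f"{base}/api/v1/search?{query}"


def api_result_count(api_url):
    """Fetch the API response and read its hit count.

    Assumes a top-level "resultCount" key (not confirmed in this thread).
    """
    with urlopen(api_url) as response:
        return json.load(response).get("resultCount")


ui_url, api_url = build_urls(BASE, PARAMS)
print(ui_url)
print(api_url)
```

Calling api_result_count(api_url) against a running instance and comparing it with the count shown in the UI for ui_url should show whether the 125 cap applies.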