Open wy193777 opened 5 years ago
Could you give an example?
Below is an example from an instance running locally. You can see that many <b> tags have been added to wrap the search term. Logically, a client using the REST API already knows the search term, so wrapping it in bold tags isn't necessary. You can also see that & has been replaced with &amp;.
{
"time": 68,
"resultCount": 4,
"startDocument": 0,
"endDocument": 3,
"results": {
"/Golden-Register-2.0-Backend/package.json": [
{
"line": " \"start\": \"copyfiles sql/**/*.sql build/ && tsc && node --<b>max-old-space-size</b>=5120 --trace-warnings -r ts-node/register build/src/index.js --pretty\",",
"lineNumber": "13"
}
],
"/diagram-visualization/node_modules/webpack/package.json": [
{
"line": " \"appveyor:test\": \"node node_modules\\\\mocha\\\\bin\\\\mocha --<b>max-old-space-size</b>=4096 --harmony test/*.test.js\",",
"lineNumber": "129"
},
{
"line": " \"benchmark\": \"mocha --<b>max-old-space-size</b>=4096 --harmony test/*.benchmark.js -R spec\",",
"lineNumber": "131"
},
{
"line": " \"circleci:test\": \"node node_modules/mocha/bin/mocha --<b>max-old-space-size</b>=4096 --harmony test/*.test.js\",",
"lineNumber": "134"
},
{
"line": " \"cover\": \"node --<b>max-old-space-size</b>=4096 --harmony ./node_modules/istanbul/lib/cli.js cover -x '**/*.runtime.js' node_modules/mocha/bin/_mocha -- test/*.test.js\",",
"lineNumber": "135"
},
{
"line": " \"cover:min\": \"node --<b>max-old-space-size</b>=4096 --harmony ./node_modules/istanbul/lib/cli.js cover -x '**/*.runtime.js' --report lcovonly node_modules/mocha/bin/_mocha -- test/*.test.js\",",
"lineNumber": "136"
},
{
"line": " \"test\": \"mocha test/*.test.js --<b>max-old-space-size</b>=4096 --harmony --check-leaks\",",
"lineNumber": "143"
}
],
"/Patching-Tool-Client/node_modules/webpack/package.json": [
{
"line": " \"benchmark\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.benchmark.js\\\" --runInBand\",",
"lineNumber": "200"
},
{
"line": " \"cover:all\": \"node --<b>max-old-space-size</b>=4096 node_modules/jest-cli/bin/jest --coverage\",",
"lineNumber": "204"
},
{
"line": " \"cover:integration\": \"node --<b>max-old-space-size</b>=4096 node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.test.js\\\" --coverage\",",
"lineNumber": "206"
},
{
"line": " \"cover:unit\": \"node --<b>max-old-space-size</b>=4096 node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.unittest.js\\\" --coverage\",",
"lineNumber": "208"
},
{
"line": " \"schema-lint\": \"node --<b>max-old-space-size</b>=4096 node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.lint.js\\\" --no-verbose\",",
"lineNumber": "214"
},
{
"line": " \"test\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest\",",
"lineNumber": "218"
},
{
"line": " \"test:basic\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/{TestCasesNormal,StatsTestCases,ConfigTestCases}.test.js\\\"\",",
"lineNumber": "219"
},
{
"line": " \"test:integration\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.test.js\\\"\",",
"lineNumber": "220"
},
{
"line": " \"test:unit\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.unittest.js\\\"\",",
"lineNumber": "221"
}
],
"/Golden-Register-2.0-Frontend/node_modules/webpack/package.json": [
{
"line": " \"benchmark\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.benchmark.js\\\" --runInBand\",",
"lineNumber": "244"
},
{
"line": " \"cover:all\": \"node --<b>max-old-space-size</b>=4096 node_modules/jest-cli/bin/jest --coverage\",",
"lineNumber": "248"
},
{
"line": " \"cover:integration\": \"node --<b>max-old-space-size</b>=4096 node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.test.js\\\" --coverage\",",
"lineNumber": "250"
},
{
"line": " \"cover:unit\": \"node --<b>max-old-space-size</b>=4096 node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.unittest.js\\\" --coverage\",",
"lineNumber": "252"
},
{
"line": " \"schema-lint\": \"node --<b>max-old-space-size</b>=4096 node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.lint.js\\\" --no-verbose\",",
"lineNumber": "258"
},
{
"line": " \"test\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest\",",
"lineNumber": "260"
},
{
"line": " \"test:basic\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/{TestCasesNormal,StatsTestCases,ConfigTestCases}.test.js\\\"\",",
"lineNumber": "261"
},
{
"line": " \"test:integration\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.test.js\\\"\",",
"lineNumber": "262"
},
{
"line": " \"test:unit\": \"node --<b>max-old-space-size</b>=4096 --trace-deprecation node_modules/jest-cli/bin/jest --testMatch \\\"<rootDir>/test/*.unittest.js\\\"\",",
"lineNumber": "263"
}
]
}
}
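As a client-side workaround for responses like the one above, the markup can be undone after the fact. The following is a minimal sketch, not part of OpenGrok: the function names `clean_line` and `clean_results` are made up for illustration, and it assumes the server HTML-escapes the source text before adding the `<b>` tags, so that a literal `<b>` in the indexed file arrives as `&lt;b&gt;`.

```python
import html
import re

# Matches only the <b> and </b> tags wrapped around search hits.
_HIGHLIGHT_TAG = re.compile(r"</?b>")


def clean_line(line: str) -> str:
    """Drop highlight tags, then decode HTML entities such as &amp;."""
    # Order matters: tags are stripped before unescaping, so an escaped
    # "&lt;b&gt;" from the indexed file survives as the text "<b>".
    return html.unescape(_HIGHLIGHT_TAG.sub("", line))


def clean_results(response: dict) -> dict:
    """Return a copy of a search response with every "line" field cleaned."""
    cleaned = dict(response)
    cleaned["results"] = {
        path: [{**hit, "line": clean_line(hit["line"])} for hit in hits]
        for path, hits in response.get("results", {}).items()
    }
    return cleaned
```

The input dictionary is left untouched, so the original highlighted text stays available if a caller still wants it.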
OpenGrok's REST API documentation also shows this behavior.
{
"time": 13,
"resultCount": 35,
"startDocument": 0,
"endDocument": 0,
"results": {
"/opengrok/test/org/opensolaris/opengrok/history/hg-export-renamed.txt": [{
"line": "# User Vladimir <b>Kotal</b> <Vladimir.<b>Kotal</b>@oracle.com>",
"lineNumber": "19"
},{
"line": "# User Vladimir <b>Kotal</b> <Vladimir.<b>Kotal</b>@oracle.com>",
"lineNumber":"29"
}]
}
}
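A line can contain several hits, as in the documentation sample above. If a client wants the hit positions rather than the markup, the `<b>` pairs can be converted into character offsets. This is a hedged sketch under the same assumption as before (the text between tags is HTML-escaped); `split_highlights` is a hypothetical helper, not an OpenGrok function.

```python
import html
import re

# Non-greedy so that each <b>...</b> pair is matched separately.
_HIGHLIGHT = re.compile(r"<b>(.*?)</b>")


def split_highlights(line: str):
    """Return (plain_text, spans); spans are (start, end) offsets into plain_text."""
    parts = []
    spans = []
    pos = 0      # read position in the marked-up line
    length = 0   # length of the plain text emitted so far
    for match in _HIGHLIGHT.finditer(line):
        before = html.unescape(line[pos:match.start()])
        hit = html.unescape(match.group(1))
        parts.append(before)
        length += len(before)
        spans.append((length, length + len(hit)))
        parts.append(hit)
        length += len(hit)
        pos = match.end()
    parts.append(html.unescape(line[pos:]))
    return "".join(parts), spans
```

Slicing the returned text with each span gives back the highlighted term, which makes it easy for a client to apply its own highlighting.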
It seems that the summarizer/highlighter kicks in.
Yes, SearchController uses SearchEngine.results(), which uses Summarizer, and the Summary used therein is inherently HTML-based. There needs to be a special version of SearchEngine.results() for the API.
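One way to picture such a special version is to separate finding the hits from rendering them, so the web UI and the API can share the search but not the markup. The sketch below only illustrates that idea in Python; it is not OpenGrok's code, and every name in it (`find_spans`, `render_html`, `render_plain`, `results`, the `matches` field) is hypothetical. The matcher is a plain literal substring search standing in for the real one.

```python
import html
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) offsets of one hit within a line


def find_spans(line: str, term: str) -> List[Span]:
    """Hypothetical stand-in matcher: literal, case-sensitive substring search."""
    if not term:
        return []
    spans = []
    start = line.find(term)
    while start != -1:
        spans.append((start, start + len(term)))
        start = line.find(term, start + len(term))
    return spans


def render_html(line: str, spans: List[Span]) -> str:
    """Web UI rendering: escape the text and wrap each hit in <b>...</b>."""
    out, pos = [], 0
    for start, end in spans:
        out.append(html.escape(line[pos:start], quote=False))
        out.append("<b>" + html.escape(line[start:end], quote=False) + "</b>")
        pos = end
    out.append(html.escape(line[pos:], quote=False))
    return "".join(out)


def render_plain(line: str, spans: List[Span]) -> dict:
    """API rendering: raw text plus hit offsets, no markup at all."""
    return {"line": line, "matches": [list(span) for span in spans]}


def results(lines: List[str], term: str, render: Callable) -> list:
    """Shared search loop; only the renderer differs between UI and API."""
    out = []
    for line in lines:
        spans = find_spans(line, term)
        if spans:
            out.append(render(line, spans))
    return out
```

Calling `results(lines, term, render_plain)` returns unescaped lines with offsets, while `results(lines, term, render_html)` reproduces the kind of output shown in the samples above.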
Thanks for your quick help!
Any idea when this bug will be fixed?
Not a priority, at least for me. Pull requests are welcome of course.
On Fri, Jan 11, 2019 at 23:57, Shenghan Gao notifications@github.com wrote:
Have idea on when will this bug been fixed?
The <b> tags do not come only from Summarizer. After digging into the source code, I found that those HTML tags might also come from Context.java, which is extremely complex.
OK, Context.java calls code from PlainLineTokenizer, which is generated from opengrok-indexer/src/main/resources/search/context/PlainLineTokenizer.lex. This lex file also seems to do HTML formatting. I guess eliminating <b> is not a simple task.
Yes, seems like some serious refactoring is needed.
On Wed, Mar 20, 2019 at 1:20, Shenghan Gao notifications@github.com wrote:
OK, Context.java calls code from PlainLineTokenizer.lex, which is generated from opengrok-indexer/src/main/resources/search/context/PlainLineTokenizer.lex. This lex file seems also did things like htmlize. I guess eliminate <b> is not a simple task.
It's fairly straight-forward to extend OGKUnifiedHighlighter. I raised PR #2732.
The /search REST API returns results with unnecessary HTML tags and formatting. Is there a way to turn off HTML formatting of search results from the REST API?