oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/

do not put binary data to logs when processing annotations #4473

Closed vladak closed 8 months ago

vladak commented 8 months ago

While indexing a legacy Teamware repository for Solaris freeware, which contains tarballs of various open source software, with the annotation cache enabled, the indexer log filled up with records like this one:

2023-11-03 15:47:27.583+0000 SEVERE t76 SCCSRepositoryAnnotationParser.processStream: Error: did not find annotations in line 1: [GIF89a^P^@^P^@<EF><BF><BD>=^@6Vj6Vn:Zr>^v>bzBf~Bj<EF><BF><BD>Fn<EF><BF><BD>Fn<EF>
<BF><BD>Jr<EF><BF><BD>Vr<EF><BF><BD>Jv<EF><BF><BD>Zv<EF><BF><BD>Nz<EF><BF><BD>Nz<EF><BF><BD>Zz<EF><BF><BD>R~<EF><BF><BD>^~<EF><BF><BD>^~<EF><BF><BD>R<EF><BF><BD><EF><BF><BD>^<EF><BF><BD><EF><BF><BD>^<EF><BF><BD>
<EF><BF><BD>^<EF><BF><BD><EF><BF><BD>^<EF><BF><BD><EF><BF><BD>^<EF><BF><BD><EF><BF><BD>b<EF><BF><BD><EF><BF><BD>b<EF><BF><BD><EF><BF><BD>f<EF><BF><BD><EF><BF><BD>n<EF><BF><BD><EF><BF><BD>n<EF><BF><BD><EF><BF>
<BD>r<EF><BF><BD><EF><BF><BD>j<EF><BF><BD><EF><BF><BD>n<EF><BF><BD><EF><BF><BD>v<EF><BF><BD><EF><BF><BD>n<EF><BF><BD><EF><BF><BD>v<EF><BF><BD><EF><BF><BD>~<EF><BF><BD><EF><BF><BD>z<EF><BF><BD><EF><BF><BD><EF>
<BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><C2><AE><EF><BF><BD><C6><AE><EF><BF><BD><CA><B6><EF><BF><BD><D2><BA><EF><BF><BD>
<EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF>
<BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF>
<BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD><EF><BF><BD>
<EF><BF><BD><EF><BF><BD><EF><BF><BD>!<EF><BF><BD>^D^A]

Tailing the log corrupted the terminal (and the screen label). Binary data in the logs should be avoided. Perhaps AnalyzerGuru should be used to detect the genre first, and the result taken into account before such log records are created.

This happened with SCCSRepositoryAnnotationParser; however, it is likely true for the other annotation parsers as well.
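
Independently of any genre detection, one way to keep binary out of the log is to sanitize the offending line before it goes into the message. The following is only a minimal sketch of that idea, not the actual OpenGrok change: `LogSanitizer`, `sanitize` and the 80-character cap are hypothetical names and values chosen for illustration. It escapes everything outside printable ASCII and truncates long input.

```java
// Hypothetical helper, not part of OpenGrok: makes an arbitrary input line
// safe to embed in a log message.
final class LogSanitizer {
    // Arbitrary cap (an assumption) on how much of the line is logged.
    private static final int MAX_LOGGED_CHARS = 80;

    private LogSanitizer() {
    }

    static String sanitize(String line) {
        return sanitize(line, MAX_LOGGED_CHARS);
    }

    static String sanitize(String line, int maxChars) {
        if (line == null) {
            return "null";
        }
        int limit = Math.min(line.length(), Math.max(0, maxChars));
        StringBuilder sb = new StringBuilder(limit);
        for (int i = 0; i < limit; i++) {
            char c = line.charAt(i);
            if (c >= 0x20 && c < 0x7f) {
                // Printable ASCII is passed through unchanged.
                sb.append(c);
            } else {
                // Control characters, DEL and non-ASCII characters are
                // written as a backslash escape with four hex digits.
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        if (line.length() > limit) {
            // Note how much was dropped instead of logging all of it.
            sb.append("... (").append(line.length() - limit).append(" more chars)");
        }
        return sb.toString();
    }
}
```

An annotation parser could then pass `LogSanitizer.sanitize(line)` to the logger instead of the raw line, so that a GIF header like the one above would show up as a short escaped string rather than raw control bytes.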