Closed alexrkopp-xx closed 7 years ago
First, I'd like to thank you for the work you have put into expanding codesearch.
Thanks!
One question, though: What's the reason for skipping the entire file if a long line is encountered, instead of just ignoring the line?
This is the behavior of the original codesearch; I just made the limit configurable. A very long line is used as one indicator that the file is most likely not a text file, which is why the whole file is skipped rather than just that line. From the original comment in the code:
// A file is assumed not to be text files (and thus not indexed)
// if it contains an invalid UTF-8 sequences, if it is longer than maxFileLength
// bytes, if it contains a line longer than maxLineLen bytes,
// or if it contains more than maxTextTrigrams distinct trigrams.
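To make the four checks in that comment concrete, here is a minimal in-memory sketch in Go. It is not the actual codesearch implementation: the `Limits` type, its field names, and the `LooksLikeText` function are hypothetical names chosen for illustration, and the thresholds are left to the caller.

```go
package main

import (
	"bytes"
	"unicode/utf8"
)

// Limits holds the thresholds named in the comment above.
// The type and field names are illustrative assumptions, not codesearch's API.
type Limits struct {
	MaxFileLen      int // maximum file size in bytes
	MaxLineLen      int // maximum length of a single line in bytes
	MaxTextTrigrams int // maximum number of distinct trigrams
}

// LooksLikeText reports whether data passes all four heuristics.
// Failing any one of them rejects the whole file, not just the offending line.
func LooksLikeText(data []byte, lim Limits) bool {
	// 1. The file must not exceed the size limit.
	if len(data) > lim.MaxFileLen {
		return false
	}
	// 2. The file must be valid UTF-8.
	if !utf8.Valid(data) {
		return false
	}
	// 3. No line may exceed the line length limit.
	for _, line := range bytes.Split(data, []byte("\n")) {
		if len(line) > lim.MaxLineLen {
			return false
		}
	}
	// 4. The number of distinct byte trigrams must stay within the limit.
	seen := make(map[[3]byte]struct{})
	for i := 0; i+3 <= len(data); i++ {
		seen[[3]byte{data[i], data[i+1], data[i+2]}] = struct{}{}
		if len(seen) > lim.MaxTextTrigrams {
			return false
		}
	}
	return true
}
```

With this sketch, a call such as `LooksLikeText(data, Limits{MaxFileLen: 1 << 20, MaxLineLen: 2000, MaxTextTrigrams: 20000})` rejects a minified or binary file as soon as one line exceeds 2000 bytes. The limits in that call are example values only.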