oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/

Matlab analyzer wanted #3419

Open Ymoise opened 3 years ago

Ymoise commented 3 years ago

I'm seeing a lot of information about how to add analyzers, but is there any way to add filter types to the search?

I don't need the files to be treated differently once I find them, I just need to be able to search for them.

Is this covered somewhere? If not, could it be?

vladak commented 3 years ago

You mean in the UI? There is the file type picker. It is currently limited to a single file type only (#2580). Using the API search endpoint, you can filter with multiple file types, I believe.
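
As a rough sketch of what calling the search endpoint with a file type filter might look like: the base URL below is hypothetical, and the `/api/v1/search` path with the `full` and `type` query parameters is my recollection of the REST API, so check it against the API documentation for your version. How several types are passed at once is not shown here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: builds a search URL with a file type filter.
// The endpoint path and parameter names are assumptions to verify
// against the REST API documentation of the deployed OpenGrok version.
class SearchUrlSketch {

    // baseUrl is the web application root, e.g. a hypothetical
    // http://localhost:8080/source
    static String buildSearchUrl(String baseUrl, String fullText, String fileType) {
        return baseUrl + "/api/v1/search"
                + "?full=" + URLEncoder.encode(fullText, StandardCharsets.UTF_8)
                + "&type=" + URLEncoder.encode(fileType, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildSearchUrl("http://localhost:8080/source", "foo bar", "c"));
    }
}
```

The type value has to be one that an existing analyzer provides, which is why a missing analyzer cannot be worked around this way.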

Ymoise commented 3 years ago

No, I didn't mean that.

What I meant was that the list itself doesn't include some file types, so you can't filter for them when you search, for example .mat extensions.

My users have been asking for them and I wanted to know if there's a way to add them.

vladak commented 3 years ago

The file types correspond to the built-in analyzers. Each analyzer can have a set of prefixes/suffixes/filenames to match against (alongside other things, like the first few significant bytes of a file). Assuming .mat is a Matlab file, that would mean adding a new Matlab analyzer.

vladak commented 3 years ago

To search for a distinct suffix, one uses the (quirky) token distance syntax in the File Path field, e.g. ". c"~1. However, this does not work in the usual sense because it matches against the tokenized path, so in this case you'd get matches like foo.c but also bar.c.i.
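
To illustrate why both paths match, here is a simplified model of the behaviour. It is not OpenGrok's actual path tokenizer and not Lucene's phrase matching (the `~1` slop is not modelled at all); the class and method names are hypothetical. It splits a path into word tokens and single punctuation tokens, then looks for a "." token directly followed by the suffix token anywhere in the path.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model only; all names here are hypothetical.
class PathPhraseSketch {

    // Split a path into tokens: runs of letters/digits become one token,
    // every other character becomes a token of its own.
    static List<String> tokenize(String path) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (char ch : path.toCharArray()) {
            if (Character.isLetterOrDigit(ch)) {
                word.append(ch);
            } else {
                if (word.length() > 0) {
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                tokens.add(String.valueOf(ch));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    // True if a "." token is immediately followed by the suffix token
    // at any position, not only at the end of the path.
    static boolean hasDotThen(String path, String suffix) {
        List<String> tokens = tokenize(path);
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (tokens.get(i).equals(".") && tokens.get(i + 1).equals(suffix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDotThen("foo.c", "c"));   // true
        System.out.println(hasDotThen("bar.c.i", "c")); // true
        System.out.println(hasDotThen("baz.cpp", "c")); // false
    }
}
```

Because the match is on adjacent tokens rather than on the end of the file name, bar.c.i is found just like foo.c.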

vladak commented 3 years ago

I don't think we want to expose all the individual path matching components for all analyzers in the UI as that would be a bit overwhelming.

vladak commented 3 years ago

A Matlab analyzer is a good idea, I think.

Ymoise commented 3 years ago

And once you add it, it'll be in the webapp. Great.

Sorry, but I've been asked to ask if you have any idea when that will be, if I may.

vladak commented 3 years ago

No plan on my side at the moment. One can add a pretty rudimentary Matlab analyzer for starters and improve it incrementally. The main thing is to codify the basic grammar and keywords in JFlex.
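
For the keywords part, a starting point could be a plain set like the one below. This is only a sketch: the class name is hypothetical, it is not wired into OpenGrok in any way, and the list is the set of reserved words that MATLAB's `iskeyword` reports as far as I recall, so it should be double-checked against the MATLAB documentation.

```java
import java.util.Set;

// Hypothetical holder for Matlab reserved words; not part of OpenGrok.
class MatlabKeywordsSketch {

    // Reserved words as recalled from MATLAB's iskeyword output (to be verified).
    static final Set<String> KEYWORDS = Set.of(
            "break", "case", "catch", "classdef", "continue",
            "else", "elseif", "end", "for", "function",
            "global", "if", "otherwise", "parfor", "persistent",
            "return", "spmd", "switch", "try", "while");

    // Matlab is case-sensitive, so the lookup is an exact match.
    static boolean isKeyword(String token) {
        return KEYWORDS.contains(token);
    }

    public static void main(String[] args) {
        System.out.println(isKeyword("function"));
        System.out.println(isKeyword("Function"));
    }
}
```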

Ymoise commented 3 years ago

No plan on your side suggests I can change it on mine.

In which case, I think I must have misunderstood you completely... I understood that the UI is affected by the default analyzers, which means that adding one, via a read-only config, wouldn't do it.

Did I misunderstand you?

vladak commented 3 years ago

Your understanding is correct. The type picker in the search form is populated by https://github.com/oracle/opengrok/blob/876391fae90ef7239a29b98c01205b3ab9a89fcc/opengrok-web/src/main/webapp/menu.jspf#L200-L209, which takes the types from AnalyzerGuru.getfileTypeDescriptions(). In AnalyzerGuru the set is populated with https://github.com/oracle/opengrok/blob/876391fae90ef7239a29b98c01205b3ab9a89fcc/opengrok-indexer/src/main/java/org/opengrok/indexer/analysis/AnalyzerGuru.java#L313-L317, and this is done only in the static code block (which is not good, but that's another bug). So there is no way to influence it without adding a proper new analyzer, i.e. by changing the source code.
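
The pattern can be modelled roughly as follows. This is a hypothetical stand-in to show the consequence of static-block registration, not the actual AnalyzerGuru code; the class, the method names, and the two registered types are made up for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of "types are registered only in a static block".
class GuruSketch {

    private static final Map<String, String> FILE_TYPE_DESCRIPTIONS = new TreeMap<>();

    static {
        // The set of types is fixed at class-load time; nothing read from
        // configuration takes part in it.
        register("c", "C");
        register("java", "Java");
    }

    private static void register(String name, String description) {
        FILE_TYPE_DESCRIPTIONS.put(name, description);
    }

    // Callers such as a UI type picker only get a read-only view.
    static Map<String, String> getFileTypeDescriptions() {
        return Collections.unmodifiableMap(FILE_TYPE_DESCRIPTIONS);
    }

    public static void main(String[] args) {
        System.out.println(getFileTypeDescriptions().containsKey("matlab")); // false
    }
}
```

In this model the only way to make "matlab" show up is to add another `register` call to the static block, which mirrors having to change the source code.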

vladak commented 3 years ago

For anyone willing to work on this, here are some pointers to get started:

Universal ctags has support for Matlab already:

```
$ /usr/local/bin/ctags --list-languages | grep -i matlab
MatLab
```

So this should just be a matter of adding the grammar and the associated analyzer glue to OpenGrok.