ruben2020 / codequery

A code-understanding, code-browsing or code-search tool. This is a tool to index, then query or search C, C++, Java, Python, Ruby, Go and Javascript source code. It builds upon the databases of cscope and ctags, and provides a nice GUI tool.
https://ruben2020.github.io/codequery/
Mozilla Public License 2.0
680 stars, 86 forks

[Enhancement] Import\configure project through compile_commands.json file #89

Open pidgeon777 opened 3 years ago

pidgeon777 commented 3 years ago

It would be great to be able to import all of a project's source files and configuration into CodeQuery from an existing compile_commands.json file. This file is already the core input of code indexers such as clangd and ccls, so it is widely available in most common projects.

In fact, all of the files, compilation commands and flags needed to fully parse, index and analyse a project are described there.

An option in the GUI or on the command line could then load a project from its compile_commands.json file, the same one used by clangd and/or ccls.

ruben2020 commented 3 years ago

@pidgeon777 Currently CodeQuery is not using libclang, so it doesn't really need compilation commands and flags. cscope and ctags or similar tools only need a list of files. Nevertheless this can be considered a possible feature enhancement. I thought about writing a tool to use libclang as another frontend but I haven't got around to it yet. But I think it will be a lot slower than cscope and ctags. However, it will be more accurate.

pidgeon777 commented 3 years ago

This is an example of a compile_commands.json file, the format widely used by clangd and ccls:

[
{
  "directory": "C:/Work/Projects/Linux/Programs/Hello_World",
  "command": "C:/CrossComp/bin/aarch64-none-linux-gnu-gcc.exe -Wall -Wextra -I. -isystem C:/CrossComp/aarch64-none-linux-gnu/libc/usr/include -O3 -DNDEBUG -std=c99 -o main.o -c main.c",
  "file": "C:/Work/Projects/Linux/Programs/Hello_World/main.c"
},
{
  "directory": "XXX",
  "command": "YYY",
  "file": "ZZZ"
},
...
]

The main key elements are:

To generate the cscope database, a cscope.files file is used, which lists all of the C source and header files to be parsed. The same list could also be used to generate a tags file.

This means that by parsing a compile_commands.json file, it could be possible to do something like this:

1) Scan all of the -Ixxx argument paths for header files and recursively add them to the list.
2) Scan all of the -isystemYYY argument paths for system header files and recursively add them to the list (optional?).
3) Add all of the C sources specified by "file" to the list.
4) Sort the list and remove duplicates.
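Steps 1 and 2 depend on pulling the -I and -isystem paths out of each entry. A minimal Python sketch of that extraction follows. The function name `include_dirs_from_entry` is hypothetical, not part of CodeQuery, and the sketch assumes each entry has a "directory" plus either a "command" string or an "arguments" list, with relative paths resolved against "directory".

```python
import os
import shlex


def include_dirs_from_entry(entry):
    """Return (user_dirs, system_dirs) for one compile_commands.json entry."""
    base = entry["directory"]
    # An entry carries either an "arguments" list or a single "command" string.
    # shlex.split() follows POSIX shell rules, so backslash-separated Windows
    # paths inside "command" would need different handling (not covered here).
    args = entry.get("arguments") or shlex.split(entry["command"])

    user_dirs, system_dirs = [], []
    i = 0
    while i < len(args):
        arg = args[i]
        path, bucket = None, None
        if arg in ("-I", "-isystem"):
            # Separated form: the path is the next argument.
            if i + 1 < len(args):
                path = args[i + 1]
                bucket = user_dirs if arg == "-I" else system_dirs
                i += 1
        elif arg.startswith("-isystem"):
            # Attached form: -isystem<path>
            path, bucket = arg[len("-isystem"):], system_dirs
        elif arg.startswith("-I"):
            # Attached form: -I<path>
            path, bucket = arg[2:], user_dirs
        if path:
            # Relative paths are taken relative to the entry's "directory".
            bucket.append(os.path.normpath(os.path.join(base, path)))
        i += 1
    return user_dirs, system_dirs


# Hypothetical entry, for illustration only.
example = {
    "directory": "/proj",
    "command": "gcc -Wall -I. -isystem /opt/sysroot/include -c main.c",
    "file": "/proj/main.c",
}
print(include_dirs_from_entry(example))
```

Keeping user and system directories in separate lists leaves the "optional?" question in step 2 open: a caller can simply ignore the second list.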

By doing so, a cscope.files could be generated, like the following:

...
C:\CrossComp\aarch64-none-linux-gnu\libc\usr\include\stdio.h
C:\Work\Projects\Linux\Programs\Hello_World\include.h
C:\Work\Projects\Linux\Programs\Hello_World\main.c
...

and it would include exactly the same files parsed through the compile_commands.json file by clangd or ccls.

So, the resulting cscope/CodeQuery database would be the most complete and accurate one.
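As a rough illustration of steps 1 to 4, the sketch below builds the file list once the include directories and source files are known. The names `build_file_list` and `write_cscope_files` are hypothetical, and the sketch assumes that only `.h` files count as headers and that the include directories have already been extracted from the compile commands.

```python
import os

HEADER_EXTS = (".h",)  # assumption: only C headers are of interest


def build_file_list(sources, include_dirs, header_exts=HEADER_EXTS):
    """Return a sorted, de-duplicated list of sources plus headers."""
    files = set()
    # Step 3: add every source named by a "file" key.
    for src in sources:
        files.add(os.path.normpath(src))
    # Steps 1 and 2: walk each include directory recursively for headers.
    # Directories that do not exist are silently skipped by os.walk().
    for inc in include_dirs:
        for root, _dirs, names in os.walk(inc):
            for name in names:
                if name.lower().endswith(header_exts):
                    files.add(os.path.normpath(os.path.join(root, name)))
    # Step 4: the set removed duplicates; sorting gives a stable order.
    return sorted(files)


def write_cscope_files(paths, out_path="cscope.files"):
    """Write one path per line. Paths containing spaces may need quoting
    for cscope; that is not handled in this sketch."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(path + "\n")
```

Whether system include directories are passed in along with the project ones is left to the caller, since the proposal marks that step as optional.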

> I thought about writing a tool to use libclang as another frontend but I haven't got around to it yet. But I think it will be a lot slower than cscope and ctags. However, it will be more accurate.

This would be great, and I encourage you to do so. Even better if it could work as an LSP client, which would make it compatible with almost every programming language. LSP is the future of programming-language tooling: tag-based analysis is fast, but it cannot reach the accuracy and completeness of LSP-based solutions.