Smattr / clink

a modern re-implementation of Cscope
The Unlicense

Feature request: ending line number or byte range intervals #253

Closed (eaubin closed this 1 month ago)

eaubin commented 2 months ago

Would it be possible to store the end point or length of a definition in terms of lines or bytes? I'd like to extract only the relevant text without parsing the file.

Smattr commented 2 months ago

Thanks for your interest in Clink!

I guess you’re looking to consume the Clink database with some other tool? Clink itself doesn’t have any way of looking at the ending point of a symbol. Not sure what you mean by extracting the text without parsing the file. It’s not possible to find the end of a symbol without parsing it.

Clang APIs certainly allow retrieving this information, so should be pretty straightforward for me to add it to the database if that’s what you need.

eaubin commented 1 month ago

Yeah, I'm querying the sqlite file, then reading the source file, splitting it into lines, starting at the given line, and trying to parse just enough of the code to grab the whole text of the definition. If I had the filename and the start and end bytes, I could just use e.g. dd to grab the text of the definition.
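
For illustration, the dd-style extraction described here could look roughly like the following Python sketch. The function name read_byte_range is hypothetical, and the sketch assumes the end offset is exclusive; the thread does not say whether the stored end byte is inclusive, so adjust by one if it is.

```python
def read_byte_range(path, start_byte, end_byte):
    """Return the bytes in [start_byte, end_byte) of the file at `path`.

    Assumption: end_byte is exclusive. If the database stores an inclusive
    end offset, pass end_byte + 1 instead.
    """
    if start_byte < 0 or end_byte < start_byte:
        raise ValueError("invalid byte range")
    # Binary mode so that offsets count bytes, not decoded characters.
    with open(path, "rb") as f:
        f.seek(start_byte)
        return f.read(end_byte - start_byte)
```

Given a path and offsets taken from the database, read_byte_range(path, start, end).decode() would yield the definition text without any parsing of the source file.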

Smattr commented 1 month ago

Ah, so end location would be good, but byte offsets for both start and end would be even better. I think both should be doable. Stay tuned…

Smattr commented 1 month ago

In implementing this I’ve stumbled across an interesting quirk of libclang I was not aware of. When walking the C AST, your current “cursor” has a number of different locations you can retrieve:

a declaration
  ┌───┴──┐
  int foo;
  ▲   ▲ ▲▲
  │   │ │└ end of cursor range
  │   │ └ end of current token
  │   ├ cursor location
  │   └ also start of current token
  └ start of cursor range

The surprising thing to me here is that the “location” of the cursor is not the start of its range. I.e. clang_getCursorLocation(mycursor) != clang_getRangeStart(clang_getCursorExtent(mycursor)). This is sort of convenient for Clink, as it actually wants the symbol (foo) start not the semantic entity start (the containing declaration).

But this prompts the question of what you want here. Are you after the start and end of the symbol itself? Or the start and end of the whole semantic entity?

eaubin commented 1 month ago

Thanks for taking this on, the whole declaration/body is what I'm looking for

Smattr commented 1 month ago

Do you want to take a look at https://github.com/Smattr/clink/pull/255 and see if it meets your needs?

eaubin commented 1 month ago

The schema looks great, but for some reason start_line, start_col, start_byte, end_line, end_col, end_byte are all 0 in the symbols table (e.g. if I run it on the clink codebase). The unit tests all pass though.

Sorry, doing a clean build seems to have fixed whatever problem I had, looks great!

eaubin commented 1 month ago

I think the difficulty was not providing --compile-commands dir. If I just run build/clink/clink -b in the root dir and query

SELECT records.path,SUM(symbols.start_byte) FROM symbols JOIN records ON records.id=symbols.path GROUP BY records.path;

I can see the clink/src dir is indexed, but not libclink which I was using for testing.

|                          path                          | SUM(symbols.start_byte) |
|--------------------------------------------------------|-------------------------|
| build/CMakeFiles/3.25.2/CompilerIdC/CMakeCCompilerId.c | 0                       |
| build/CMakeFiles/_CMakeLTOTest-C/src/foo.c             | 0                       |
| build/CMakeFiles/_CMakeLTOTest-C/src/main.c            | 0                       |
| build/clink/manpage.c                                  | 0                       |
| build/libclink/schema.c                                | 0                       |
| build/libclink/version.c                               | 0                       |
| build/libclink/vimcat/libvimcat/version.c              | 0                       |
| clink/src/build.c                                      | 4923920                 |
| clink/src/colour.c                                     | 20417                   |
| clink/src/compile_commands_close.c                     | 4836                    |
| clink/src/compile_commands_find.c                      | 2320273                 |
| clink/src/compile_commands_open.c                      | 39216                   |
| clink/src/cwd.c                                        | 2183                    |
| clink/src/dirname.c                                    | 36348                   |
| clink/src/disppath.c                                   | 6894                    |
| clink/src/file_queue.c                                 | 362614                  |
| clink/src/find_me.c                                    | 185745                  |
| clink/src/find_repl.c                                  | 16439                   |
| clink/src/have_vim.c                                   | 524                     |
| clink/src/help.c                                       | 173067                  |
| clink/src/highlight.c                                  | 292785                  |
| clink/src/is_root.c                                    | 7376                    |
| clink/src/join.c                                       | 29292                   |
| clink/src/main.c                                       | 9257929                 |
| clink/src/option.c                                     | 1413636                 |
| clink/src/path.c                                       | 301202                  |
| clink/src/progress.c                                   | 1053074                 |
| clink/src/re.c                                         | 4514                    |
| clink/src/screen.c                                     | 2787974                 |
| clink/src/set.c                                        | 322956                  |
| clink/src/sigint.c                                     | 16681                   |
| clink/src/spinner.c                                    | 277030                  |
| clink/src/str_queue.c                                  | 0                       |
| clink/src/ui.c                                         | 22099693                |
| common/compiler.h                                      | 0                       |
| common/ctype.h                                         | 0                       |
| common/pipe.h                                          | 0                       |
| libclink/include/clink/asm.h                           | 0                       |
| libclink/include/clink/c.h                             | 0                       |
| libclink/include/clink/clang.h                         | 0                       |
| libclink/include/clink/clink.h                         | 0                       |
| libclink/include/clink/cscope.h                        | 0                       |
| libclink/include/clink/db.h                            | 0                       |
| libclink/include/clink/debug.h                         | 0                       |
| libclink/include/clink/def.h                           | 0                       |
| libclink/include/clink/generic.h                       | 0                       |
| libclink/include/clink/iter.h                          | 0                       |
| libclink/include/clink/python.h                        | 0                       |
| libclink/include/clink/symbol.h                        | 0                       |
| libclink/include/clink/tablegen.h                      | 0                       |
| libclink/include/clink/version.h                       | 0                       |
| libclink/include/clink/vim.h                           | 0                       |
| libclink/src/add_line.h                                | 0                       |
| libclink/src/add_symbol.h                              | 0                       |
...

Running with --compile-commands, everything in libclink, tests, common, etc. is indexed.
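
A check like the query above can also be scripted. The following Python sketch lists the files whose symbols all have a start_byte of 0, which in this thread indicated files that were not indexed with byte offsets. It uses only the table and column names that appear in the query above (records.id, records.path, symbols.path, symbols.start_byte); the function name is hypothetical, and the database path in the usage note is a placeholder.

```python
import sqlite3

# Same join as the query above, restricted to files where no symbol
# has a non-zero start_byte.
UNINDEXED_QUERY = """
SELECT records.path
FROM symbols JOIN records ON records.id = symbols.path
GROUP BY records.path
HAVING SUM(symbols.start_byte) = 0
ORDER BY records.path
"""


def files_without_byte_offsets(conn):
    """List paths whose symbols all have start_byte == 0."""
    return [row[0] for row in conn.execute(UNINDEXED_QUERY)]
```

Usage would be along the lines of files_without_byte_offsets(sqlite3.connect("path/to/database")). A file whose only symbol genuinely starts at byte 0 would also be listed, so treat the result as a hint rather than proof.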

Smattr commented 1 month ago

Ah yes, if locating compile_commands.json fails, Clink falls back to using Cscope. You can see more of what’s going on by passing --jobs=1 --debug, but the output will be very verbose.

Glad it works for you!