universal-ctags / ctags

A maintained ctags implementation
https://ctags.io
GNU General Public License v2.0

Vim: support Vim9 script #3913

Open k-takata opened 5 months ago

k-takata commented 5 months ago

It would be nice if u-ctags supported Vim9 scripts by default.

This comment might be helpful: https://groups.google.com/g/vim_use/c/hp4KeIsDlFU/m/TCGoX5KFAwAJ


(Added by @masatake for tracking TODO items)

#3930

#3951

masatake commented 5 months ago

If a .ctags file for Vim9 and some test cases are provided, I can help the author of the .ctags file integrate them into our source tree.

We must know the relation between the .ctags file and our parsers/vim.c.

Generally, people think .ctags is just a configuration file. So, they don't put the copyright and license notices on the file. But we need the notices for integration.

masatake commented 5 months ago

https://vim-jp.org/vimdoc-ja/vim9.html

masatake commented 5 months ago

I have read some .vim files in the vim source tree that use vim9script. I am convinced supporting vim9script should be done in vim.c.

masatake commented 3 months ago

https://github.com/girishji/scope.vim/blob/271ad6d1b76e04cfacf5b30f80103654a2cfce73/autoload/scope/task.vim#L3

class and export keywords are used.

https://github.com/girishji/scope.vim/blob/271ad6d1b76e04cfacf5b30f80103654a2cfce73/autoload/scope/fuzzy.vim#L4

import keyword is used.

terminatorul commented 1 month ago

Hello !

I am trying to write _mtable expressions to add better support for classes, enums, interfaces, and to distinguish member variables and methods from local variables and nested (function-local) functions ...

I can populate the "access:" and "file:" fields, to describe tags that are private, protected, public or file-scoped. Also I can populate local variables and nested functions (if enabled), much like the 'l' kind (local variables) in C and C++.

I have attached my regexp expressions (~/.ctags.d/vim9.ctags) and my sample input scripts: vim9.ctags.gz ProjectConfigApi.tar.gz

But ctags tells me the parser can no longer progress past the last newline in the input file, even though the regexps are still matching:

timothy@X399-AORUS-Gaming-7-Win:/projects/.vim> ctags -f - --format=2 --excmd=pattern --extras= --fields={line}ks{access}{signature}{implementation}{typeref} --kinds-vim= ProjectConfig/autoload/ProjectConfigApi_DependencyWalker.vim | gvim
ctags: Warning: Forcefully advance the input pos because
ctags: Warning: following conditions for entering infinite loop are satisfied:
ctags: Warning: + matching the pattern succeeds,
ctags: Warning: + the next table is not given, and
ctags: Warning: + the input file pos doesn't advance.
ctags: Warning: Language: Vim, input file: ProjectConfig/autoload/ProjectConfigApi_DependencyWalker.vim, pos: 9867
timothy@X399-AORUS-Gaming-7-Win:~/projects/.vim>

Is there something wrong with my regexps ? Or is this an issue in ctags with the regexps parser ?
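The warning names three conditions: the pattern matches, no next table is given, and the input position does not advance. As a minimal sketch of how a catch-all line rule can meet them, the Python snippet below uses Python's `re` module (not the ctags regex engine) to apply a pattern ending in `(\r?\n|$)` repeatedly. The names `CATCH_ALL` and `scan` are hypothetical, and the `*` after `[^\n]` is an assumption about the intended regexp. Once the position reaches the end of the input, the pattern still matches, but the match is empty.

```python
import re
from typing import List, Tuple

# Hypothetical catch-all rule (an assumption, not the poster's exact regexp):
# "the rest of the line, followed by a newline or the end of the input".
CATCH_ALL = re.compile(r"[^\n]*(\r?\n|$)")


def scan(text: str, max_steps: int = 100) -> List[Tuple[int, int]]:
    """Match CATCH_ALL repeatedly, starting each time where the last match ended.

    Returns the (start, end) span of every match and stops as soon as a
    match succeeds without consuming any input.
    """
    spans = []
    pos = 0
    for _ in range(max_steps):
        m = CATCH_ALL.match(text, pos)
        if m is None:
            break
        spans.append(m.span())
        if m.end() == pos:
            # The pattern matched, but the position did not advance.
            break
        pos = m.end()
    return spans


spans = scan("def Foo()\nenddef\n")
print(spans)
```

The last span in the list is empty and sits at the end of the input, which is the situation the warning describes. Whether this is what happens in the reported case is not confirmed by the thread.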

Also my new function tags (generated with my regexps) now duplicate the functions already generated by the built-in parser. Looking at the source code I see the Vim built-in parser is written in C, so it is not a *.ctags file that I could paste into my own regexp list.

Can you add some long flag to the regexp to merge the new tag with the same tag generated by the built-in parser ? Or maybe a flag to completely shadow (remove) the built-in tags when the _mtable-regexp generates the same tag.

Somehow my regexp below at line 6, for a nested global function: /\s*sdef\s+(g:[^(]+)[^\n]\s*(\r?\n|$)/\1/f/{tenter=functionbody}{scope=push} emits a member tag, with the scope field "function:" filled with the enclosing function name, even though this regexp has no {scope=ref}, since the resulting tag is a global function in Vim.

So how do I prevent ctags from generating a member tag, and instead generate a global tag ?

--_mtable-regex-vim=toplevel/\sexport\s+def\s+((g:)?\w+)[^\n]\s(\r?\n|$)/\1/f/{tenter=functionbody}{scope=push}
--_mtable-regex-vim=toplevel/\sdef\s+(g:\w+)[^\n]\s(\r?\n|$)/\1/f/{tenter=functionbody}{scope=push}
--_mtable-regex-vim=toplevel/\sdef\s+(\w+)[^\n]\s*(\r?\n|$)/\1/f/{tenter=functionbody}{scope=push}{_field=file:}

--_mtable-regex-vim=functionbody/\send(def|f|function)\s(\r?\n|$)///{scope=pop}{tleave}
--_mtable-regex-vim=functionbody/\sdef\s+(g:[^(]+)[^\n]\s(\r?\n|$)/\1/f/{tenter=functionbody}{scope=push}
--_mtable-regex-vim=functionbody/\sdef\s+([^:(]+)[^\n]\s(\r?\n|$)/\1/d,nested/{tenter=functionbody}{scope=ref}{scope=push}
--_mtable-regex-vim=functionbody/\sconst\s+(g:\w+)[^\n]\s(\r?\n|$)/\1/K,vim9const/
--_mtable-regex-vim=functionbody/\s(var|const|final)\s+(\w+)[^\n]\s(\r?\n|$)/\2/l,local/{scope=ref}
--_mtable-regex-vim=functionbody/\s(g:\w+)[^\n]\s(\r?\n|$)/\1/g,vim9global/
--_mtable-regex-vim=functionbody/\s#[^\n](\r?\n|$)///
--_mtable-regex-vim=functionbody/[^\n](\r?\n|$)///

Resulting tag is:

g:ProjectConfig_Collect_Other_Properties ProjectConfig/autoload/ProjectConfigApi_DependencyWalker.vim /^$/;" f line:268 function:Default_Accessor
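To make the unwanted scope easier to spot when checking output like this, here is a small Python sketch that splits one tags line into its parts. It assumes the usual tab-separated extended tags format (the line above is shown with spaces instead of tabs), and the helper name `parse_tag_line` is hypothetical.

```python
from typing import Dict


def parse_tag_line(line: str) -> Dict[str, str]:
    """Split one extended-format tags line into name, path, excmd and fields."""
    # Everything before ';"<TAB>' is the classic three-column part.
    head, _, tail = line.rstrip("\n").partition(';"\t')
    name, path, excmd = head.split("\t", 2)
    entry = {"name": name, "path": path, "excmd": excmd}
    for field in tail.split("\t"):
        if not field:
            continue
        key, sep, value = field.partition(":")
        if sep:
            entry[key] = value      # e.g. line:268, function:Default_Accessor
        else:
            entry["kind"] = field   # a bare kind letter such as f
    return entry


# The tag from the report, with tabs assumed between the columns.
line = (
    "g:ProjectConfig_Collect_Other_Properties\t"
    "ProjectConfig/autoload/ProjectConfigApi_DependencyWalker.vim\t"
    '/^$/;"\tf\tline:268\tfunction:Default_Accessor'
)
entry = parse_tag_line(line)
print(entry["kind"], entry.get("function"))
```

A tag for a global function would be expected to have no `function` key in the parsed entry.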

Latest ctags version I have from my distribution is 6.0.0:

timothy@X399-AORUS-Gaming-7-Win:/mnt/c/Users/adria_f5m2wqf/Projects/.vim/.ctags.d> ctags --version
Universal Ctags 6.0.0, Copyright (C) 2015-2022 Universal Ctags Team
Universal Ctags is derived from Exuberant Ctags.
Exuberant Ctags 5.8, Copyright (C) 1996-2009 Darren Hiebert
  Compiled: Feb 2 2023, 10:39:04
  URL: https://ctags.io/
  Output version: 0.0
  Optional compiled features: +wildcards, +regex, +iconv, +option-directory, +packcc, +optscript
timothy@X399-AORUS-Gaming-7-Win:/mnt/c/Users/adria_f5m2wqf/Projects/.vim/.ctags.d>

masatake commented 1 month ago

The test input is quite helpful in driving me to improve the parser.

I'll look into this. Please wait for a while.

terminatorul commented 3 weeks ago

Updated version here if someone is still looking : https://gist.github.com/terminatorul/b563855cd6af9a1ed41f36743fe0e9d2

Can now parse both Vim9 script and legacy functions and variables. Still depends on the built-in parser to extract mappings and autocommands, but the other kinds from the built-in parser are disabled (functions, variables, constants).

Is there a way to distribute parser files to other people, like uploading them on a repository, that I could also search ? Like Vim used to have a repository of user scripts on its website...

Here is also a Vim syntax file for writing ctags mtable parsers (highlighting for default regular expressions (POSIX ERE) only): https://www.vim.org/scripts/script.php?script_id=6112

masatake commented 3 weeks ago

> Is there a way to distribute parser files to other people, like uploading them on a repository, that I could also search ? Like Vim used to have a repository of user scripts on its website...

I assumed all useful .ctags files should be part of Universal Ctags, so I have not thought much about a way to distribute .ctags files separately.

I wonder what I should do. I would like to implement everything you need by extending the existing vim.c. However, I do not think I can find enough time soon.

Do you think vim.c should support everything you implemented in vim9.ctags? If yes, I would like you to help me extend vim.c. To extend vim.c, I need test cases. You may have test cases that you used while developing vim9.ctags. I would like you to convert them to .b test cases and make a pull request. See https://docs.ctags.io/en/latest/testing-parser.html#gathering-test-cases-for-known-bugs if you are interested in improving ctags itself.

If no, you may want a way to turn off the built-in parser implemented in vim.c, so that your vim9.ctags can run without being affected by the built-in parser.

[yamato@dev64]~% ctags --help-full | grep pretend 
  --_pretend-<NEWLANG>=<OLDLANG>
       Make NEWLANG parser pretend OLDLANG parser in lang: field.

--_pretend may be useful for the purpose. However, I have not tested the option enough, so we may revise it.

I have no plan to prepare an official repository.

BTW, I recommend that you put a copyright notice at the beginning of vim9.ctags, so that people can build on your work if you choose an open source license.

terminatorul commented 3 weeks ago

Sorry, I do not have test files, and I want to get back to focusing on my Vim plugin instead. I tested manually using the scripts attached previously plus my local Vim config script, with the Tagbar plugin in Vim running ctags and showing the results. I wish I had known about the Units test facility sooner.

Which is kind of the point. Most people writing a parser get there because they are trying to solve something else using their language. They are not on their main task, so to speak. And regexps are difficult to use for parsing a language, and often not enough. Optscript sounds great (I didn't use it), but PostScript is foreign to almost all developers and not easy to jump into.

It looks to me like users need support for writing a full recursive-descent parser, or a grammar, but as an extension script to ctags. Some scripting languages like Raku (Perl 6) have built-in support for writing grammars, and I think external parser generators should be available for some other languages.

Any chance ctags would embed a language for something like this ?

Or could you add a more involved example to the manual for adding a language using the existing optscript ? One that does not rely on regexps for much more than extracting lines, for example, or lexing tokens like strings and operators.

I think my regex parser in vim9.ctags should be implemented in vim.c if possible, but some issues need to be called out:

  1. Line continuation is missing, and users will be unpleasantly surprised to find it's not there. It is not easy to get right, because the continuation character is now sometimes optional in Vim 9, depending on where in the syntax the line is broken. But it is very easy to implement in C if the continuation character is present.
  2. The regex parser disables and replaces the most used built-in kinds:

    • D (def) instead of f (function)
    • g (global) instead of v (variable)
    • K (constant) instead of C

    This change should not be replicated in vim.c. But for variables, it would be helpful to introduce a new kind V for the new lexically-scoped variables in Vim 9, and keep the existing kind (v) for the old explicitly-scoped variables (starting with g: for variables in the global dictionary, b: for buffer variables, s: for script variables, w: for window variables, and t: for tab variables). All of these are globals no matter where they show up, with s: variables being file-local. Explicitly-scoped local variables (starting with l:) should not be emitted by default.
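The proposed split between the two variable kinds can be sketched as a small classifier. This is only an illustration of the rule described above, not code from vim.c or vim9.ctags. The function names are hypothetical, and skipping prefixes that the proposal does not mention (such as a: or v:) is an assumption.

```python
import re
from typing import Optional

# Scope prefixes that would keep the existing kind "v", as listed above.
EXPLICIT_PREFIXES = ("g:", "b:", "s:", "w:", "t:")


def proposed_kind(name: str) -> Optional[str]:
    """Return the proposed kind letter for a variable name, or None to skip it."""
    if re.match(r"[a-z]:", name):
        if name.startswith(EXPLICIT_PREFIXES):
            return "v"   # explicitly scoped, global wherever it appears
        # l: locals are not emitted by default; other prefixes are not
        # covered by the proposal, so this sketch skips them too.
        return None
    return "V"           # Vim 9 lexically scoped variable (no prefix)


def is_file_local(name: str) -> bool:
    """Script variables (s:) are the file-local ones among the "v" kind."""
    return name.startswith("s:")


for sample in ("g:loaded_plugin", "s:cache", "l:tmp", "counter"):
    print(sample, proposed_kind(sample), is_file_local(sample))
```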