universal-ctags / ctags

A maintained ctags implementation
https://ctags.io
GNU General Public License v2.0

Ansible Support #2370

Open fourjay opened 4 years ago

fourjay commented 4 years ago

This more properly belongs in a discussion forum, but as near as I can tell, github issues is the de-facto discussion forum (I apologize if there is a better place).

I'm working on adding better (read "link traversing") Ansible support. The primary use case: variables defined under vars: and set_fact: should end up as tag targets, so that a later reference to the variable can jump to its definition.

So far I've worked out:

# keywords to ignore
--regex-ansible=/^\s*(become|set_fact|args|shell|remote|repo|warn|debug|include_tasks):.*///{placeholder}{exclusive}
--regex-ansible=/^\s*(register|creates|dest|mode|port|owner|state|force):.*///{placeholder}{exclusive}
# "empty" tag (typically a larger vertical task keyword)
--regex-ansible=/^\s*[a-zA-Z]+:$///{placeholder}{exclusive}

# match all keys that were not ignored previously
--regex-ansible=/^\s*([a-zA-Z_]+):.*/\1/v,var/

Basically: 1) a longish list of keywords to ignore. 2) a generic rule to capture the key part of yaml key: val statement.
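The two-stage logic can be tried outside ctags with a short Python sketch. This is only an approximation of the rules above, not ctags itself: it assumes that `{exclusive}` stops further rules from being tried on a matched line, and the names `IGNORED`, `EMPTY_KEY`, `GENERIC_KEY` and `tag_keys` are made up for the illustration.

```python
import re

# Assumption: a stand-in for the (incomplete) ignore list in the rules above.
IGNORED = re.compile(
    r"^\s*(become|set_fact|args|shell|remote|repo|warn|debug|include_tasks"
    r"|register|creates|dest|mode|port|owner|state|force):.*"
)
EMPTY_KEY = re.compile(r"^\s*[a-zA-Z]+:$")        # key with no inline value
GENERIC_KEY = re.compile(r"^\s*([a-zA-Z_]+):.*")  # any remaining key


def tag_keys(lines):
    """Return (line_number, key) pairs for lines the generic rule would tag."""
    tags = []
    for number, line in enumerate(lines, start=1):
        if IGNORED.match(line) or EMPTY_KEY.match(line):
            continue  # plays the role of {placeholder}{exclusive}
        found = GENERIC_KEY.match(line)
        if found:
            tags.append((number, found.group(1)))
    return tags
```

Lines that start with a YAML list dash (such as `- name: ...`) are not matched by any of the three patterns, which mirrors the `^\s*` anchoring of the original rules.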

This works (coupled with an iniconf rule to trim off :children from a section head). When I find a variable, I can jump to its definition (my goal). Are there recommendations for how to handle this better? Among other things, the list of keywords covers only my common keywords and is quite incomplete.

I did try using the new mtable regex support, but ran into the issue of not being able to terminate a table/section. My instinct was to match an empty line, but (as I understand it) there's no way to match EOL in mtable.

FWIW, part of my reason for posting this ticket is to offer Google a more useful answer for searches for ctags ansible than the answers aimed at generating a table of contents for Tagbar or a similar vim utility, which are completely useless for tag linking.

masatake commented 4 years ago

This more properly belongs in a discussion forum, but as near as I can tell, github issues is the de-facto discussion forum (I apologize if there is a better place).

This is the right place for that purpose. See the README.md of this project:

The goal of the project is preparing and maintaining common/unified working space where people interested in making ctags better can work together.

Could you show us a small input, expected tags output, and comments for them?

fourjay commented 4 years ago

Thanks :-)

---
  # enable mod_deflate

  - hosts: some_hosts
    vars:
        a_variable: its_value

    become: true

    tasks:
      - name: common tasks
        include_tasks: '{{ base_include_path }}tasks/web/common.yml'

      - name: set local variable
        set_fact:
            a_local_var: its_valuec
  ...

The goal (realized, albeit not ideally) is to have tags generated for a_variable and a_local_var. The var base_include_path is the other end of the link chain: it is only referenced here, and its definition lives elsewhere (in this case in a "group_var" file). I want to be able to tag jump to where base_include_path is defined.
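The referencing end of the link can be sketched in Python as well. This is a rough illustration only: it assumes references always appear as `{{ name ... }}` and looks only at the first identifier inside each pair of braces. The names `REFERENCE`, `referenced_names` and `missing_definitions` are hypothetical.

```python
import re

# Assumption: only the first identifier inside each {{ ... }} is of interest.
REFERENCE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)")


def referenced_names(lines):
    """Return variable names referenced in the lines, in first-seen order."""
    names = []
    for line in lines:
        for name in REFERENCE.findall(line):
            if name not in names:
                names.append(name)
    return names


def missing_definitions(lines, defined):
    """Return referenced names that have no entry in the set of defined names."""
    return [name for name in referenced_names(lines) if name not in defined]
```

Running `missing_definitions` against the names already present in a tags file gives a quick list of variables, such as base_include_path in the sample, whose definitions still need to be tagged somewhere else.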

The approach I outlined initially works, but it is brute force. It tags everything matching word_colon and avoids overkill by adding a set of short-circuit exits for the most common (in my use case) Ansible keywords. The full keyword set is, by this point, easily over 1000 (and I'd guess closer to 4000). If this approach were fully realized, the rule set would amount to a mini parser, since it would have to account for appropriate values (some keywords accept only a limited range of values). Another breakage point: it is acceptable, even if a bad idea, to use a keyword as a variable name, and this approach will not work there.

The _mtable approach seems correct here. There are only a few blocks where variable declaration is legal. My approach for closing such a section was to match the empty line, but that does not seem to work (from what I got out of reading, $ is a file terminator in this context). That choice of block end is not ideal anyway, as it's a style convention on my part, and it's syntactically correct to leave out the blank line.

FWIW, indentation seems an appropriate scope indicator here. I'd imagine that it would be useful in other whitespace-sensitive languages. YAML seems like kind of a mess for this sort of parsing, as its limited set of required "operators" (indent, colon, dash, brackets) is not semantically useful. Indentation seems the most useful.
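Indentation as the scope indicator could look roughly like the Python sketch below. It is not ctags code and not a YAML parser: it assumes one key per line, treats only vars: and set_fact: as block openers, and closes a block at the first key that is not indented deeper than the opener. The names `KEY_LINE`, `BLOCK_KEYWORDS` and `variables_by_indent` are invented for the example.

```python
import re

# A key, optionally preceded by a YAML list dash; group 1 is the key name.
KEY_LINE = re.compile(r"^\s*(?:-\s+)?([A-Za-z_][A-Za-z0-9_]*):")
# Assumption: only these two keywords open a variable-defining block.
BLOCK_KEYWORDS = {"vars", "set_fact"}


def variables_by_indent(lines):
    """Return (line_number, key) for keys nested under vars: or set_fact:."""
    found = []
    block_column = None  # column of the keyword that opened the current block
    for number, line in enumerate(lines, start=1):
        match = KEY_LINE.match(line)
        if not match:
            continue  # blank lines, comments, '---' and '...' change nothing
        column, key = match.start(1), match.group(1)
        if block_column is not None and column > block_column:
            found.append((number, key))  # deeper than the opener: a variable
            continue
        # Same or shallower indentation closes the block; maybe open a new one.
        block_column = column if key in BLOCK_KEYWORDS else None
    return found
```

Because blank lines are ignored, the result does not depend on the style convention of separating sections with an empty line. Nested mappings inside a vars: block would also be reported, which a real parser would need to handle.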

masatake commented 4 years ago

Universal-ctags has a built-in Ansible parser based on the YAML parser. The YAML parser is based on libyaml, so it is reliable. However, you may not be satisfied with the quality of the Ansible parser.

When I wrote the Ansible parser, I could not work out what its users might want because I was not familiar with Ansible.

The current implementation can only do the following:

[yamato@slave]~/var/ctags-github% cat Units/parser-ansibleplaybook.r/play-name.d/input.yml
- name: Update ctags
  yum: pkg=ctags latest
- name: Update etags
  yum: pkg=etags latest
[yamato@slave]~/var/ctags-github% u-ctags -o - Units/parser-ansibleplaybook.r/play-name.d/input.yml
Update ctags    Units/parser-ansibleplaybook.r/play-name.d/input.yml    /^- name: Update ctags$/;"  p   language:AnsiblePlaybook
Update etags    Units/parser-ansibleplaybook.r/play-name.d/input.yml    /^- name: Update etags$/;"  p   language:AnsiblePlaybook

Using --regex-ansible to solve this issue is not a good idea. What we have to do is improve ansibleplaybook.c.

This is my first response.

fourjay commented 4 years ago

Can I ask if my example a) makes sense (or needs more clarification) and b) is helpful?

To be clear, the prior web searches I've done for ansible and ctags are almost all about leveraging ctags in a way tangential to its normal purpose of aiding the search for definitions. Instead, the advice I've found is all about signalling local file structure (AKA "tasks by names"). The Ansible "name:" field is entirely descriptive, and is extremely unlikely to ever be the target of a link. The intended consumer for this sort of tag file is the vim plugin "Tagbar", which displays a (partial) semantic structure of the file. The "name:" field is used as the easiest source of descriptive text. This is not (to my mind) the intended purpose of a tag file, which seems to me to be about linking var/function use to target definitions.

Ansible has a notion of variables (and their definitions). There are three places I can see where variables are defined:

1) as value targets of the register: key. This captures output and allows querying output later on in the file.

2) as an (explicit) vars: section of a file (or "playbook" in Ansible speak). This is a set of key: val pairs where the key portion is referenced later (very often in a completely different file). This section is marked by indentation, and is terminated by another block-marking Ansible keyword. become: is a common example, where the overall file section does not change. The overall section can be terminated with the tasks: keyword.

3) the task section can contain a local vars definition, introduced by the set_fact: key. This starts a block of key: val pairs, each of which is a locally defined variable. The block is terminated by either the next task or another modifier section of the task in which set_fact: is scoped.
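Case 1 is the simplest to turn into tags, because the variable name is the value on the register: line rather than a key inside a block. The Python sketch below illustrates this and formats each hit as a pattern-style tags-file line (name, file, search pattern, kind). It assumes the registered name is a plain identifier on the same line, and reuses the kind letter v from the earlier regex rule. `REGISTER`, `escape_pattern` and `register_tags` are hypothetical names.

```python
import re

# Assumption: the registered name is a plain identifier on the same line.
REGISTER = re.compile(r"^\s*register:\s*([A-Za-z_][A-Za-z0-9_]*)\s*$")


def escape_pattern(line):
    """Escape backslashes and slashes so the line can sit inside /^...$/."""
    return line.replace("\\", "\\\\").replace("/", "\\/")


def register_tags(filename, lines):
    """Return sorted tags-file lines for variables defined via register:."""
    entries = []
    for line in lines:
        found = REGISTER.match(line)
        if found:
            pattern = escape_pattern(line)
            entries.append(f'{found.group(1)}\t{filename}\t/^{pattern}$/;"\tv')
    return sorted(entries)
```

The entries are sorted because editors commonly expect a sorted tags file. A real tags file would also carry the usual header lines, which are left out here.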

In YAML fashion, a key:val pattern can be defined as an inline key=val, but current best practice tends to use vertical style key: val

A "play" (a complete Ansible entity) consists of a hosts: section (which can contain a number of "locally global" subsections) and a tasks: section, consisting of an array of tasks. Each task requires a module, but typically starts with a descriptive name: value. The two sections do not have to exist in the same file.

Ansible has a library structure referred to as a "role", which is usually defined in a "roles" directory that can be a subdirectory of the working directory or live elsewhere. A role more or less mirrors the top-level Ansible structure, but is designed to encourage modularity, both in terms of task isolation and in subdividing the role into its own coherent subsections hidden from the calling routine. A role will often accept variable parameters, and those vars are defined mostly as above.