universal-ctags / ctags

A maintained ctags implementation
https://ctags.io
GNU General Public License v2.0

[RFC] Which language do you want us to support? #1566

Closed masatake closed 3 years ago

masatake commented 6 years ago

This is /dev/null (the 2nd meaning in https://en.wiktionary.org/wiki//dev/null), where requests go. We never promise to write the parser you want. If you want to write a parser, I will help you, but only if you provide the result under 'GPLv2 or later.' I want to merge the result into our code base.

I would like you to write a document that explains the target language specification. Many sample input files, or pointers to large code bases, may help developers.

You can use the thumbs-up emoji.

To keep this issue summarized, I will remove your comment.

For me (@masatake), parsers for most of the languages I use daily (C, Python, Shell, Make, Automake, Autoconf, Asm, RPMspec, and Ld-script) are available, and they work well, though there are many points I have to improve.

Members of the organization, feel free to update this comment.

If you find an implementation, including a .ctags file, for a language, you can report it here as I did for the asn1 parser. We can import the implementation if the author agrees to distribute it under GPL v2 or later. For importing, we need volunteers who help us write test cases.

Don't report issues about a client tool like vim here. ctags just generates tag files. It doesn't provide a tree-structured UI panel (or something similar). If you need help from the ctags side to improve your client tool, open an issue.

My queue:

masatake commented 6 years ago

In #2105, we will merge the Haskell parser developed at the Geany project.


Haskell

Just a question for those who put a "thumbs-up": is hasktags not enough for you?

(The comments you put in this PR will be summarized by me and then deleted to keep this PR simple.)


Merged the parser from the Geany project. See https://github.com/universal-ctags/ctags/pull/2712.

masatake commented 6 years ago

@ksamborski introduced a TypeScript parser. See #2064 (a79a8cd3a852e1acf338e928c30a10e8004b8242).


Typescript

Apr 1, 2018: 10 people want this, though VS Code may provide powerful features utilizing the language server protocol. Oct 20, 2018: 21 people want this.

https://github.com/Microsoft/TypeScript/blob/master/doc/spec.md

masatake commented 6 years ago

Improving clojure

masatake commented 6 years ago

Scala


Some hints are in #654.

masatake commented 6 years ago

Vala

I myself have no chance to read Vala code, but it is an interesting target for researching how to write a parser utilizing lex (and yacc) in ctags.


@masatake is working on this parser. https://github.com/universal-ctags/ctags/pull/2677

masatake commented 6 years ago

Improving Ruby parser


Please assist me in improving the Ruby parser. See #2000. Many important aspects are now improved, though I wouldn't say it is perfect.

masatake commented 6 years ago

lex & flex

masatake commented 6 years ago

I introduced @Roy-Orbison's scss parser. See #2191 (6adbc59).


scss

baruch commented 6 years ago

please add dlang, there is an old work for that at https://github.com/snosov1/ctags-d


@baruch, Universal-ctags already has a D parser, though I don't know its quality. Is the patch you introduced better than the one in Universal-ctags? If the answer is yes, or if there is an area we have to discuss, please open an issue. (added by @masatake)

masatake commented 6 years ago

xsd

masatake commented 6 years ago

asn1

We can import the changes in https://github.com/grogers0/ctags . It uses setjmp/longjmp. We should not use these functions, so rewriting is needed.

2019-12-16: I got permission to import the parser from the original author.

gwerbin commented 6 years ago

Markdown, reStructuredText, and AsciiDoc would be fantastic. They can be surprisingly quirky to parse with regular expressions, especially strict POSIX regular expressions.
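
One concrete quirk, as a minimal sketch: a line-by-line heading pattern also fires on `#` comment lines inside a fenced code block. The example below uses Python's `re` module rather than POSIX regular expressions, and the function names and sample text are made up for illustration; it is not how the ctags parsers work.

```python
import re

# Line-by-line ATX heading pattern: 1-6 '#' characters, whitespace, then the title.
ATX_HEADING = re.compile(r"^(#{1,6})[ \t]+(.*?)[ \t]*$")
# Opening/closing marker of a fenced code block (only '~~~' is used in the sample).
FENCE = re.compile(r"^(~~~|```)")


def naive_headings(text):
    """Return heading titles using the line-by-line pattern only."""
    titles = []
    for line in text.splitlines():
        m = ATX_HEADING.match(line)
        if m:
            titles.append(m.group(2))
    return titles


def fence_aware_headings(text):
    """Same pattern, but lines inside fenced code blocks are skipped."""
    titles, in_fence = [], False
    for line in text.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence  # toggle state on every fence marker
            continue
        if not in_fence:
            m = ATX_HEADING.match(line)
            if m:
                titles.append(m.group(2))
    return titles


SAMPLE = "\n".join([
    "# Title",
    "~~~sh",
    "# not a heading, just a shell comment",
    "~~~",
    "## Section",
])

if __name__ == "__main__":
    print(naive_headings(SAMPLE))        # includes the shell comment
    print(fence_aware_headings(SAMPLE))  # only the two real headings
```

The second function needs state that carries across lines, which is exactly what a single per-line regular expression cannot express.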


Now u-ctags has parsers for all of them. For some reasons (#1727), I wrote a regex-based one for Markdown. (added by @masatake)

masatake commented 6 years ago

java10 (var keyword)

hadrielk commented 6 years ago

Asciidoc is now in universal-ctags, with PR #1838.

qingkong1998 commented 5 years ago

swift

(Originally two languages, swift and objectivec, were listed here. @masatake deleted objectivec.) (Other comments related to client-side tools posted by @qingkong1998 were removed.)

stefantalpalaru commented 5 years ago

Nim

https://nim-lang.org/docs/manual.html (added by @masatake) https://github.com/topics/nim (for codebase added by @masatake)

After 3 PRs, it's still not supported: https://github.com/universal-ctags/ctags/pull/393, https://github.com/universal-ctags/ctags/pull/396 and https://github.com/universal-ctags/ctags/pull/401

vladak commented 5 years ago

Visual basic (see https://github.com/oracle/opengrok/issues/2619)

pylipp commented 5 years ago

QML

http://doc.qt.io/qt-5/qmlreference.html

bam80 commented 5 years ago

Shell with function calls detection

@masatake added:

I will not implement reference tags like function calls for a while.

masatake commented 5 years ago

ObjectiveC++

masatake commented 5 years ago

TOML https://github.com/toml-lang/toml https://github.com/toml-lang/toml/blob/master/toml.abnf

vladak commented 5 years ago

Groovy (see https://github.com/oracle/opengrok/issues/2677)

gonsolo commented 5 years ago

Swift

I have a basic swift.ctags file:

--langdef=swift
--langmap=swift:+.swift

--kinddef-swift=v,variable,variables
--kinddef-swift=f,function,functions
--kinddef-swift=s,struct,structs
--kinddef-swift=c,class,classes
--kinddef-swift=p,protocol,protocols
--kinddef-swift=e,enum,enums
--kinddef-swift=t,typealias,typealiases

--regex-swift=/(var|let)[ \t]+([^:=]+).*$/\2/v/
--regex-swift=/func[ \t]+([^\(\)]+)\([^\(\)]*\)/\1/f/
--regex-swift=/struct[ \t]+([^:\{]+).*$/\1/s/
--regex-swift=/class[ \t]+([^:\{]+).*$/\1/c/
--regex-swift=/protocol[ \t]+([^:\{]+).*$/\1/p/
--regex-swift=/enum[ \t]+([^:\{]+).*$/\1/e/
--regex-swift=/(typealias)[ \t]+([^:=]+).*$/\2/t/

Has anybody an idea how to correctly add the "try" keyword?
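
A rough way to try the patterns above against sample lines without running ctags, as a sketch only: the patterns are copied into Python's `re` module, which is assumed to behave like the regex engine ctags uses for patterns this simple. The helper name `tags_for_line` and the sample Swift lines are hypothetical.

```python
import re

# (kind letter, pattern, capture group holding the tag name).
# typealias is mapped to kind 't' here, matching its --kinddef line.
SWIFT_PATTERNS = [
    ("v", r"(var|let)[ \t]+([^:=]+).*$", 2),
    ("f", r"func[ \t]+([^\(\)]+)\([^\(\)]*\)", 1),
    ("s", r"struct[ \t]+([^:\{]+).*$", 1),
    ("c", r"class[ \t]+([^:\{]+).*$", 1),
    ("p", r"protocol[ \t]+([^:\{]+).*$", 1),
    ("e", r"enum[ \t]+([^:\{]+).*$", 1),
    ("t", r"(typealias)[ \t]+([^:=]+).*$", 2),
]


def tags_for_line(line):
    """Return (name, kind) pairs for every pattern that matches the line."""
    found = []
    for kind, pattern, group in SWIFT_PATTERNS:
        m = re.search(pattern, line)
        if m:
            found.append((m.group(group), kind))
    return found


if __name__ == "__main__":
    for sample in ["let answer: Int = 42",
                   "func greet(name: String) -> String {",
                   "class Foo: Bar {",
                   "var count = 0"]:
        print(repr(sample), "->", tags_for_line(sample))
```

In this sketch, `var count = 0` yields the name `"count "` with a trailing space, because `[^:=]+` also consumes whitespace. Whether ctags itself trims the captured name is not checked here; excluding whitespace from the character class would sidestep the question.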

craigbarnes commented 5 years ago

Is there really an advantage to having 100 parsers in a single code base? I seem to remember the Vala parser (from Anjuta) couldn't be merged here because it required libvala, which was deemed an unacceptable dependency. Using libvala is by far the cleanest approach -- it only becomes a problem because having 100 outside dependencies for 100 different parsers in a single, monolithic repo becomes unmanageable.

Why can't different parsers live in separate repos? For multi-language projects, it doesn't seem like much of a problem to run several tag generators and have them all append to the same tags file, coordinated by a script or build system. For example:

tags:
        ctags -o - src/*.[ch] > tags
        valatags -o - src/*.vala >> tags
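
One caveat with plain appending is that each generator's output may be sorted on its own while the concatenated file is not, which matters to clients that binary-search the tags file. A minimal sketch of a merge step, assuming both tools emit the usual tab-separated tag lines; the function name `merge_tags` and the sample entries are hypothetical, and conflicting `!_TAG_` pseudo-tags from different generators are not reconciled:

```python
def merge_tags(*outputs):
    """Merge tag-file texts: pseudo-tags (!_TAG_...) first, then sorted unique entries."""
    pseudo, entries = set(), set()
    for text in outputs:
        for line in text.splitlines():
            if not line:
                continue  # ignore blank lines between concatenated outputs
            if line.startswith("!_TAG_"):
                pseudo.add(line)
            else:
                entries.add(line)
    # Python sorts str by code point, which matches byte order for ASCII tag names.
    return "\n".join(sorted(pseudo) + sorted(entries)) + "\n"


if __name__ == "__main__":
    c_tags = 'main\tsrc/a.c\t/^int main(void)$/;"\tf\n'
    vala_tags = 'App\tsrc/app.vala\t/^class App {$/;"\tc\n'
    print(merge_tags(c_tags, vala_tags), end="")
```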

masatake commented 5 years ago

@craigbarnes, using other tag generators is a realistic option for users. Universal-ctags itself doesn't forbid it; it would be impossible for us to forbid it, and I would never want to. People should use what they want, though my recommendation is free software.

However, I insist on adding more parsers to the Universal-ctags executable. As you wrote, maintaining the code will become harder. However, the Wireshark developers manage well: Wireshark supports more than 1000 dissectors. We can learn from the techniques used in Wireshark.

What I want as a developer of u-ctags is not a multi-language tool. I want a cross-language tool.

[jet@living]~/var/ctags% cat /tmp/foo.html
<html>
  <h1>title</h1>
  <script>
    var f = function (n) {
    return n + 1;
    }
  </script>
</html>
[jet@living]~/var/ctags% ./ctags -o - --fields=+l --extras=+g /tmp/foo.html
f   /tmp/foo.html   /^    var f = function (n) {$/;"    f   language:JavaScript
title   /tmp/foo.html   /^  <h1>title<\/h1>$/;" h   language:HTML

This is just a simple example. I think we can explore this idea much further. Supporting multiple languages in a single executable is the basis for exploring this idea.

If you volunteer, you can maintain the page "other tagging engine" in our Wiki site. My English is not good, so I have not edited that page.

However, it looks like you are a good programmer; writing a parser for Vala from scratch will not take you much time.
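
To illustrate what a client could do with such output, here is a minimal sketch that groups tag names by the `language:` field produced by `--fields=+l`. It assumes tab-separated tag lines with no tab inside the search pattern; the function name `tags_by_language` is hypothetical, and the sample lines mirror the transcript above.

```python
from collections import defaultdict


def tags_by_language(tags_text):
    """Group tag names by their 'language:' extension field."""
    groups = defaultdict(list)
    for line in tags_text.splitlines():
        if not line or line.startswith("!_TAG_"):
            continue  # skip blank lines and pseudo-tags
        fields = line.split("\t")
        name = fields[0]
        # fields[1] is the file, fields[2] the pattern; the rest are extension fields.
        for field in fields[3:]:
            if field.startswith("language:"):
                groups[field[len("language:"):]].append(name)
    return dict(groups)


SAMPLE = "\n".join([
    'f\t/tmp/foo.html\t/^    var f = function (n) {$/;"\tf\tlanguage:JavaScript',
    'title\t/tmp/foo.html\t/^  <h1>title<\\/h1>$/;"\th\tlanguage:HTML',
])

if __name__ == "__main__":
    print(tags_by_language(SAMPLE))
```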

vbs100 commented 5 years ago

VS code extension,

Currently the existing ctags extension for VS Code is very slow, and not as good as the "C++ Intellisense" extension for global/gtags.

global supports fewer languages than universal-ctags, so it would be very powerful if universal-ctags could be used in VS Code.

masatake commented 5 years ago

VS code extension,

This is not the name of a language. Vim, VS Code, Atom, Emacs, etc. are client tools. This project is not for developing a client tool, though we can give advice to developers of client tools. If the client tool you use is too slow, tell its developers, not us.

svermeulen commented 5 years ago

Moonscript

hiAndrewQuinn commented 5 years ago

Julia


The Julia parser is now part of ctags. See https://github.com/universal-ctags/ctags/pull/2654. Appended by @masatake.

janEbert commented 5 years ago

Julia

Some progress is here, though it's error-prone using only regexes: https://github.com/JuliaEditorSupport/julia-ctags/

ghost commented 5 years ago

forth language

lukamac commented 4 years ago

~SystemVerilog~


(@masatake edited) ctags already has a parser for it.

lukamac commented 4 years ago

@masatake my bad :tired_face:

0d1ndgod commented 4 years ago

vlang

ByLiZhao commented 4 years ago

~tex/latex~


(@masatake edited) ctags already has a parser for it.

zerubeus commented 4 years ago

tsx please

vladak commented 4 years ago

terraform (https://github.com/oracle/opengrok/pull/3108)

masatake commented 4 years ago

thrift

https://thrift.apache.org/docs/idl https://raw.githubusercontent.com/apache/thrift/master/test/ThriftTest.thrift

jajajasalu2 commented 4 years ago

ReasonML

wuseal commented 4 years ago

Kotlin, Please, Thank you


See #2769 (appended by @masatake).

marcastel commented 4 years ago

Kornshell (AST available)

marcastel commented 4 years ago

Pandoc (rather than simply Markdown... by using the Pandoc AST all tokens can be very simply extracted)

Which means that with one implementation you gain support for (see https://pandoc.org for full list):

Pandoc could either be an external binary dependency, or a shared Haskell library.

TrentSe commented 4 years ago

Prolog would be great. I'm happy to assist (though not my area of expertise...).

masatake commented 4 years ago

@TrentSe,

I'm happy to assist

I rarely get this kind of offer. I wrote how to assist me in the initial comment of this issue.

I would like you to write a document that explains the target language specification. Many sample input files, or pointers to large code bases, may help developers.

TrentSe commented 4 years ago

Hey Masatake,

You're most welcome. I'd be attempting it myself without your help, so makes sense for us to team up!

I'll read up and get back to you...

Where would you like me to put the input files / language spec...?

masatake commented 4 years ago

You're most welcome. I'd be attempting it myself without your help, so makes sense for us to team up!

Great!

Where would you like me to put the input files / language spec...?

Please open a new issue for developing our Prolog parser.

TrentSe commented 4 years ago

Please, open a new issue for developing our prolog parser.

Thanks Masatake. Created as #2628.

Shane-XB-Qian commented 3 years ago

Would you mind adding a filter pattern for vim9 script?

// There may also be an (export) 'class', but for now, those two.

// And actually/also :

var foo: dict<any> = {
  # something...
}
foo->extend({
  bar: function('s:bar', [foo])
})

Then, for now, it looks like foo.bar() cannot find s:bar.

Shane-XB-Qian commented 3 years ago

vim9 means vim9 script, a new version of vim script, as opposed to the legacy one. // So perhaps it can be considered a new language? If you don't think so, that's OK; then let me open a 'new issue'.

// Please ignore the naming in the example; it's not related to LSP, it's just an example, which for now does not seem to be supported by ctags when parsing the vim script language. // I updated my comment as well; sorry for the confusion.

masatake commented 3 years ago

How different is vim9 script from vim8 script? If it is very different, someone has to write a new parser for it. If it is not, extending the vim parser in ctags is enough to support vim9.