universal-ctags / ctags

A maintained ctags implementation
https://ctags.io
GNU General Public License v2.0

Java: parsing stops after nested anonymous class #2432

Open idodeclare opened 4 years ago

idodeclare commented 4 years ago

The name of the parser: java.c

The command line you used to run ctags:

$ ctags \
    --options=NONE \
    --kinds-c=+l \
    --kinds-java=+l \
    --kinds-sql=+l \
    --kinds-Fortran=+L \
    --kinds-C++=+l \
    --file-scope=yes \
    -u \
    --fields=-af+iKnS \
    --excmd=pattern \
    --langdef=scala \
    --regex-scala=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*(private|protected)?[[:space:]]*class[[:space:]]+([a-zA-Z0-9_]+)/\\4/c,classes/' \
    --regex-scala=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*(private|protected)?[[:space:]]*object[[:space:]]+([a-zA-Z0-9_]+)/\\4/o,objects/' \
    --regex-scala=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*(private|protected)?[[:space:]]*case class[[:space:]]+([a-zA-Z0-9_]+)/\\4/C,caseClasses/' \
    --regex-scala=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*(private|protected)?[[:space:]]*case object[[:space:]]+([a-zA-Z0-9_]+)/\\4/O,caseObjects/' \
    --regex-scala=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*(private|protected)?[[:space:]]*trait[[:space:]]+([a-zA-Z0-9_]+)/\\4/t,traits/' \
    --regex-scala=$'/^[[:space:]]*type[[:space:]]+([a-zA-Z0-9_]+)/\\1/T,types/' \
    --regex-scala=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy|private|protected)[[:space:]]*)*def[[:space:]]+([a-zA-Z0-9_]+)/\\3/m,methods/' \
    --regex-scala=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*val[[:space:]]+([a-zA-Z0-9_]+)/\\3/l,constants/' \
    --regex-scala=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*var[[:space:]]+([a-zA-Z0-9_]+)/\\3/v,variables/' \
    --regex-scala=$'/^[[:space:]]*package[[:space:]]+([a-zA-Z0-9_.]+)/\\1/p,packages/' \
    --langdef=haskell \
    --regex-haskell=$'/^[[:space:]]*class[[:space:]]+([a-zA-Z0-9_]+)/\\1/c,classes/' \
    --regex-haskell=$'/^[[:space:]]*data[[:space:]]+([a-zA-Z0-9_]+)/\\1/t,types/' \
    --regex-haskell=$'/^[[:space:]]*newtype[[:space:]]+([a-zA-Z0-9_]+)/\\1/t,types/' \
    --regex-haskell=$'/^[[:space:]]*type[[:space:]]+([a-zA-Z0-9_]+)/\\1/t,types/' \
    --regex-haskell=$'/^([a-zA-Z0-9_]+).*[[:space:]]+={1}[[:space:]]+/\\1/f,functions/' \
    --regex-haskell=$'/[[:space:]]+([a-zA-Z0-9_]+).*[[:space:]]+={1}[[:space:]]+/\\1/f,functions/' \
    --regex-haskell=$'/^(let|where)[[:space:]]+([a-zA-Z0-9_]+).*[[:space:]]+={1}[[:space:]]+/\\2/f,functions/' \
    --regex-haskell=$'/[[:space:]]+(let|where)[[:space:]]+([a-zA-Z0-9_]+).*[[:space:]]+={1}[[:space:]]+/\\2/f,functions/' \
    --regex-clojure=$'/\\([[:space:]]*create-ns[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/n,namespace/' \
    --regex-clojure=$'/\\([[:space:]]*def[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/d,definition/' \
    --regex-clojure=$'/\\([[:space:]]*defn-[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/p,privateFunction/' \
    --regex-clojure=$'/\\([[:space:]]*defmacro[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/m,macro/' \
    --regex-clojure=$'/\\([[:space:]]*definline[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/i,inline/' \
    --regex-clojure=$'/\\([[:space:]]*defmulti[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/a,multimethodDefinition/' \
    --regex-clojure=$'/\\([[:space:]]*defmethod[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/b,multimethodInstance/' \
    --regex-clojure=$'/\\([[:space:]]*defonce[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/c,definitionOnce/' \
    --regex-clojure=$'/\\([[:space:]]*defstruct[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/s,struct/' \
    --regex-clojure=$'/\\([[:space:]]*intern[[:space:]]+([-[:alnum:]*+!_:\\/.?]+)/\\1/v,intern/' \
    --langdef=kotlin \
    --regex-kotlin=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*(private[^ ]*|protected)?[[:space:]]*class[[:space:]]+([[:alnum:]_:]+)/\\4/c,classes/' \
    --regex-kotlin=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*(private[^ ]*|protected)?[[:space:]]*object[[:space:]]+([[:alnum:]_:]+)/\\4/o,objects/' \
    --regex-kotlin=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*(private[^ ]*|protected)?[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*data class[[:space:]]+([[:alnum:]_:]+)/\\6/d,dataClasses/' \
    --regex-kotlin=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy)[[:space:]]*)*(private[^ ]*|protected)?[[:space:]]*interface[[:space:]]+([[:alnum:]_:]+)/\\4/i,interfaces/' \
    --regex-kotlin=$'/^[[:space:]]*type[[:space:]]+([[:alnum:]_:]+)/\\1/T,types/' \
    --regex-kotlin=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy|private[^ ]*(\\[[a-z]*\\])*|protected)[[:space:]]*)*fun[[:space:]]+([[:alnum:]_:]+)/\\4/m,methods/' \
    --regex-kotlin=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy|private[^ ]*|protected)[[:space:]]*)*val[[:space:]]+([[:alnum:]_:]+)/\\3/C,constants/' \
    --regex-kotlin=$'/^[[:space:]]*((abstract|final|sealed|implicit|lazy|private[^ ]*|protected)[[:space:]]*)*var[[:space:]]+([[:alnum:]_:]+)/\\3/v,variables/' \
    --regex-kotlin=$'/^[[:space:]]*package[[:space:]]+([[:alnum:]_.:]+)/\\1/p,packages/' \
    --regex-kotlin=$'/^[[:space:]]*import[[:space:]]+([[:alnum:]_.:]+)/\\1/I,imports/' \
    --langdef=swift \
    --regex-swift=$'/enum[[:space:]]+([^\\{\\}]+).*$/\\1/n,enum,enums/' \
    --regex-swift=$'/typealias[[:space:]]+([^:=]+).*$/\\1/t,typealias,typealiases/' \
    --regex-swift=$'/protocol[[:space:]]+([^:\\{]+).*$/\\1/p,protocol,protocols/' \
    --regex-swift=$'/struct[[:space:]]+([^:\\{]+).*$/\\1/s,struct,structs/' \
    --regex-swift=$'/class[[:space:]]+([^:\\{]+).*$/\\1/c,class,classes/' \
    --regex-swift=$'/func[[:space:]]+([^\\(\\)]+)\\([^\\(\\)]*\\)/\\1/f,function,functions/' \
    --regex-swift=$'/(var|let)[[:space:]]+([^:=]+).*$/\\2/v,variable,variables/' \
    --regex-swift=$'/^[[:space:]]*extension[[:space:]]+([^:\\{]+).*$/\\1/e,extension,extensions/' \
    --regex-rust=$'/^[[:space:]]*(pub[[:space:]]+)?(static|const)[[:space:]]+(mut[[:space:]]+)?([[:alnum:]_]+)/\\4/C,consts,staticConstants/' \
    --regex-rust=$'/^[[:space:]]*(pub[[:space:]]+)?(unsafe[[:space:]]+)?impl([[:space:]\n]*<[^>]*>)?[[:space:]]+(([[:alnum:]_:]+)[[:space:]]*(<[^>]*>)?[[:space:]]+(for)[[:space:]]+)?([[:alnum:]_]+)/\\5 \\7 \\8/I,impls,traitImplementations/' \
    --regex-rust=$'/^[[:space:]]*(pub[[:space:]]+)?(unsafe[[:space:]]+)?trait[[:space:]]+([[:alnum:]_]+)/\\3/r,traits,traits/' \
    --regex-rust=$'/^[[:space:]]*let[[:space:]]+(mut)?[[:space:]]+([[:alnum:]_]+)/\\2/V,variables/' \
    --regex-pascal=$'/([[:alnum:]_]+)[[:space:]]*=[[:space:]]*\\([[:space:]]*[[:alnum:]_][[:space:]]*\\)/\\1/t,Type/' \
    --regex-pascal=$'/([[:alnum:]_]+)[[:space:]]*=[[:space:]]*class[[:space:]]*[^;]*$/\\1/c,Class/' \
    --regex-pascal=$'/([[:alnum:]_]+)[[:space:]]*=[[:space:]]*interface[[:space:]]*[^;]*$/\\1/i,interface/' \
    --regex-pascal=$'/^constructor[[:space:]]+(T[a-zA-Z0-9_]+(<[a-zA-Z0-9_, ]+>)?\\.)([a-zA-Z0-9_<>, ]+)(.*)+/\\1\\3/n,Constructor/' \
    --regex-pascal=$'/^destructor[[:space:]]+(T[a-zA-Z0-9_]+(<[a-zA-Z0-9_, ]+>)?\\.)([a-zA-Z0-9_<>, ]+)(.*)+/\\1\\3/d,Destructor/' \
    --regex-pascal=$'/^(procedure)[[:space:]]+T[a-zA-Z0-9_<>, ]+\\.([a-zA-Z0-9_<>, ]+)(.*)/\\2/p,procedure/' \
    --regex-pascal=$'/^(function)[[:space:]]+T[a-zA-Z0-9_<>, ]+\\.([a-zA-Z0-9_<>, ]+)(.*)/\\2/f,function/' \
    --regex-pascal=$'/^[[:space:]]*property[[:space:]]+([a-zA-Z0-9_<>, ]+)[[:space:]]*\\:(.*)/\\1/o,property/' \
    --regex-pascal=$'/^(uses|interface|implementation)$/\\1/s,Section/' \
    --regex-pascal=$'/^unit[[:space:]]+([a-zA-Z0-9_<>, ]+)[;(]/\\1/u,unit/' \
    --regex-powershell=$'/\\$(\\{[^}]+\\})/\\1/v,variable/' \
    --regex-powershell=$'/\\$([[:alnum:]_]+([:.][[:alnum:]_]+)*)/\\1/v,variable/' \
    --regex-powershell=$'/^[[:space:]]*(:[^[:space:]]+)/\\1/l,label/' \
    --_fielddef-powershell=$'signature,signatures' \
    --fields-powershell=$'+{signature}' \
    --regex-powershell=$'/`\\$([[:alnum:]_]+([:.][[:alnum:]_]+)*)/\\1//{exclusive}' \
    --regex-powershell=$'/`\\$(\\{[^}]+\\})/\\1//{exclusive}' \
    --regex-powershell=$'/#.*\\$([[:alnum:]_]+([:.][[:alnum:]_]+)*)/\\1//{exclusive}' \
    --regex-powershell=$'/#.*\\$(\\{[^}]+\\})/\\1//{exclusive}' \
    --regex-powershell=$'/^[[:space:]]*(function|filter)[[:space:]]+([^({[:space:]]+)[[:space:]]*(\\(([^)]+)\\))?/\\2/f,function,functions/{icase}{exclusive}{_field=signature:(\\4)}' \
    --langmap=Ada:+.ada.Ada.ADA \
    --langmap=Ada:+.adb.Adb.ADB \
    --langmap=Ada:+.ads.Ads.ADS \
    --langmap=Java:+.aidl.Aidl.AIDL \
    --langmap=XML:+.asax.Asax.ASAX \
    --langmap=XML:+.ascx.Ascx.ASCX \
    --langmap=Asm:+.asm.Asm.ASM \
    --langmap=XML:+.aspx.Aspx.ASPX \
    --langmap=Sh:+.awk.Awk.AWK \
    --langmap=Sh:+.bash.Bash.BASH \
    --langmap=C:+.c.C \
    --langmap=C++:+.c++.C++.C++ \
    --langmap=C++:+.cc.Cc.CC \
    --langmap=clojure:+.clj.Clj.CLJ \
    --langmap=clojure:+.cljs.Cljs.CLJS \
    --langmap=clojure:+.cljx.Cljx.CLJX \
    --langmap=Sh:+.com.Com.COM \
    --langmap=Sh:+.conf.Conf.CONF \
    --langmap=C++:+.cpp.Cpp.CPP \
    --langmap=$'C#:+.cs.Cs.CS' \
    --langmap=Sh:+.csh.Csh.CSH \
    --langmap=C++:+.cxx.Cxx.CXX \
    --langmap=C:+.d.D \
    --langmap=Eiffel:+.e.E \
    --langmap=Lisp:+.el.El.EL \
    --langmap=Erlang:+.erl.Erl.ERL \
    --langmap=Erlang:+.escript.Escript.ESCRIPT \
    --langmap=Tcl:+.exp.Exp.EXP \
    --langmap=Fortran:+.f.F \
    --langmap=Fortran:+.f03.F03.F03 \
    --langmap=Fortran:+.f08.F08.F08 \
    --langmap=Fortran:+.f15.F15.F15 \
    --langmap=Fortran:+.f90.F90.F90 \
    --langmap=Fortran:+.f95.F95.F95 \
    --langmap=Sh:+.flg.Flg.FLG \
    --langmap=Sh:+.gmk.Gmk.GMK \
    --langmap=Go:+.go.Go.GO \
    --langmap=C:+.h.H \
    --langmap=C++:+.hh.Hh.HH \
    --langmap=C++:+.hpp.Hpp.HPP \
    --langmap=Erlang:+.hrl.Hrl.HRL \
    --langmap=haskell:+.hs.Hs.HS \
    --langmap=haskell:+.hsc.Hsc.HSC \
    --langmap=XML:+.htm.Htm.HTM \
    --langmap=XML:+.html.Html.HTML \
    --langmap=C++:+.hxx.Hxx.HXX \
    --langmap=C:+.i.I \
    --langmap=Tcl:+.itcl.Itcl.ITCL \
    --langmap=Tcl:+.itk.Itk.ITK \
    --langmap=Java:+.jav.Jav.JAV \
    --langmap=Java:+.java.Java.JAVA \
    --langmap=JavaScript:+.js.Js.JS \
    --langmap=JSON:+.json.Json.JSON \
    --langmap=Sh:+.ksh.Ksh.KSH \
    --langmap=Sh:+.kshlib.Kshlib.KSHLIB \
    --langmap=kotlin:+.kt.Kt.KT \
    --langmap=kotlin:+.kts.Kts.KTS \
    --langmap=C:+.l.L \
    --langmap=C:+.lex.Lex.LEX \
    --langmap=Lisp:+.lisp.Lisp.LISP \
    --langmap=Lisp:+.lsp.Lsp.LSP \
    --langmap=Lua:+.lua.Lua.LUA \
    --langmap=XML:+.master.Master.MASTER \
    --langmap=Sh:+.mk.Mk.MK \
    --langmap=Sh:+.p5.P5.P5 \
    --langmap=pascal:+.pas.Pas.PAS \
    --langmap=SQL:+.pck.Pck.PCK \
    --langmap=Perl:+.perl.Perl.PERL \
    --langmap=Perl:+.ph.Ph.PH \
    --langmap=PHP:+.php.Php.PHP \
    --langmap=PHP:+.php3.Php3.PHP3 \
    --langmap=PHP:+.php4.Php4.PHP4 \
    --langmap=PHP:+.phps.Phps.PHPS \
    --langmap=PHP:+.phtml.Phtml.PHTML \
    --langmap=SQL:+.pkb.Pkb.PKB \
    --langmap=SQL:+.pks.Pks.PKS \
    --langmap=Perl:+.pl.Pl.PL \
    --langmap=SQL:+.plb.Plb.PLB \
    --langmap=SQL:+.pld.Pld.PLD \
    --langmap=SQL:+.pls.Pls.PLS \
    --langmap=Perl:+.plx.Plx.PLX \
    --langmap=Perl:+.pm.Pm.PM \
    --langmap=powershell:+.ps1.Ps1.PS1 \
    --langmap=powershell:+.psm1.Psm1.PSM1 \
    --langmap=Python:+.py.Py.PY \
    --langmap=Ruby:+.rb.Rb.RB \
    --langmap=rust:+.rs.Rs.RS \
    --langmap=Ruby:+.ruby.Ruby.RUBY \
    --langmap=Asm:+.s.S \
    --langmap=scala:+.scala.Scala.SCALA \
    --langmap=Lisp:+.scm.Scm.SCM \
    --langmap=Sh:+.sh.Sh.SH \
    --langmap=Sh:+.spec.Spec.SPEC \
    --langmap=SQL:+.sql.Sql.SQL \
    --langmap=SystemVerilog:+.sv.Sv.SV \
    --langmap=SystemVerilog:+.svh.Svh.SVH \
    --langmap=swift:+.swift.Swift.SWIFT \
    --langmap=C++:+.tcc.Tcc.TCC \
    --langmap=Tcl:+.tcl.Tcl.TCL \
    --langmap=Tcl:+.tclx.Tclx.TCLX \
    --langmap=Tcl:+.tk.Tk.TK \
    --langmap=Tcl:+.tm.Tm.TM \
    --langmap=C++:+.txx.Txx.TXX \
    --langmap=SystemVerilog:+.v.V \
    --langmap=SystemVerilog:+.vh.Vh.VH \
    --langmap=Tcl:+.wish.Wish.WISH \
    --langmap=C:+.x.X \
    --langmap=XML:+.xaml.Xaml.XAML \
    --langmap=Sh:+.xcl.Xcl.XCL \
    --langmap=XML:+.xml.Xml.XML \
    --langmap=C:+.xs.Xs.XS \
    --langmap=C:+.y.Y \
    --langmap=C:+.yacc.Yacc.YACC \
    --langmap=$'Sh:+([mM][aA][kK][eE][fF][iI][lL][eE]*)' \
    "$@"

The content of input file:

package foo.bar;

import java.lang.Runnable;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public final class Foo {
    public int getValue1() {
        return 1;
    }

    private Object newObject() {
        return Executors.newFixedThreadPool(
                2,
                new ThreadFactory() {
                @Override
                public Thread newThread(Runnable runnable) {
                    Thread thread = Executors.defaultThreadFactory().newThread(runnable);
                    thread.setName("demo-" + thread.getId());
                    return thread;
                }
            });
    }

    public int getValue2() {
        return 2;
    }
}

The tags output you are not satisfied with:

!_TAG_FILE_FORMAT   2   /extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED   0   /0=unsorted, 1=sorted, 2=foldcase/
!_TAG_PROGRAM_AUTHOR    Universal Ctags Team    //
!_TAG_PROGRAM_NAME  Universal Ctags /Derived from Exuberant Ctags/
!_TAG_PROGRAM_URL   https://ctags.io/   /official site/
!_TAG_PROGRAM_VERSION   0.0.0   /fd99ef5a/
!_TAG_OUTPUT_MODE   u-ctags /u-ctags or e-ctags/
!_TAG_OUTPUT_FILESEP    slash   /slash or backslash/
foo.bar s.java  /^package foo.bar;$/;"  package line:1
Foo s.java  /^public final class Foo {$/;"  class   line:7
getValue1   s.java  /^    public int getValue1() {$/;"  method  line:8  class:Foo   signature:()
newObject   s.java  /^    private Object newObject() {$/;"  method  line:12 class:Foo   signature:()

The tags output you expect:

!_TAG_FILE_FORMAT   2   /extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED   0   /0=unsorted, 1=sorted, 2=foldcase/
!_TAG_PROGRAM_AUTHOR    Universal Ctags Team    //
!_TAG_PROGRAM_NAME  Universal Ctags /Derived from Exuberant Ctags/
!_TAG_PROGRAM_URL   https://ctags.io/   /official site/
!_TAG_PROGRAM_VERSION   0.0.0   /fd99ef5a/
!_TAG_OUTPUT_MODE   u-ctags /u-ctags or e-ctags/
!_TAG_OUTPUT_FILESEP    slash   /slash or backslash/
foo.bar s.java  /^package foo.bar;$/;"  package line:1
Foo s.java  /^public final class Foo {$/;"  class   line:7
getValue1   s.java  /^    public int getValue1() {$/;"  method  line:8  class:Foo   signature:()
newObject   s.java  /^    private Object newObject() {$/;"  method  line:12 class:Foo   signature:()
newThread   s.java  /^                public Thread newThread(Runnable runnable) {$/;"  method  line:17 class:[ThreadFactory >> anonymous]  signature:(Runnable runnable)
getValue2   s.java  /^    public int getValue2() {$/;"  method  line:25 class:Foo   signature:()
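
The difference between the two listings above can be checked mechanically. The following is a minimal sketch, not part of ctags: the class `TagDiff` and its methods are hypothetical helpers, and it assumes real tags files separate fields with tabs (the tabs appear as spaces in the listings above). It reports the tag names present in the expected output but absent from the actual output.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for comparing two tags listings; not part of ctags.
final class TagDiff {
    /** Collects tag names (first tab-separated field), skipping "!_TAG_" pseudo-tags. */
    static Set<String> tagNames(List<String> lines) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : lines) {
            if (line.isEmpty() || line.startsWith("!_TAG_")) {
                continue;
            }
            int tab = line.indexOf('\t');
            names.add(tab < 0 ? line : line.substring(0, tab));
        }
        return names;
    }

    /** Returns names that occur in the expected listing but not in the actual one. */
    static List<String> missing(List<String> expected, List<String> actual) {
        Set<String> have = tagNames(actual);
        List<String> result = new ArrayList<>();
        for (String name : tagNames(expected)) {
            if (!have.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Tag lines taken from this report, with tabs restored as an assumption.
        List<String> actual = List.of(
            "!_TAG_FILE_SORTED\t0\t/0=unsorted, 1=sorted, 2=foldcase/",
            "foo.bar\ts.java\t/^package foo.bar;$/;\"\tpackage\tline:1",
            "Foo\ts.java\t/^public final class Foo {$/;\"\tclass\tline:7",
            "getValue1\ts.java\t/^    public int getValue1() {$/;\"\tmethod\tline:8\tclass:Foo\tsignature:()",
            "newObject\ts.java\t/^    private Object newObject() {$/;\"\tmethod\tline:12\tclass:Foo\tsignature:()");

        List<String> expected = new ArrayList<>(actual);
        expected.add("newThread\ts.java\t/^                public Thread newThread(Runnable runnable) {$/;\"\tmethod\tline:17\tclass:[ThreadFactory >> anonymous]\tsignature:(Runnable runnable)");
        expected.add("getValue2\ts.java\t/^    public int getValue2() {$/;\"\tmethod\tline:25\tclass:Foo\tsignature:()");

        System.out.println(missing(expected, actual)); // prints [newThread, getValue2]
    }
}
```

Both missing tags come from source lines at or after the anonymous class, which matches the title of this report.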

The version of ctags:

$ ctags --version
Universal Ctags 0.0.0(fd99ef5a), Copyright (C) 2015 Universal Ctags Team
Universal Ctags is derived from Exuberant Ctags.
Exuberant Ctags 5.8, Copyright (C) 1996-2009 Darren Hiebert
  Compiled: Feb 19 2020, 22:14:56
  URL: https://ctags.io/
  Optional compiled features: +wildcards, +regex, +iconv, +option-directory, +xpath, +case-insensitive-filenames, +packcc

How do you get ctags binary:

Binary built from the Universal-ctags/homebrew-universal-ctags project.

masatake commented 4 years ago

This may be a duplicate of #1739.

idodeclare commented 4 years ago

Not quite. My emphasis is on what happens after the anonymous class: no tags are produced for anything that follows it. For example, tags are produced only up to line 153 of a 2008-line file: https://github.com/oracle/opengrok/blob/master/opengrok-indexer/src/main/java/org/opengrok/indexer/configuration/RuntimeEnvironment.java
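
One way to find the line at which tagging stops is to take the largest `line:` field in the tags output. The sketch below is a hypothetical helper, not part of ctags. It assumes the `n` field is enabled (as with `--fields=-af+iKnS` in the command above), that fields are tab-separated, and that the extension fields begin after the first `;"` followed by a tab.

```java
import java.util.List;

// Hypothetical helper for locating where tag output stops; not part of ctags.
final class LastTaggedLine {
    /** Returns the largest line:N extension field found, or 0 if there is none. */
    static int lastTaggedLine(List<String> tagLines) {
        int max = 0;
        for (String line : tagLines) {
            if (line.startsWith("!_TAG_")) {
                continue; // pseudo-tag header, carries no source line
            }
            int end = line.indexOf(";\"\t"); // end of the search pattern (assumed)
            if (end < 0) {
                continue;
            }
            for (String field : line.substring(end + 3).split("\t")) {
                if (field.matches("line:\\d+")) {
                    max = Math.max(max, Integer.parseInt(field.substring(5)));
                }
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // The unsatisfactory output from this report, with tabs restored as an assumption.
        List<String> actual = List.of(
            "!_TAG_FILE_SORTED\t0\t/0=unsorted, 1=sorted, 2=foldcase/",
            "foo.bar\ts.java\t/^package foo.bar;$/;\"\tpackage\tline:1",
            "Foo\ts.java\t/^public final class Foo {$/;\"\tclass\tline:7",
            "getValue1\ts.java\t/^    public int getValue1() {$/;\"\tmethod\tline:8\tclass:Foo\tsignature:()",
            "newObject\ts.java\t/^    private Object newObject() {$/;\"\tmethod\tline:12\tclass:Foo\tsignature:()");

        System.out.println(lastTaggedLine(actual)); // prints 12
    }
}
```

For the reduced example the result is 12, the line of `newObject`, the method that contains the anonymous class. The same check on a larger file gives the kind of figure quoted above.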