antlr / grammars-v4

Grammars written for ANTLR v4; expectation that the grammars are free of actions.
MIT License

Typescript example at http://lab.antlr.org/ is broken #3413

Open abulka opened 1 year ago

abulka commented 1 year ago

The Typescript example at http://lab.antlr.org/ is broken. There is no output at all when the Run button is pressed.

P.S. Locally, I've compiled the Java files OK but am not sure how to run the parser. Running grun Typescript program examples/Class.ts -tree gives me an error:

Exception in thread "main" java.lang.NoClassDefFoundError: TypeScriptLexer (wrong name: TypescriptLexer)
        at java.base/java.lang.ClassLoader.defineClass1(Native Method)
        at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1013)
        at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
        at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
        at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
        at org.antlr.v4.gui.TestRig.process(TestRig.java:129)
        at org.antlr.v4.gui.TestRig.main(TestRig.java:119)

P.P.S. I also tried mvn clean install in the grammars-v4/javascript/typescript directory which generated target/typescript-1.0-SNAPSHOT.jar but I'm not sure how to use that from the command line to parse a file.

kaby76 commented 1 year ago

abulka commented 1 year ago

I did happen to copy the files from Java/* before compiling and running, as I was aiming for a Java target.

cd grammars-v4/javascript/typescript
cp Java/* . 
antlr4  TypeScriptLexer.g4 TypeScriptParser.g4
javac *.java

Then running via grun (as shown above) gave me that error. These same steps worked for the CSharp grammar with the Java target, so I was hoping they would work for TypeScript too.

There doesn't seem to be a version of transformGrammar.py in the grammars-v4/javascript/typescript directory tree, though surely I don't need to run that for a pure Java target.

As for running trgen, that looks like a different project/installation, and I am not clear what the steps would be.

kaby76 commented 1 year ago

lab.antlr.org cannot work with the javascript/typescript grammar. It has actions.

But, the Java target works fine. You didn't spell the grammar name correctly. It's "TypeScript" not "Typescript".

cat ../examples/Function.ts | java -cp 'c:/Users/Kenne/.m2/antlr4-4.12.0-complete.jar;.' org.antlr.v4.gui.TestRig TypeScript program -gui
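
For readers hitting the same error: the stack trace earlier in the thread suggests that TestRig derives the class names to load by appending "Lexer" and "Parser" to the grammar name given on the command line, so that name has to match the generated classes exactly, including case. The following is only a minimal sketch of that kind of lookup. GrammarNameLookup and its methods are hypothetical names used for illustration, not part of ANTLR, and java.lang.String stands in for a generated class.

```java
// Sketch of a name-based class lookup, similar in spirit to what the
// stack trace suggests TestRig does. All names here are hypothetical.
final class GrammarNameLookup {

    // Class name a TestRig-style loader would look for as the lexer.
    static String lexerClassName(String grammarName) {
        return grammarName + "Lexer";
    }

    // Class name a TestRig-style loader would look for as the parser.
    static String parserClassName(String grammarName) {
        return grammarName + "Parser";
    }

    // Returns true only if a class with exactly this name can be loaded.
    static boolean canLoad(String className) {
        try {
            Class.forName(className, false, GrammarNameLookup.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            // A wrong-case name ends up here, either as "not found" or,
            // on a case-insensitive file system, as a "wrong name" error.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(lexerClassName("TypeScript")); // TypeScriptLexer
        System.out.println(lexerClassName("Typescript")); // TypescriptLexer
        System.out.println(canLoad("java.lang.String"));  // exact case
        System.out.println(canLoad("java.lang.string"));  // wrong case
    }
}
```

On a case-insensitive file system the JVM can locate TypeScriptLexer.class when asked for TypescriptLexer and then reject it because the name inside the class file differs, which is likely what the "wrong name" message in the first trace reports.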


abulka commented 1 year ago

Hmmm, TypeScript rather than Typescript doesn't work for me either:

$ cat examples/Function.ts | grun TypeScript program -gui
Can't load TypeScript as lexer or parser

$ grun TypeScript program examples/Class.ts -tree
Can't load TypeScript as lexer or parser

I have copied the Java/* files in OK and everything compiles OK. My aliases are set up as per https://github.com/antlr/antlr4/blob/master/doc/getting-started.md and work ok for the csharp grammar, but not for typescript.

kaby76 commented 1 year ago

Check "." is included in the classpath search path.

java --version
alias antlr4
antlr4
alias grun
grun
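
The classpath check suggested above can also be done from inside the JVM. This is a small sketch, assuming it is run with the same -cp value or CLASSPATH that the grun alias uses. ClasspathCheck and includesDir are hypothetical names, not part of ANTLR.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper: reports whether a classpath string contains an
// entry that resolves to a given directory, for example ".".
final class ClasspathCheck {

    static boolean includesDir(String classpath, String separator, Path dir) {
        Path target = dir.toAbsolutePath().normalize();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (entry.isEmpty()) {
                continue; // this sketch only considers explicit entries
            }
            try {
                if (Paths.get(entry).toAbsolutePath().normalize().equals(target)) {
                    return true;
                }
            } catch (InvalidPathException e) {
                // Entries such as "lib/*" are not valid paths on every platform.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        String separator = System.getProperty("path.separator");
        System.out.println("java.class.path = " + classpath);
        System.out.println("current directory included: "
                + includesDir(classpath, separator, Paths.get(".")));
    }
}
```

If the second line prints false, classes compiled into the current directory (such as TypeScriptLexer.class) will not be found by name.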

abulka commented 1 year ago

Hmm, seems to work this morning - sorry about that.

I actually want to expose the parser via an API in a Spring Boot web server. Presumably I can just incorporate target/typescript-1.0-SNAPSHOT.jar and drive the parsing by calling something inside that jar? Are there any examples of this or documentation, or do I have to buy the ANTLR book?

kaby76 commented 1 year ago

The .jar file contains the compiled, generated parser and lexer code, but no driver. You shouldn't need the book to write a driver for your server if you are not doing much with the grammar itself. An example driver is on Tomassetti's website.

abulka commented 1 year ago

Thanks - I seem to be able to use the typescript jar file with the following driving code I nutted out. In a new directory create src/java/Main.java

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.ParserRuleContext;

public class Main {
    private static String readFile(File file) throws IOException {
        byte[] encoded = Files.readAllBytes(file.toPath());
        return new String(encoded, StandardCharsets.UTF_8);
    }

    public static ParserRuleContext parse(File file) throws IOException {
        String code = readFile(file);
        TypeScriptLexer lexer = new TypeScriptLexer(new ANTLRInputStream(code));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        tokens.fill();
        TypeScriptParser parser = new TypeScriptParser(tokens);
        parser.setTokenStream(tokens);
        parser.setTrace(false);
        ParserRuleContext tree = parser.program(); // invoke the entry point of our grammar
        System.out.println(tree.toStringTree(parser)); // print LISP-style tree
        return tree;
    }

    public static void main(String[] args) throws Exception {
        File file = new File("/Volumes/SSD/Data/Devel/grammars-v4/javascript/typescript/examples/Class.ts");
        ParserRuleContext tree = parse(file);
        System.out.println("parsing complete");
    }
}

Which I can run OK with

javac -d out src/java/Main.java -cp "/usr/local/lib/antlr-4.12.0-complete.jar:/Volumes/SSD/Data/Devel/grammars-v4/javascript/typescript/target/typescript-1.0-SNAPSHOT.jar" -Xlint:deprecation

java -cp "out:/usr/local/lib/antlr-4.12.0-complete.jar:/Volumes/SSD/Data/Devel/grammars-v4/javascript/typescript/target/typescript-1.0-SNAPSHOT.jar" Main

A couple of issues though:

  1. Is there a way to avoid the warning "ANTLRInputStream in org.antlr.v4.runtime has been deprecated"?
  2. If I move Main.java into a named package, e.g. src/java/org/example/Main.java, and add package org.example; to the top of Main.java, I can no longer access the Lexer etc. classes in the jar:
javac -d out src/java/org/example/Main.java -cp "/usr/local/lib/antlr-4.12.0-complete.jar:/Volumes/SSD/Data/Devel/grammars-v4/javascript/typescript/target/typescript-1.0-SNAPSHOT.jar" -Xlint:deprecation

src/java/org/example/Main.java:19: error: cannot find symbol
        TypeScriptLexer lexer = new TypeScriptLexer(new ANTLRInputStream(code));
        ^
  symbol:   class TypeScriptLexer
  location: class Main
src/java/org/example/Main.java:19: error: cannot find symbol
        TypeScriptLexer lexer = new TypeScriptLexer(new ANTLRInputStream(code));
                                    ^
  symbol:   class TypeScriptLexer
  location: class Main

        TypeScriptLexer lexer = new TypeScriptLexer(new ANTLRInputStream(code));
                                                        ^
src/java/org/example/Main.java:22: error: cannot find symbol
        TypeScriptParser parser = new TypeScriptParser(tokens);
etc

The generated typescript-1.0-SNAPSHOT.jar contains:

META-INF/
META-INF/MANIFEST.MF
META-INF/maven/
META-INF/maven/org.antlr.grammars/
META-INF/maven/org.antlr.grammars/typescript/
TypeScriptParser$SourceElementContext.class
TypeScriptParser.class
TypeScriptLexer.tokens
etc.

I'm not a Java expert, but after quite a bit of digging it seems that Java doesn't support this: classes in named packages (e.g. org.example.Main) cannot access classes in the default package (the root of the jar). See my Stack Overflow question.

If this Java 'limitation' is real (and clearly I could be missing something), it seems strange to me that a third-party library would build a jar file like this, putting all the useful classes in the root of the jar and thereby making them inaccessible to Java files in named packages.

abulka commented 1 year ago

Turns out it is true: Java files in named packages cannot access classes in the unnamed (default) package, whether those classes are in your project or in a jar, because a type in the unnamed package cannot be imported. Thus the classes in target/typescript-1.0-SNAPSHOT.jar are inaccessible to Java files in named packages. They are only accessible if your Java file is itself in the default package of your project, i.e. at the root of the source tree.
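
To see which classes in a jar are affected, one option is to list the class entries that sit at the jar root, since those are the ones in the unnamed package. The sketch below uses only the JDK's java.util.jar API. DefaultPackageClasses and its methods are hypothetical names, and the jar path is passed as an argument rather than assumed.

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper: lists the classes in a jar that live in the
// unnamed (default) package, i.e. class entries at the jar root.
final class DefaultPackageClasses {

    // A ".class" entry with no '/' in its name has no package directory.
    static boolean isDefaultPackageClass(String entryName) {
        return entryName.endsWith(".class") && entryName.indexOf('/') < 0;
    }

    // Returns the sorted names of all root-level class entries in the jar.
    static List<String> list(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                    .map(JarEntry::getName)
                    .filter(DefaultPackageClasses::isDefaultPackageClass)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DefaultPackageClasses <path-to-jar>");
            return;
        }
        list(args[0]).forEach(System.out::println);
    }
}
```

Run against target/typescript-1.0-SNAPSHOT.jar, this would be expected to print entries such as TypeScriptParser.class and skip everything under META-INF/.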

Luckily I managed to manipulate the resulting jar using the Maven Shade Plugin, and documented the process here.

To move the classes in target/typescript-1.0-SNAPSHOT.jar from the root (unnamed, default) package into a named package (e.g. org.example), add this fragment to your javascript/typescript/pom.xml inside the <project>/<build>/<plugins> element:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.4.1</version>
    <configuration>
        <relocations>
            <relocation>
                <pattern></pattern>
                <shadedPattern>org.example.</shadedPattern>
                <includes>
                    <include>TypeScriptParser*</include>
                    <include>TypeScriptLexer*</include>
                </includes>
            </relocation>
        </relocations>
    </configuration>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
        </execution>
    </executions>
</plugin>

To match the classes in the root of the jar, I used an empty <pattern></pattern>, though there may be a better way. Also, in my case all the root classes started with TypeScriptParser or TypeScriptLexer, so I included those explicitly.

Running this on my build with mvn package did introduce extra classes from my project into the jar that I didn't need, so I had to delete them manually (by unzipping/zipping using jar).