eclipse-archived / ceylon

The Ceylon compiler, language module, and command line tools
http://ceylon-lang.org
Apache License 2.0
395 stars 62 forks source link

Unable to parse grouped type as a type #7413

Open lucaswerkmeister opened 5 years ago

lucaswerkmeister commented 5 years ago

It seems the parser’s type rule can no longer parse direct grouped types. I’m getting a test failure in ceylon.ast, which can be reproduced with this shorter code:

import ceylon.ast.redhat {
    parseType
}

shared void run() {
    print(parseType("<String>"));
}

parseType is just a fairly simple wrapper function:

shared Type? parseType(String code, Anything(JNode, Node) update = noop) {
    if (exists jType = createParser(code).type()) {
        return typeToCeylon(jType, update);
    } else {
        return null;
    }
}

And typeToCeylon can never return null, so it must be that the parser returns null for this code, perhaps because something goes wrong with the lookahead here:

type returns [StaticType type]
    @init { TypeConstructor ct=null; }
    : (typeParameters (TYPE_CONSTRAINT|COMPUTE)) =>
      typeParameters
      { ct = new TypeConstructor(null);
        ct.setTypeParameterList($typeParameters.typeParameterList);
        $type = ct; }
      (
        anonymousTypeConstraints
        { ct.setTypeConstraintList($anonymousTypeConstraints.typeConstraintList); }
      )?
      COMPUTE
      entryType
      { ct.setType($entryType.type); }
    | entryType
      { $type=$entryType.type; }
    ;

Using the entryType or unionType rules, on the other hand, correctly parses the type.

git bisect points to 5c80f1de3ebc4fb82e2bd59d491913384b6aa17f as the culprit.

lucaswerkmeister commented 5 years ago

git bisect points to 5c80f1de3ebc4fb82e2bd59d491913384b6aa17f as the culprit.

Specifically, just this part of the Ceylon.g diff is enough to produce the problem. (Which is a part not directly related to the introduction of the new pipeline operators, as far as I can tell.)

jvasileff commented 5 years ago

Wild guess: is the parser a) not being asked to match a final eof, and b) in this particular scenario just being too lazy to bother to parse your entire document?

gavinking commented 5 years ago

Specifically, just this part of the Ceylon.g diff is enough to produce the problem.

Weird. That bit of the grammar doesn't deal with type expressions at all.

gavinking commented 5 years ago

Wild guess: is the parser a) not being asked to match a final eof, and b) in this particular scenario just being too lazy to bother to parse your entire document?

But if @lucaswerkmeister calls the type() rule directly, I don't see why it would want to look for an EOF.

jvasileff commented 5 years ago

I don't see why it would want to look for an EOF.

Right. My wild guess is that it isn't looking for one, but that a rule TypeWithEof might work.

lucaswerkmeister commented 5 years ago

Well, I already tweak the code for some other grammatical constructs.

shared Condition? parseCondition(String code, Anything(JNode, Node) update = noop) {
    if (exists jCondition = createParser(code + ",").condition()) {
        // the parser needs that comma for some conditions                                                                                                                                                                                                                                                                                                 
        return conditionToCeylon(jCondition, update);
shared RequiredParameter? parseRequiredParameter(String code, Anything(JNode, Node) update = noop) {
    if (exists jParameterDeclarationOrRef = createParser(code + ",").parameterDeclarationOrRef()) {
        /*                                                                                                                                                                                                                                                                                                                                                 
         The parser does some lookahead and seems to need that comma to parse a parameterRef                                                                                                                                                                                                                                                               
         */

I guess I’ll try that here as well.

lucaswerkmeister commented 5 years ago

Yep, adding a comma is enough to make the test pass. And since I already do that in several other places, I guess there’s no real justification for considering this one a bug, except that it happens to be a regression for some bizarre, presumably ANTLR-internal reason (as @gavinking notes, the grammar changes appear unrelated).

Feel free to close the issue unless you want to investigate further…

gavinking commented 5 years ago

@lucaswerkmeister are you sure you're forcing ANTLR to fully-read the token stream?

lucaswerkmeister commented 5 years ago

I’m not familiar enough with ANTLR to be sure of anything :) but here’s my createParser function:

"Creates a [[Ceylon parser|CeylonParser]] that operates on the given [[code]]."
shared CeylonParser createParser(String code)
        => CeylonParser(CommonTokenStream(CeylonLexer(ANTLRStringStream(code + " ")))); // the lexer seems to need the trailing space                                                                                                                                                                                                                      

Do I need to do anything special to close the token stream at the end or something?

gavinking commented 5 years ago

Do I need to do anything special to close the token stream at the end or something?

Hrmph, apparently not. It's strange, I seem to remember something about having to force materialization of the whole token stream, but I don't see anything special in my code.