amplify-education / python-hcl2

MIT License

v4.3.1 regression #133

Closed sidekick-eimantas closed 1 year ago

sidekick-eimantas commented 1 year ago

Hi

In v4.3.1 we started seeing parser failures on one of our files. We reduced the failing case to this:

locals {
  terraform = {
    channels = local.running_in_ci ? local.ci_channels : local.local_channels
    authentication = []
  }
}

(.venv) Eimantas@Eimantas-Gecass-MacBook-Pro skm-cli % python -c "import hcl2, pathlib; hcl2.loads(pathlib.Path('/Users/Eimantas/git/sidekick-money/skm-cli/examples/terraform/modules/terraform-context/terraform-config.tf').read_text())"
Traceback (most recent call last):
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 126, in feed_token
    action, arg = states[state][token.type]
KeyError: '__ANON_3'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/hcl2/api.py", line 27, in loads
    tree = hcl2.parse(text + "\n")
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/lark.py", line 645, in parse
    return self.parser.parse(text, start=start, on_error=on_error)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parser_frontends.py", line 96, in parse
    return self.parser.parse(stream, chosen_start, **kw)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 41, in parse
    return self.parser.parse(lexer, start)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 171, in parse
    return self.parse_from_state(parser_state)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 193, in parse_from_state
    raise e
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 184, in parse_from_state
    state.feed_token(token)
  File "/Users/Eimantas/git/sidekick-money/skm-cli/.venv/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 129, in feed_token
    raise UnexpectedToken(token, expected, state=self, interactive_parser=None)
lark.exceptions.UnexpectedToken: Unexpected token Token('__ANON_3', 'authentication') at line 4, column 5.
Expected one of: 
    * __ANON_8
    * PLUS
    * __ANON_9
    * PERCENT
    * MORETHAN
    * STAR
    * QMARK
    * LESSTHAN
    * __ANON_6
    * __ANON_7
    * __ANON_1
    * SLASH
    * __ANON_4
    * RBRACE
    * __ANON_2
    * __ANON_5
    * COMMA
    * __ANON_0
    * MINUS

Last working version was v4.3.0

Thanks
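
For anyone who wants to check their own environment, here is a minimal sketch of a standalone reproduction. It embeds the reduced case from the report instead of reading a local file, and it only uses `hcl2.loads`, which appears in the traceback above. The helper names `installed_version` and `try_parse` are made up for this example, and the distribution name `python-hcl2` is assumed to match what is installed.

```python
from __future__ import annotations

from importlib import metadata

# Reduced case from the report above. The parser error points at the
# attribute that follows the ternary expression.
REPRO_HCL = """\
locals {
  terraform = {
    channels = local.running_in_ci ? local.ci_channels : local.local_channels
    authentication = []
  }
}
"""


def installed_version(dist_name: str = "python-hcl2") -> str:
    """Return the installed distribution version, or 'not installed'."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return "not installed"


def try_parse(text: str) -> tuple[bool, str]:
    """Attempt hcl2.loads and return (ok, detail) instead of raising."""
    try:
        import hcl2  # imported lazily so this helper loads without the package
    except ImportError as exc:
        return False, f"hcl2 is not importable: {exc}"
    try:
        hcl2.loads(text)
    except Exception as exc:  # the report above shows lark's UnexpectedToken
        message = str(exc)
        first_line = message.splitlines()[0] if message else ""
        return False, f"{type(exc).__name__}: {first_line}"
    return True, "parsed"


if __name__ == "__main__":
    ok, detail = try_parse(REPRO_HCL)
    status = "OK" if ok else "FAILED"
    print(f"python-hcl2 {installed_version()}: {status} ({detail})")
```

Running it under different installed versions should make it easy to compare behaviour, for example v4.3.0 against v4.3.1.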

sodul commented 1 year ago

I can confirm the same issue on our side.

jqcorreia commented 1 year ago

Yes, I can confirm this as well. 4.3.0 correctly parsed a very large and heterogeneous set of Terraform projects, and specific cases started failing after updating to 4.3.1.

christokur commented 1 year ago

I'm experiencing the same when parsing main.tf and locals.tf of https://github.com/aws-ia/terraform-aws-eks-blueprints

ghost commented 1 year ago

Experiencing the same issue here. In case someone doesn't want to revert to a version older than 4.3.1, another way to get around it is to wrap the ternary operation in a string interpolation, "${some_boolean ? opt_1 : opt2}", provided you intend to get a string out of it.

Another solution is moving the problematic line to the very bottom of the block it's in.

IButskhrikidze commented 1 year ago

Also, it works if the ternary operation is wrapped in parentheses, like this:

locals {
  terraform = {
    channels = (local.running_in_ci ? local.ci_channels : local.local_channels)
    authentication = []
  }
}
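
If editing many files by hand is impractical, the parenthesis workaround could be applied to the text before it is handed to the parser. The following is a rough sketch only: `parenthesize_ternaries` is a hypothetical helper, it uses a line-based heuristic rather than real HCL parsing, and it only handles single-line attribute assignments. Lines with comments, string values, or values that already start and end with parentheses are left untouched.

```python
import re

# Matches single-line attribute assignments such as `name = cond ? a : b`.
_ASSIGNMENT = re.compile(
    r"^(?P<lhs>\s*[A-Za-z_][\w-]*\s*=\s*)(?P<rhs>\S.*?)\s*$"
)


def parenthesize_ternaries(hcl_text: str) -> str:
    """Wrap bare single-line ternary values in parentheses (heuristic)."""
    out = []
    for line in hcl_text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match:
            rhs = match.group("rhs")
            # A "?" followed later by ":" is treated as a ternary.
            looks_ternary = "?" in rhs and ":" in rhs.split("?", 1)[1]
            # Crude check: a value like `(a) ? b : (c)` is also skipped.
            already_wrapped = rhs.startswith("(") and rhs.endswith(")")
            # Skip strings and trailing comments to avoid corrupting them.
            risky = rhs.startswith('"') or "#" in rhs or "//" in rhs
            if looks_ternary and not already_wrapped and not risky:
                line = f"{match.group('lhs')}({rhs})"
        out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if hcl_text.endswith("\n") else "")


if __name__ == "__main__":
    sample = (
        "locals {\n"
        "  terraform = {\n"
        "    channels = local.running_in_ci ? local.ci_channels : local.local_channels\n"
        "    authentication = []\n"
        "  }\n"
        "}\n"
    )
    print(parenthesize_ternaries(sample))
```

The result can then be passed to `hcl2.loads`. This only targets the ternary case described in this comment; it does nothing for other expressions.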

guyb1997 commented 1 year ago

It's not a regression in the grammar itself, but the likelihood of a conditional (ternary) being matched was reduced. Changing conditional from:

conditional : expression "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression new_line_or_comment?

to:

conditional : expression "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression

Will solve the issue.

Another option is to use the Earley parser (instead of LALR), which doesn't have those issues. You can play around with it here (try switching the parser).

guyb1997 commented 1 year ago

Btw, I'm playing around with grammar improvements here, and I also added support for choosing the Earley parser. If anyone is willing to review and add the improvements, I'm in.

sodul commented 1 year ago

For reference I'm still getting lark.exceptions.UnexpectedToken: Unexpected token Token('__ANON_3', 'DD_SITE') at line 159, column 5. with 4.3.2.

ascopes commented 1 year ago

Can also confirm we can replicate this on 4.3.2.

Small reproduction:

# foo.tf
module "foobar" {
  attributes = {
    do_ray_me_far = var.foobar_thing > 1024.0
    blah_blah     = var.foobar_thing / 1024.0
  }
}

Unexpected token Token('__ANON_3', 'blah_blah') at line 4, column 5.
 | Expected one of:
 |  * __ANON_0
 |  * __ANON_7
 |  * __ANON_9
 |  * STAR
 |  * __ANON_6
 |  * PLUS
 |  * PERCENT
 |  * MINUS
 |  * __ANON_2
 |  * COMMA
 |  * MORETHAN
 |  * __ANON_4
 |  * __ANON_8
 |  * __ANON_5
 |  * QMARK
 |  * SLASH
 |  * LESSTHAN
 |  * RBRACE
 |  * __ANON_1
 |
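
In both reproductions in this thread, the reported position is the name of the next attribute, while the expression that seems to trigger the failure sits on the line before it. When triaging large files, a small helper that prints both lines can save time. This is a sketch that parses the message text shown above with a regular expression; `locate` and `show_context` are hypothetical names, and the caret alignment assumes spaces rather than tabs.

```python
import re
from typing import Optional, Tuple

# Matches the "at line N, column M" fragment seen in the messages above.
_LOCATION = re.compile(r"at line (\d+), column (\d+)")


def locate(message: str) -> Optional[Tuple[int, int]]:
    """Extract (line, column) from an error message, or None if absent."""
    match = _LOCATION.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def show_context(source: str, message: str) -> str:
    """Return the reported line, the line before it, and a caret marker."""
    position = locate(message)
    if position is None:
        return "no location found in message"
    line_no, column = position
    lines = source.splitlines()
    if not 1 <= line_no <= len(lines):
        return f"line {line_no} is outside the source ({len(lines)} lines)"
    width = len(str(line_no))
    out = []
    if line_no > 1:
        # The preceding line usually holds the expression of interest.
        out.append(f"{line_no - 1:>{width}}: {lines[line_no - 2]}")
    out.append(f"{line_no:>{width}}: {lines[line_no - 1]}")
    # Offset the caret by the "N: " prefix plus the 1-based column.
    out.append(" " * (width + 2 + column - 1) + "^")
    return "\n".join(out)


if __name__ == "__main__":
    source = (
        'module "foobar" {\n'
        "  attributes = {\n"
        "    do_ray_me_far = var.foobar_thing > 1024.0\n"
        "    blah_blah     = var.foobar_thing / 1024.0\n"
        "  }\n"
        "}\n"
    )
    message = "Unexpected token Token('__ANON_3', 'blah_blah') at line 4, column 5."
    print(show_context(source, message))
```

The example source omits the `# foo.tf` label line, since the reported line 4 only matches `blah_blah` when the file starts at the `module` line.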

ascopes commented 1 year ago

@IButskhrikidze please can we re-open this issue?

fergoid commented 1 year ago

This is still occurring on both v4.3.1 and v4.3.2. Can we reopen this issue? It fails in my organisation on many different tf files, and I just recreated it on my personal laptop with the code block from @ascopes.