mogulargmbh / m-formatter

23 stars 3 forks source link

Error in column definition #15

Closed SharmaAntriksh closed 1 year ago

SharmaAntriksh commented 1 year ago

If you define column names as number in type table [] argument then the formatter considers that as an error whereas Power Query doesn't.

Code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("Pcy5DcAwDEPRXVS7iZ1MY6jIfcf7dxYVgwWLBwI/Z+kk+EbR8CvaJirZZqq3LdRgW12xVTYK2ylUDgqVk0LlcqVWuSn8D4W9FCofhUoR1Qo=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
    GroupedRows = 
        Table.Group ( 
            Source, 
            {"Column1"}, 
            {
                { 
                    "Transform", 
                    each 
                        Table.PromoteHeaders ( 
                            Table.Transpose ( 
                                _[[Column2], [Column3]] 
                            ) 
                        ), 
                        type table [1 = text, 2 = text, 3 = text, 4 = text, 5 = text] 
                }
            }
        ),
    ExpandedCount = 
        Table.ExpandTableColumn ( 
            GroupedRows, 
            "Transform", 
            {"1", "2", "3", "4", "5"}, 
            {"1", "2", "3", "4", "5"}
        )
in
    ExpandedCount

Not a burning issue or something that can't be avoided, but just thought of sharing.

UliPlabst commented 1 year ago

Hi, thank you for reporting this issue. It has been quite some time since I updated the powerquery-parser version and I suspect this is the issue. It has been on my list for some time to make a new maintenance release.
Hopefully I have some time tomorrow to investigate. I will report back as soon as the fix is deployed.

UliPlabst commented 1 year ago

I updated the parser now but this did not fix the issue so I did a little digging in the powerquery language specification. Here are the relevant grammar parts:

table-type:
      "table" row-type
row-type:
      "[" field-specification-list? "]"
field-specification-list:
      field-specification
      field-specification "," field-specification-list
field-specification:
      optional? field-name field-type-specification?
field-type-specification:       //this branch is not relevant
      "=" field-type
field-name:
      generalized-identifier
      quoted-identifier
generalized-identifier:
      generalized-identifier-part
      generalized-identifier separated only by blanks (U+0020) generalized-identifier-part
generalized-identifier-part:
      generalized-identifier-segment
      decimal-digit-character generalized-identifier-segment
generalized-identifier-segment:
      keyword-or-identifier
      keyword-or-identifier dot-character keyword-or-identifier
keyword-or-identifier:
      letter-character
      underscore-character
      identifier-start-character identifier-part-characters
letter-character:
      A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl
identifier-start-character:
      letter-character
      underscore-character
decimal-digit-character:                  //these are digits and would apply to your number as identifier
      A Unicode character of the class Nd

so as I understand it, a generalized-identifier may start with a decimal-digit-character but it must have at least one generalized-identifier-segment after it. So according to the language specification the code that you wrote is not valid. Unfortunately the implementation of the parser in Power-BI differs from the specification. I had several issues of this kind already where code was parsed in Power-BI but was not valid according to spec.

Solution

However to solve your problem, we can use the other branch of generalized-identifier-segment: the quote-identifier:

quoted-identifier:
      #" text-literal-charactersopt "

it's used to define complex identifiers that do not parse as a normal identifier (for example you can have spaces in there etc.). It looks like this (note the #"1")

 let
  Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("Pcy5DcAwDEPRXVS7iZ1MY6jIfcf7dxYVgwWLBwI/Z+kk+EbR8CvaJirZZqq3LdRgW12xVTYK2ylUDgqVk0LlcqVWuSn8D4W9FCofhUoR1Qo=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
  GroupedRows =
      Table.Group (
          Source,
          {"Column1"},
          {
              {
                  "Transform",
                  each
                      Table.PromoteHeaders (
                          Table.Transpose (
                              _[[Column2], [Column3]]
                          )
                      ),
                      type table [#"1" = text, #"2" = text, #"3" = text, #"4" = text, #"5" = text]
              }
          }
      ),
  ExpandedCount =
      Table.ExpandTableColumn (
          GroupedRows,
          "Transform",
          {"1", "2", "3", "4", "5"},
          {"1", "2", "3", "4", "5"}
      )
in
    ExpandedCount

P.S.: 1b should parse as an identifier, but it does not, this is a bug in @microsoft/powerquery-parser i suspect.