atom / first-mate

TextMate helpers
http://atom.github.io/first-mate
MIT License

Grammar pattern matching ignoring 'end' rule #52

Open garborg opened 9 years ago

garborg commented 9 years ago

A line matches the 'begin' pattern of two rules. It doesn't match the 'end' pattern for the first rule and does match the 'end' pattern for the second rule. I'd expect the second rule to win, but the first rule does.

Example at end.

Is this expected, and if so, any ideas on workarounds given that the contents between 'begin' and 'end' must match '$source' (or a large subset thereof)?

test-spec.coffee

describe "Test grammar", ->
  grammar = null
  beforeEach ->
    waitsForPromise ->
      atom.packages.activatePackage("language-test")
    runs ->
      grammar = atom.grammars.grammarForScopeName("source.test")

  it "parses the grammar", ->
    expect(grammar).toBeDefined()
    expect(grammar.scopeName).toBe "source.test"

  # this fails
  it "does what I want", ->
    {tokens} = grammar.tokenizeLine("f(x)yy")
    console.log(tokens)
    expect(tokens[0]).toEqual value: "f", scopes: ["source.test", "fcall.test", "support.function.test"]
    expect(tokens[1]).toEqual value: "(", scopes: ["source.test", "fcall.test", "paren.open.call.test"]
    expect(tokens[2]).toEqual value: "x", scopes: ["source.test", "fcall.test"]
    expect(tokens[3]).toEqual value: ")", scopes: ["source.test", "fcall.test", "paren.close.call.test"]
    expect(tokens[4]).toEqual value: "yy", scopes: ["source.test"]

  # this passes
  it "does what I don't want", ->
    {tokens} = grammar.tokenizeLine("f(x)yy")
    console.log(tokens)
    expect(tokens[0]).toEqual value: "f", scopes: ["source.test", "fdecl.test", "entity.name.function.test"]
    expect(tokens[1]).toEqual value: "(", scopes: ["source.test", "fdecl.test", "paren.open.decl.test"]
    expect(tokens[2]).toEqual value: "x)yy", scopes: ["source.test", "fdecl.test"]

test.cson

fileTypes: [
  "tst"
]
name: "Test"
patterns: [
  {
    include: "#function_decl"
  }
  {
    include: "#function_call"
  }
]
repository:
  function_call:
    begin: "([[:alpha:]_][[:word:]!]*)(\\()"
    beginCaptures:
      "1":
        name: "support.function.test"
      "2":
        name: "paren.open.call.test"
    end: "(\\))"
    endCaptures:
      "1":
        name: "paren.close.call.test"
    patterns: [
      {
        include: "$self"
      }
    ]
    name: "fcall.test"
  function_decl:
    begin: "([[:alpha:]_][[:word:]!]*)(\\()"
    beginCaptures:
      "1":
        name: "entity.name.function.test"
      "2":
        name: "paren.open.decl.test"
    end: "(\\))(\\s*=)"
    endCaptures:
      "1":
        name: "paren.close.decl"
      "2":
        name: "keyword.operator.update.test"
    patterns: [
      {
        include: "$self"
      }
    ]
    name: "fdecl.test"
scopeName: "source.test"

garborg commented 9 years ago

To reproduce: clone https://github.com/garborg/atom-language-test, run apm link, open it with atom ., and press cmd+alt+ctrl+p to run the specs. That replicates it for me. Thanks for looking into this. As things stand, I seem to be left with some ugly, brittle workarounds, so I'd really like to know whether it should work as I had expected.

garborg commented 9 years ago

Apologies if you cloned my repo and had trouble testing. I have now pushed the package.json, so apm link gets the package name right.

garborg commented 9 years ago

In case the previous example isn't simple enough for a quick response, here's a minimal example (the boots branch of https://github.com/garborg/atom-language-test):

test.cson

#...
patterns: [
  {
    include: "#p1"
  }
  {
    include: "#p2"
  }
]
repository:
  p1:
    begin: "cat"
    end: "hat"
    name: "p1.hat.test"
  p2:
    begin: "cat"
    end: "boots"
    name: "p2.boots.test"

test-spec.coffee

# ...
  it "passes the baseline", -> # passes
    {tokens} = grammar.tokenizeLine("catinhat")
    console.log(tokens)
    expect(tokens[0]).toEqual value: "cat", scopes: ["source.test", "p1.hat.test"]
    expect(tokens[1]).toEqual value: "in", scopes: ["source.test", "p1.hat.test"]
    expect(tokens[2]).toEqual value: "hat", scopes: ["source.test", "p1.hat.test"]

  it "understands 'inboots' doesn't contain 'hat' but contains 'boots'", -> # fails
    {tokens} = grammar.tokenizeLine("catinboots")
    console.log(tokens)
    expect(tokens[0]).toEqual value: "cat", scopes: ["source.test", "p2.boots.test"]
    expect(tokens[1]).toEqual value: "in", scopes: ["source.test", "p2.boots.test"]
    expect(tokens[2]).toEqual value: "boots", scopes: ["source.test", "p2.boots.test"]

garborg commented 9 years ago

Context: This came up while improving the grammar for Julia, where the contents between 'begin' and 'end' may be nearly arbitrary Julia.

Without a way around this, I can't see how to consistently differentiate between, say, a function call (f([...])) and a one-line function definition (f([...]) = [...]), or between a parameterized type (S{[...]}), constructing a parameterized type (S{[...]}([...])), and a one-line, parameterized method definition (f{[...]}([...]) = [...]).

Is ignoring 'end' like this expected?

Thanks.

winstliu commented 9 years ago

Reproduced. Also, as expected, if you move p2 to become the first pattern, then test 1 will fail and test 2 will pass.
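
The order dependence described above can be illustrated with a toy model. This is a minimal sketch and not first-mate's actual implementation: `selectRule` and the `rules` list (which uses plain JavaScript regexes rather than Oniguruma patterns) are hypothetical names introduced here. The model assumes the tokenizer picks the rule whose 'begin' matches earliest on the line, with ties going to the rule listed first, and never consults 'end' when choosing.

```coffee
# Hypothetical, simplified model of begin/end rule selection.
# It mirrors the p1/p2 example from this thread, not first-mate's real code.
rules = [
  {name: "p1.hat.test", begin: /cat/, end: /hat/}
  {name: "p2.boots.test", begin: /cat/, end: /boots/}
]

# Pick the rule whose 'begin' matches earliest; on a tie, the rule listed
# first is kept. The 'end' pattern plays no part in the choice.
selectRule = (rules, line) ->
  best = null
  for rule in rules
    match = rule.begin.exec(line)
    continue unless match?
    best = {rule, index: match.index} if not best? or match.index < best.index
  best?.rule

console.log selectRule(rules, "catinboots").name                   # p1.hat.test
console.log selectRule(rules.slice().reverse(), "catinboots").name # p2.boots.test
```

Under this model, p1 wins for "catinboots" even though "hat" never appears, and reversing the rule order flips the result, which matches the behavior of the two specs.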

garborg commented 9 years ago

Is there anything I can do to help this along?

I don't spend too much of my time in JS, but if someone pointed me to the right place to start, I'd be happy to take a look, with the caveat that if finding a fix meant stealing more developer time, I'd throw it back rather than creating extra work for everyone.

garborg commented 9 years ago

Are any collaborators on this project planning to dig into this? Thanks, Sean

sglyon commented 9 years ago

Ping. I could also take a look and work under the same conditions that @garborg laid out. I just don't know where to look.

Ingramz commented 9 years ago

https://manual.macromates.com/en/language_grammars#language_rules

Matching only the first rule is by design.
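
Since 'end' is not consulted when a rule is chosen, one possible workaround is to move the distinguishing check into 'begin' as a lookahead. The fragment below is a sketch under stated assumptions, not a tested fix: it assumes the closing parenthesis and the '=' sit on the same line as the opening parenthesis, and that the argument list contains no nested parentheses. The lookahead is an addition for illustration; everything else comes from the function_decl rule in the original report.

```coffee
repository:
  function_decl:
    # Assumption: ')' followed by '=' appears later on the same line, with no
    # nested parentheses in between. The lookahead makes 'begin' fail for a
    # plain call such as "f(x)yy", so function_call should be tried instead.
    begin: "([[:alpha:]_][[:word:]!]*)(\\()(?=[^)]*\\)\\s*=)"
    end: "(\\))(\\s*=)"
    name: "fdecl.test"
    # beginCaptures, endCaptures, and patterns stay as in the original rule.
```

This stays brittle in the ways already noted in this thread: a definition such as f(g(x)) = 1 is not recognized because of the nested parentheses, a comparison such as f(x) == 1 is picked up as a definition, and definitions that span several lines are missed because the lookahead only sees the current line.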

garborg commented 9 years ago

Thanks, @Ingramz. I suppose this is a dead end then. It will unfortunately keep us from having reliable highlighting, etc., unless/until a more powerful grammar is supported in Atom.

Here's what I found on that topic: atom/first-mate#50 atom/atom#8669 atom/first-mate#34

Perhaps close this once someone in the know links any other relevant discussions?