jeff-hykin / better-cpp-syntax

💾 The source of VS Code's C++ syntax highlighting
GNU General Public License v3.0

Publish a textmate gem #290

Open jeff-hykin opened 5 years ago

jeff-hykin commented 5 years ago

I'm creating this as a checklist for publishing the gem

matter123 commented 5 years ago

I am going to make it a Hash storing the seed, and use the line number of the call to determine if this is a different grammar in the same file, and if so, complain.

This allows, if needed, for load to be used, while still limiting it to one export/partial Grammar per file.
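A minimal sketch of that check, assuming a hypothetical `GrammarRegistry` class (the name, its methods, and the error class are made up for illustration and are not part of the gem). It keeps one entry per file, so re-running the same call site through `load` reuses the stored seed, while a call from a different line in the same file is rejected.

```ruby
# Hypothetical sketch: one export/partial grammar per file, keyed by file path.
class GrammarRegistry
  class DuplicateGrammarError < StandardError; end

  def initialize
    # path => { seed: ..., lineno: ... }
    @by_file = {}
  end

  # Records the grammar created at path:lineno and returns the seed in use.
  # The same call site (e.g. the file being loaded again) gets the stored seed back;
  # a different line in the same file counts as a second grammar and raises.
  def register(path, lineno, seed)
    entry = @by_file[path]
    if entry.nil?
      @by_file[path] = { seed: seed, lineno: lineno }
      return seed
    end
    unless entry[:lineno] == lineno
      raise DuplicateGrammarError,
            "#{path}:#{lineno}: a grammar was already created at line #{entry[:lineno]}"
    end
    entry[:seed]
  end

  # Convenience wrapper that reads the call site from the Ruby call stack.
  def register_from_caller(seed)
    location = caller_locations(1, 1).first
    register(location.absolute_path || location.path, location.lineno, seed)
  end
end
```

Passing the path and line number explicitly keeps the check easy to test; `register_from_caller` is the variant a constructor would presumably call.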

jeff-hykin commented 5 years ago

Okay, sounds good to me.

jeff-hykin commented 5 years ago

It'll probably be Saturday before I make changes, but I think I'm going to pull out the other languages and have this repo only contain C++ and the TextMate tools. If we publish a gem, it is going to be a pain to test the library because it will need to be published, then pulled into C++, then tested. That or some more complicated workaround is going to be needed.

So I'll just leave the other languages frozen with an old version of the TextMate tools. That way this repo can make updates to the library and C++ until the library is pretty stable. Since I've already missed the deadline of school starting for me, I'm fine to focus more on C++, and hopefully get the library stable by Christmas. I think doxygen and a bailout for the macro conditionals are the next major changes

matter123 commented 5 years ago

So this is a rough outline of how I imagined internal bailout would work.

Interface

grammar.add_bailout(
  name: "macro",
  pattern: lookBehindToAvoid("\\").then("\n"),
  # or end_pattern, while_pattern if separate patterns are required
  priority: 1 # optional defaults to zero
)

When save_to is called, each bailout (sorted by priority, then by declaration order):

  1. Makes a copy of each PatternRange.
  2. Adds those copies, plus references to any match patterns, to "#{bailout_name}_initial_context".to_sym.
  3. Rewrites includes to refer to the bailed-out patterns.

The grammar should also calculate what entries need to be made bailout safe so that it doesn't have to be recalculated if there are multiple bailouts.
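The steps above can be sketched roughly as follows, using plain hashes instead of the real pattern classes. Everything here is an assumption for illustration: the `Bailout` struct, both method names, the `:begin`/`:end`/`:includes` keys, the choice to process lower priorities first, and the idea that a bailed-out range ends on either its own end or the bailout pattern.

```ruby
# Hypothetical sketch; none of these names are gem API.
Bailout = Struct.new(:name, :pattern, :priority)

# Orders bailouts by priority, breaking ties by declaration order.
# Assumption: lower priority values are processed first.
def ordered_bailouts(bailouts)
  bailouts.each_with_index
          .sort_by { |bailout, index| [bailout.priority, index] }
          .map(&:first)
end

# repository: { name => pattern hash }. An entry with a :begin key is treated
# as a pattern range; everything else is treated as a match pattern.
def expand_bailouts(repository, bailouts)
  # Computed once and reused for every bailout.
  range_names = repository.select { |_name, pattern| pattern.key?(:begin) }.keys
  match_names = repository.keys - range_names
  result = repository.dup

  ordered_bailouts(bailouts).each do |bailout|
    copy_name = ->(name) { :"#{bailout.name}_#{name}" }

    # Steps 1 and 3: copy each range, end it early on the bailout pattern,
    # and point its includes at the bailed-out copies.
    range_names.each do |name|
      original = repository[name]
      result[copy_name.call(name)] = original.merge(
        end: "(?:#{original[:end]})|(?:#{bailout.pattern})",
        includes: (original[:includes] || []).map do |inc|
          range_names.include?(inc) ? copy_name.call(inc) : inc
        end
      )
    end

    # Step 2: the bailout's initial context lists the copies plus the
    # untouched match patterns.
    result[:"#{bailout.name}_initial_context"] = {
      includes: range_names.map(&copy_name) + match_names
    }
  end
  result
end
```

Match patterns are referenced rather than copied because they cannot span past the bailout point, which is the space saving the later comment about a shared match context builds on.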

matter123 commented 5 years ago

Alternatively, any match patterns that are added to :$initial_context could instead be added to match_initial_context and then any initial_contexts could then just include that to save space.

jeff-hykin commented 5 years ago

I was thinking the bailout would return a hash (a textmate pattern) that utilizes the "repository": key to prevent name collisions. I haven't fully tested the "repository": key though. Something like:

grammar[:bailed_out_enum_context] = grammar.bailout(
    context: grammar[:enum_context], # an array of patterns
    pattern: lookBehindToAvoid("\\").then("\n"),
)
# bailout recursively goes down the `includes`

# returns something like
{
    repository: {
        # this is the bailed out enum_member, but it should be safe for it to have the same name
        # because it's in a repository
        enum_member: {
            match: /stuff/
        }
    },
    includes: [
        # defaults (hopefully) to the closest definition: aka the bailed out enum_member
        :enum_member,
    ]
}

I don't think priority/order of the bailouts will be an issue

jeff-hykin commented 5 years ago

Also I created repos for shell/dockerfile/perl. Not sure if separate ones should be created for Objc, Objcpp, and C. Anyways, those other languages can be safely deleted now, which should help with compile-time performance.

matter123 commented 5 years ago

That sounds reasonable, but we still want to defer generation so that patterns added after are still bailed out. The fact that repository is not documented anywhere is concerning, however.

matter123 commented 5 years ago

The rewrite branch now has the rewrite in the top-level directory gem.

Edit: all methods and classes in the directory gem/lib have at least some form of documentation (use yardoc to generate/view).

jeff-hykin commented 5 years ago

> all methods and classes in the directory gem/lib have at least some form of documentation

Awesome! yardoc sounds like the way to go.

Using "textmate-grammar" as the gem name works for me.

> so that patterns added after are still bailed out.

I was thinking bailout would be done inline (performed on-call), but that does present problems if it is called before some of the repo-names (e.g. :comment) have been assigned to a pattern. ex:

macro_context = bailout(grammar[:$initial_context])
grammar[:comment] = Pattern.new()

This would make imported code (like preprocessor) less modular since they would need a secondary function to be called near the end of the grammar definition to setup the bailed-out code.

However, it could be difficult to do the bailouts in an automatic way at the end of a grammar export. If bailoutA includes[ :comment, :bailoutB ] and then bailoutB includes[ :preprocessor, :bailoutA] then there would need to be 4 bailouts.

  1. pure bailoutA (the top level one that's included within most things)
  2. pure bailoutB ^
  3. bailoutA + B
  4. bailoutB + A

the grammar uses the pure bailouts, but then all 4 of the bailouts would :include the A+B or B+A versions so that their internal patterns always closed early.

Since this case is pretty complicated, we could have two systems. 1. A manual bailout that gets called inline, leaving the complexity/order up to the user and then 2. A bailout that is a placeholder, which is converted to a real pattern when the grammar is exported to json. However, the bailout placeholder wouldn't be allowed to include (directly or indirectly) other bailout placeholders.
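A rough sketch of the second system, the placeholder that is only converted at export time. All names here (`BailoutPlaceholder`, `reaches_other_placeholder?`, `resolve_placeholders`) and the shape of the resolved hash are hypothetical; the repository is modelled as a plain Hash whose entries list their includes under `:includes`.

```ruby
# Hypothetical sketch; none of these names come from the gem.
class BailoutPlaceholder
  attr_reader :context_name, :pattern

  def initialize(context_name, pattern)
    @context_name = context_name
    @pattern = pattern
  end
end

# True if `name` reaches, directly or indirectly through :includes,
# a placeholder other than the one stored under `own_name`.
def reaches_other_placeholder?(repository, name, own_name, seen = {})
  return false if name == own_name || seen[name]
  seen[name] = true
  entry = repository[name]
  return true if entry.is_a?(BailoutPlaceholder)
  return false unless entry.is_a?(Hash)
  (entry[:includes] || []).any? do |inc|
    reaches_other_placeholder?(repository, inc, own_name, seen)
  end
end

# Export-time pass: every placeholder becomes a real (simplified) pattern hash.
# By this point every name in the repository has been assigned, so the order
# in which the grammar file defined things no longer matters.
def resolve_placeholders(repository)
  repository.map do |name, entry|
    next [name, entry] unless entry.is_a?(BailoutPlaceholder)
    if reaches_other_placeholder?(repository, entry.context_name, name)
      raise ArgumentError, "#{name} includes another bailout placeholder"
    end
    target = repository.fetch(entry.context_name)
    [name, { bailout: entry.pattern, includes: target[:includes] || [] }]
  end.to_h
end
```

With this shape, a placeholder can be created before `:comment` exists, as in the two-line example earlier in this comment, and still pick it up when the grammar is exported.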

> The fact that repository is not documented anywhere

Yeah... gotta love TextMate's documentation 🙃

matter123 commented 5 years ago

My thought was the global bailouts would just be processed in order, so that later bailout patterns (or ones of higher priority) would be able to bail out earlier ones but not vice versa. In effect, 3 is folded into 2 and 4 doesn't exist.

Alternatively, there could just be a single global bailout pattern.

matter123 commented 4 years ago

Thinking some more, I think bailouts can be done entirely outside of the Grammar object (assuming Grammar is enumerable).

add_early_bailout_to_grammar(grammar, bailout_pattern, prefix) would enumerate the repository; any pattern ranges would be duplicated, the bailout pattern added to each copy, and the copy stored as `grammar[(prefix+""+name.to_s).to_sym]`

So right before saving, the generate file could call that.

More consistent with what we already have: Grammar.add_transform(Bailout.new(prefix, pattern))

That performs basically the same steps as textmate_bailout
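For the free-function version, a minimal sketch assuming the grammar can be treated as a plain Hash of name => pattern hash, that a pattern range is any entry with `:begin` and `:end` keys, and that "adding" the bailout pattern means ending the range on either its own end or the bailout. Those are all assumptions for illustration, not how the Grammar class is known to behave.

```ruby
# Hypothetical sketch: `grammar` is a plain Hash, not the gem's Grammar object.
def add_early_bailout_to_grammar(grammar, bailout_pattern, prefix)
  # Snapshot the ranges first: Ruby raises if new keys are added to a Hash
  # while that same Hash is being iterated.
  ranges = grammar.select { |_name, pattern| pattern.key?(:begin) && pattern.key?(:end) }

  ranges.each do |name, pattern|
    copy = pattern.dup
    # The copy closes on its normal end or on the bailout pattern.
    copy[:end] = "(?:#{pattern[:end]})|(?:#{bailout_pattern})"
    grammar[(prefix + name.to_s).to_sym] = copy
  end
  grammar
end
```

Because it only needs enumeration and assignment, the same body could presumably sit inside a transform object and run just before the grammar is saved.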

jeff-hykin commented 4 years ago

@matter123 Yesterday I spent the whole day trying to add TokenPattern as a class that inherited from PlaceholderPattern. TokenPattern was just going to override the resolve method, but doing that caused a complex shared infinite recursion/loop. It was somehow related to __deep_clone__, so I started working on printing and simplifying the cloning logic, but that became pretty difficult as well. I think the bug was related to the deep cloning logic of Placeholder, insert, and __deep_clone_self__, but I never found it.

After I rewrote half of it, I understood the logic of the system and basically just made a change to the resolve_placeholders.rb transform. It expands out the resolve logic (resolve() is basically unused now), and adds the logic for resolving TokenHelpers.

Even though all tests pass, these changes shouldn't be on master so I'll put them on another branch and revert them later today. Once the code is cleaned up, I'll merge it back in.