Extend CREATE/DROP to support IF EXISTS modifiers

caleb531 commented 7 years ago

Some SQL dialects (particularly MySQL) support a very handy syntax for dropping and creating conditionally via the IF NOT EXISTS and IF EXISTS modifiers (respectively).

However, Atom does not currently tokenize these modifiers when used in CREATE and DROP statements:

This PR extends the CREATE and DROP statements to support the IF NOT EXISTS and IF EXISTS modifiers:

You'll notice my changes also have the added benefit of making the CREATE and DROP tokenization a bit more robust (see the CREATE TABLE; statement without arguments, as shown above).

To accompany these changes, I've added three new (passing) unit tests, and all existing tests continue to pass. Hooray for unit tests!

Anyway, thanks for considering this; please let me know if I need to make any further modifications. Caleb

50Wliu commented 7 years ago

A few things. In general, spaces should never be tokenized unless the expression doesn't make sense without the space (such as I guess UNIQUE INDEX). That means if, not, and exists should all be tokenized separately.

Quotes should also get their separate scopes: punctuation.definition.string.begin.sql, string.quoted.<type>.sql, punctuation.definition.string.end.sql.

caleb531 commented 7 years ago

Alright, @50Wliu, I've updated the grammar (and tests accordingly) with your requested changes. Let me know if it looks good to you, or if there's anything more I can do.

50Wliu commented 7 years ago

Upon further inspection, since it seems like language-sql consistently tokenizes the spaces, you can revert those changes. They make more sense in a different PR.

As for the quotes, you're still missing string.quoted.<type>.sql for the actual quoted section. There's two ways to add this:

Add a few or-captures at the end of the existing match that capture either a) no quotes, b) single quotes, or c) double quotes.
Create two new captures that are dedicated for single quotes and double quotes.

caleb531 commented 7 years ago

Alright, I've reverted the individual keyword tokenization as you requested.
I'm thinking it would actually be simpler (and more consistent with the rest of the grammar) not to tokenize the table name at all within CREATE and DROP statements. Other SQL statements like SELECT and DELETE FROM don't seem to tokenize the table name. I don't want to duplicate those the string patterns in the grammar for CREATE and DROP if most other SQL statements in the grammar don't bother tokenizing the table name anyway.

Therefore, I ask if I can remove the table name (along with its quotes) from the patterns for CREATE and DROP, because that would keep the redundancy in the grammar from growing any more than it needs to in this regard. I don't know if I can reuse the existing string definitions in the grammar in any way, but it doesn't seem possible as far as I can tell.

Current:

  {
    'captures':
      '1':
        'name': 'keyword.other.create.sql'
      '2':
        'name': 'keyword.other.sql'
      '4':
        'name': 'keyword.other.DML.sql'
      '6':
        'name': 'entity.name.function.sql'
    'match': '(?i:^\\s*(create(?:\\s+or\\s+replace)?)\\s+(aggregate|conversion|database|domain|function|group|(unique\\s+)?index|language|operator class|operator|rule|schema|sequence|table|tablespace|trigger|type|user|view)(?:\\s+(if\\s+not\\s+exists))?)(?:\\s+([\'"`]?)(\\w+)\\5)?'
    'name': 'meta.create.sql'
  }

Proposed change (would apply to DROP as well):

  {
    'captures':
      '1':
        'name': 'keyword.other.create.sql'
      '2':
        'name': 'keyword.other.sql'
      '4':
        'name': 'keyword.other.DML.sql'
    'match': '(?i:^\\s*(create(?:\\s+or\\s+replace)?)\\s+(aggregate|conversion|database|domain|function|group|(unique\\s+)?index|language|operator class|operator|rule|schema|sequence|table|tablespace|trigger|type|user|view)(?:\\s+(if\\s+not\\s+exists))?)'
    'name': 'meta.create.sql'
  }

caleb531 commented 7 years ago

Pardon me for asking, @50Wliu, but do you have any thoughts on what I have proposed above?

50Wliu commented 7 years ago

Sorry, looks like I missed that comment. In general I'm :-1: on removing syntax highlighting, and that applies here as well. I think in this case I'm going to risk moving forward with the redundancy, but not to add the dedicated string highlighting.

So there's two things left to do:

Fix the second instance of keyword.other.create.sql to read keyword.other.drop.sql.
Something I just noticed is that with your changes this will now be highlighted: CREATE TABLEOHNO (same with drop). ...view)\\s+(?:if\\s+not\\s+exists\\s+)?)(?:([\'"(backtick)]?) looks like it should work. Please add a spec for that as well.

caleb531 commented 7 years ago

Fair enough. Thanks for the prompt response! :)

I swear I already made that change at some point, but it seems not. I've pushed up the tweak now
Great catch, especially since the other SQL keywords seem to un-tokenize if extra word characters are present. However, your suggested code makes the space mandatory, which breaks highlighting for standalone statements such as CREATE TABLE;. No need to fret, though—\\b to the rescue! Here's a screenshot of the updated syntax highlighting (the code has already been pushed—tests included):

50Wliu commented 7 years ago

Thanks!

atom / language-sql

Extend CREATE/DROP to support IF EXISTS modifiers #50