mchakravarty / CodeEditorView

SwiftUI code editor view for iOS, visionOS, and macOS
Apache License 2.0

Case-insensitive reserved identifiers #112

Closed barnettben closed 2 weeks ago

barnettben commented 1 month ago

I was experimenting with adding a language configuration for SQL, and found that the reserved identifiers are case-sensitive. I would like to be able to specify them as case-insensitive.

I am happy to submit a PR for this if it's something that would be useful, but am not sure of the preferred approach.

Possible options:

Is this something you would be open to changing and if so, would you have a preferred method?

mchakravarty commented 1 month ago

@barnettben Thanks for the suggestion! I am certainly happy to add support for case-insensitive keywords. I would prefer your third option: adding a new property. I think it is worth keeping the array of keywords as strings, as this can be useful for other purposes as well.

I have two small suggestions regarding the interface:

  1. I think caseInsensitiveReservedIdentifiers: Bool (without "supports") sounds better in this case, since the array of keywords is called reservedIdentifiers. (We also have reserved symbols.)
  2. Please add a default (false) for the new property in the initialiser. That way, old code will continue to work.
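
A minimal sketch of the two interface suggestions above. `MiniLanguageConfiguration` and `isReserved` are hypothetical stand-ins invented for illustration; they are not the library's `LanguageConfiguration`, whose initialiser takes many more parameters. Only the property name `caseInsensitiveReservedIdentifiers`, its `false` default, and `reservedIdentifiers` come from the discussion.

```swift
// Hypothetical, simplified stand-in for a language configuration.
struct MiniLanguageConfiguration {
    let name: String
    let reservedIdentifiers: [String]
    let caseInsensitiveReservedIdentifiers: Bool

    // The new property defaults to `false`, so call sites written before
    // the property existed keep compiling and keep their old behaviour.
    init(name: String,
         reservedIdentifiers: [String],
         caseInsensitiveReservedIdentifiers: Bool = false) {
        self.name = name
        self.reservedIdentifiers = reservedIdentifiers
        self.caseInsensitiveReservedIdentifiers = caseInsensitiveReservedIdentifiers
    }

    // Hypothetical helper: is `word` one of the reserved identifiers?
    func isReserved(_ word: String) -> Bool {
        if caseInsensitiveReservedIdentifiers {
            let lowered = word.lowercased()
            return reservedIdentifiers.contains { $0.lowercased() == lowered }
        }
        return reservedIdentifiers.contains(word)
    }
}

// Old-style call site: no new argument, case-sensitive as before.
let swiftLike = MiniLanguageConfiguration(name: "SwiftLike",
                                          reservedIdentifiers: ["let", "var"])

// New-style call site: opts in to case-insensitive keywords.
let sqlLike = MiniLanguageConfiguration(name: "SQLLike",
                                        reservedIdentifiers: ["SELECT", "FROM"],
                                        caseInsensitiveReservedIdentifiers: true)
```

The keyword array stays a plain `[String]`, so it remains usable for other purposes, as suggested in the comment.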

There is also a complication in the implementation of this feature. If you look at https://github.com/mchakravarty/CodeEditorView/blob/31c87f788bf948f158e9e22f0e537796b0566dd9/Sources/LanguageSupport/LanguageConfiguration.swift#L508 you will see that keywords (reserved identifiers) use a singleLexeme argument to TokenDescription. This is important, as it allows for a regular expression with far fewer capture groups and speeds up the tokeniser. I think we can continue to use this optimisation in the special case of the lexemes varying only in case. However, this will require some change in the tokeniser:

Given that this is somewhat subtle,

BTW, if you are also willing to contribute your SQL language configuration, I would gladly add it to the project as well.
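
The general idea behind the optimisation mentioned in this comment can be sketched as follows: rather than giving every keyword its own capture group, recognise a word once and then look it up in a set, normalising case first when keywords are case-insensitive. This is only an illustration under that assumption. `TokenKind` and `classify` are hypothetical names, and the code is not the library's tokeniser and does not use `TokenDescription`.

```swift
// Hypothetical token kinds for this sketch.
enum TokenKind { case keyword, identifier }

// Splits `source` into words and classifies each one with a single set lookup.
func classify(_ source: String,
              keywords: [String],
              caseInsensitive: Bool) -> [(String, TokenKind)] {
    // When keywords differ only in case, one lowercased entry covers all spellings.
    let normalise: (String) -> String = { caseInsensitive ? $0.lowercased() : $0 }
    let keywordSet = Set(keywords.map(normalise))

    // Treat any run of letters, digits, and underscores as one word.
    let words = source.split(whereSeparator: {
        !($0.isLetter || $0.isNumber || $0 == "_")
    })

    return words.map { word -> (String, TokenKind) in
        let text = String(word)
        let kind: TokenKind = keywordSet.contains(normalise(text)) ? .keyword : .identifier
        return (text, kind)
    }
}

let insensitive = classify("select name FROM users",
                           keywords: ["SELECT", "FROM"],
                           caseInsensitive: true)
let sensitive = classify("select name FROM users",
                         keywords: ["SELECT", "FROM"],
                         caseInsensitive: false)
```

With case-insensitive matching, both `select` and `FROM` are classified as keywords; with case-sensitive matching, only `FROM` is.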

barnettben commented 1 month ago

Thank you for the positive and detailed response!

I will aim to have a go at this over the weekend or early next week.

ericzakariasson commented 3 weeks ago

hey @barnettben i'm interested in SQL support too, so glad i found this. anything i can help you with?

barnettben commented 3 weeks ago

Hi @ericzakariasson, thanks for the offer. At the moment my efforts consist of just a list of keywords, so it's not like there's a lot there!

I'm going to focus specifically on the SQLite dialect, rather than any wider standard, so it might be that our interests don't overlap. When I get a bit of time to look closer, I will post here so that you can see any progress.

mchakravarty commented 3 weeks ago

@barnettben and @ericzakariasson How about opening a new issue for SQL(ite) syntax support?

After all, the goal of this issue (namely support for case-insensitive reserved identifiers) has been met with @barnettben's PR.

BTW, if @ericzakariasson is interested in full SQL (are you?), would it be possible to have a common set of definitions for the overlap between the two, and then two separate language configurations for SQLite and SQL that build on that common core?
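
One way the common-core idea could look, as a sketch only: a shared keyword array plus one small array per dialect. All names here are hypothetical, and the keyword lists are tiny illustrative subsets, not complete or authoritative lists for either dialect.

```swift
// Illustrative subsets only; real keyword lists are much longer.
let commonSQLReservedIds = ["SELECT", "FROM", "WHERE", "JOIN"]
let sqliteOnlyReservedIds = ["AUTOINCREMENT", "GLOB", "PRAGMA"]
let postgresOnlyReservedIds = ["ILIKE", "LATERAL", "VARIADIC"]

// Each dialect builds on the shared core and adds its own keywords.
let sqliteAllReservedIds = commonSQLReservedIds + sqliteOnlyReservedIds
let postgresAllReservedIds = commonSQLReservedIds + postgresOnlyReservedIds
```

Each combined array could then be passed as the `reservedIdentifiers` argument of the corresponding language configuration.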

ericzakariasson commented 2 weeks ago

I've never worked with Swift until last Friday 😅 I don't feel comfortable enough to start working in this code

Here's the Postgres code I've written so far.

```swift
import LanguageSupport
import Foundation
import RegexBuilder

private let postgresReservedIds = [
    "ALL", "ANALYSE", "ANALYZE", "AND", "ANY", "ARRAY", "AS", "ASC", "ASYMMETRIC",
    "AUTHORIZATION", "BINARY", "BOTH", "CASE", "CAST", "CHECK", "COLLATE", "COLLATION",
    "COLUMN", "CONCURRENTLY", "CONSTRAINT", "CREATE", "CROSS", "CURRENT_CATALOG",
    "CURRENT_DATE", "CURRENT_ROLE", "CURRENT_SCHEMA", "CURRENT_TIME", "CURRENT_TIMESTAMP",
    "CURRENT_USER", "DEFAULT", "DEFERRABLE", "DESC", "DISTINCT", "DO", "ELSE", "END",
    "EXCEPT", "FALSE", "FETCH", "FOR", "FOREIGN", "FREEZE", "FROM", "FULL", "GRANT",
    "GROUP", "HAVING", "ILIKE", "IN", "INITIALLY", "INNER", "INTERSECT", "INTO", "IS",
    "ISNULL", "JOIN", "LATERAL", "LEADING", "LEFT", "LIKE", "LIMIT", "LOCALTIME",
    "LOCALTIMESTAMP", "NATURAL", "NOT", "NOTNULL", "NULL", "OFFSET", "ON", "ONLY",
    "OR", "ORDER", "OUTER", "OVERLAPS", "PLACING", "PRIMARY", "REFERENCES", "RETURNING",
    "RIGHT", "SELECT", "SESSION_USER", "SIMILAR", "SOME", "SYMMETRIC", "TABLE", "THEN",
    "TO", "TRAILING", "TRUE", "UNION", "UNIQUE", "USER", "USING", "VARIADIC", "VERBOSE",
    "WHEN", "WHERE", "WINDOW", "WITH"
]

private let postgresReservedOperators = [
    "+", "-", "*", "/", "%", "=", "<>", "!=", "<", ">", "<=", ">=", "||",
    "<<", ">>", "&<", "&>", "<<|", "|>>", "&<|", "|&>",
    "->", "->>", "#>", "#>>", "@>", "<@", "?", "?|", "?&",
    "&&", "-|-", "~~", "~~*", "!~~", "!~~*", "@@@", "::", "."
]

extension LanguageConfiguration {

    /// Language configuration for PostgreSQL
    public static func postgres(_ languageService: LanguageService? = nil) -> LanguageConfiguration {
        // Numeric literals: optional sign, integer or decimal part, optional exponent
        let numberRegex = /[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?/

        // Plain identifiers and double-quoted identifiers
        let identifierRegex = /[a-zA-Z_][a-zA-Z0-9_$]*|"[^"]+"/

        // Operators: any run of operator characters
        let operatorRegex = /[+\-*\/<>=!|&%^~?#@:.]+/

        // Standard quotes and dollar quoting. The dollar-quoted form is an approximation:
        // the closing tag is not checked against the opening tag, and `.` does not match
        // newlines, so a dollar-quoted string has to sit on a single line.
        let stringRegex = /'(?:[^']|'')*'|"(?:[^"]|"")*"|\$(?:[A-Za-z_][A-Za-z0-9_]*)?\$.*?\$(?:[A-Za-z_][A-Za-z0-9_]*)?\$/

        return LanguageConfiguration(
            name: "PostgreSQL",
            supportsSquareBrackets: true,
            supportsCurlyBrackets: false,
            stringRegex: stringRegex,
            characterRegex: nil,
            numberRegex: numberRegex,
            singleLineComment: "--",
            nestedComment: (open: "/*", close: "*/"),
            identifierRegex: identifierRegex,
            operatorRegex: operatorRegex,
            reservedIdentifiers: postgresReservedIds,
            reservedOperators: postgresReservedOperators,
            languageService: languageService
        )
    }
}
```

barnettben commented 2 weeks ago

As suggested, I've opened #116 to track the language configuration so that this issue can be closed.