Open Thom1729 opened 3 years ago
TLDR: I'm in favor of keeping delimiter scopes simple.
Various downsides:
The proposal seems to be about delimiters of literal data structures: lists, structs, maps, dicts, sets, plain objects, etc.
At a certain level, "data literals" are always syntactic shortcuts/aliases for function calls, sometimes with named arguments.
JS:
new Array(10, 20, 30) ≡ [10, 20, 30]
Python:
dict(one = 10, two = 20) ≡ {'one': 10, 'two': 20}
Swift:
struct A {
    let one: Int
    let two: Int
}
A(one: 10, two: 20)
[[[10]], [[20]]]
[10: 20]
Go:
type A struct {
	One int
	Two int
}
A{One: 10, Two: 20}
[][][]int{{{10}, {20}}}
map[int]int{10: 20}
Compare the Go and Swift structs. Swift supports named arguments. For structs, it auto-defines a constructor (a hidden init() method) with named arguments matching the field names, and thus avoids special syntax. Go struct literals are effectively the same thing: function calls with named arguments, written with braces instead of parentheses.
So before it gets anywhere, the proposal ought to decide how to handle function call delimiters, and/or propose a strong reason why data delimiters would be treated differently.
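The same equivalence can be checked directly in Python. The following is a minimal sketch, not part of the proposal: the class A mirrors the Swift and Go structs above, and the variable names are chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class A:
    # Field names become keyword parameters of the generated __init__.
    one: int
    two: int


# A mapping written as a literal and as a call with named arguments.
mapping_literal = {"one": 10, "two": 20}
mapping_call = dict(one=10, two=20)

# A record built by an ordinary call; no dedicated literal syntax is involved.
record = A(one=10, two=20)

print(mapping_literal == mapping_call)   # True
print(A(**mapping_literal) == record)    # True
```

As with the Swift struct, the dataclass gets its constructor generated from the field names, so the "data literal" is an ordinary call delimited by ordinary call parentheses.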
Lisp exemplifies a language with no meaningful way to differentiate "section" vs "definition" delimiters. All Lisp code is a literal data structure. By default, it's evaluated as code. If quoted, it's evaluated as data.
(print "hello world")
; "hello world"
(print '(print "hello world"))
; (PRINT "hello world")
Parens denoting a "block" also always denote a list. Consider the following:
'(
  (
    (let
      (
        (one 10)
        (two 20)
      )
      (print one)
      (print two)
    )
  )
)
In normal code, let creates a block: a sub-scope with some inner variables. In this example, one of the enclosing forms was quoted, preventing such evaluation. The let form might still get evaluated, possibly after getting modified! It exists in a superposition, neither definitively a block, nor merely a data structure.
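To make the "superposition" concrete, here is a toy model in Python rather than a real Lisp. It assumes forms are nested Python lists, symbols are plain strings, and only quote, let, and + are supported; evaluate and every other name in it are hypothetical.

```python
def evaluate(form, env=None):
    """Evaluate a tiny Lisp-like form represented as nested Python lists."""
    env = {} if env is None else env
    if isinstance(form, str):        # a symbol: look up its value
        return env[form]
    if not isinstance(form, list):   # a number: evaluates to itself
        return form
    head, *rest = form
    if head == "quote":              # quoted: return the structure untouched
        return rest[0]
    if head == "let":                # (let ((name value) ...) body...)
        bindings, *body = rest
        inner = dict(env)
        for name, value in bindings:
            inner[name] = evaluate(value, env)
        result = None
        for expr in body:
            result = evaluate(expr, inner)
        return result
    if head == "+":
        return sum(evaluate(arg, env) for arg in rest)
    raise ValueError(f"unknown form: {head!r}")


let_form = ["let", [["one", 10], ["two", 20]], ["+", "one", "two"]]

as_code = evaluate(let_form)              # the brackets act as a block
as_data = evaluate(["quote", let_form])   # the same brackets are just a list

# The quoted data can be edited and then evaluated after all.
modified = as_data[:-1] + [["+", "one", "one"]]
print(as_code, as_data is let_form, evaluate(modified))
```

The identical nested structure is a block in the first call, inert data in the second, and a block again once the edited copy is evaluated, so its delimiters have no fixed role that a static syntax definition could detect.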
In Go, the following cases are pertinent:
[] as indexing operation vs part of a type ([8]int, []int, map[int]int)
{} as block vs data literal (array, slice, struct, map)
The Sublime implementation already detects different uses of []; any remaining edge cases could probably be handled with branching.
Handling {} seems much trickier. Go allows "bare" {} for some data literals:
type A = map[int]map[int][][]int
A{10: {20: {{30, 40}}}}
Whether the type before {} can be elided depends on the type of the enclosing literal. Currently it's allowed for elements (and map keys) nested inside array, slice, and map literals, but never for the fields of a struct literal. The Go parser might be able to disambiguate this without relying on type information. Supporting this in Sublime might require the syntax to differentiate expression vs. statement context, significantly complicating the implementation. I'd like to avoid that.
Motivating example in JavaScript:
Overview
For brevity's sake, by “brackets” I mean curly braces, square brackets, parentheses, and other paired punctuation characters as appropriate.
Many languages use the same brackets to denote both sections of code (e.g. parenthesized expressions) and literal collections (e.g. mappings or sequences). Depending on the context and/or the text between the delimiters, the delimiters may represent very different syntactic constructs.
The current practice is to scope those brackets as punctuation.section.* in either case. This proposal suggests using punctuation.definition instead for collection literals, while keeping punctuation.section for all other purposes.
Considerations
Scope naming guidelines.
The scope naming guidelines are — somewhat surprisingly — silent on the issue.
They do say that “Sections of code delineated by” brackets should use certain meta scopes, and the brackets should use punctuation.section.<section type>.begin|end, where the section type might be either the bracket type (braces, parens, or brackets) or something semantic (block or group). They also specify punctuation.section.interpolation.begin|end where appropriate.
However, the guidelines do not mention collection literals such as mappings, lists, or tuples. They might be considered “Sections of code delineated by” brackets, but I think that what the guidelines had in mind were code blocks, parenthesized expressions, and the like. The given semantic scopes are block and group, not mapping, sequence, or tuple. Moreover, the scope naming guidelines have always been rather C-centric, and C has no true collection literals. C does have array initializers, most commonly strings.
The scope naming guidelines specify punctuation.definition.string.begin|end for string delimiters. This seems to be the only explicit guideline for punctuation defining literals. By analogy, brackets defining (e.g.) a mapping might be punctuation.definition.mapping.
I interpret the scope naming guidelines to be compatible with either punctuation.section or punctuation.definition for collection literals.
Ergonomics
Some color schemes, including Mariana, color punctuation.definition differently from punctuation.section. In languages with both scopes, this could be a helpful distinction. JavaScript is a perfect example here: curly braces and square brackets are both “overloaded” and can refer to either collection literals or other constructs depending on the syntactic context. A missing semicolon can often cause one to be interpreted as the other, leading to a bug. Highlighting brackets depending on their syntactic purpose would make the mistake legible at a glance. (See e.g. https://github.com/sublimehq/Packages/pull/1551.)
On the other hand, every change is a change, and people don't always like change. It may be that using punctuation.definition would make some code less legible. Examples of this would be welcome.
Established usage
Established usage is clearly on the side of punctuation.section. The specifics vary between syntaxes. For instance, the JavaScript syntax scopes object literal brackets punctuation.section.block (which is listed in the guidelines, but arguably incorrect for this construct), whereas JSON uses punctuation.section.mapping (which is not in the guidelines). Python uses punctuation.section.mapping, punctuation.section.set, or punctuation.section.mapping-or-set. (The last of these should probably be eliminated using branching.)
Even if we stick with punctuation.section, I think we should standardize on a single subscope or set of subscopes.
Implementation
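Python shows why a combined mapping-or-set scope can arise at all: the opening brace alone does not say whether a mapping or a set follows. A minimal sketch using the standard library ast module (brace_kind is a hypothetical helper name) reports what Python's own parser decides for a few inputs:

```python
import ast


def brace_kind(source: str) -> str:
    """Return the AST node type Python's parser assigns to a braced expression."""
    return type(ast.parse(source, mode="eval").body).__name__


samples = [
    "{}",
    "{1: 2}",
    "{1, 2}",
    "{**m}",
    "{*s}",
    "{x for x in s}",
    "{k: v for k, v in p}",
]
for source in samples:
    print(f"{source:24} {brace_kind(source)}")
```

Empty braces and {**m} come back as Dict, while {*s} comes back as Set, so the decision depends on tokens after the opening brace. That lookahead is what a syntax definition would need branching to reproduce.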
In some languages, like JavaScript, using punctuation.definition for collection literals would be easy. (In JavaScript's case, this is because the syntax already has to make the distinction internally or everything would break.) In other languages, it might be more difficult. Lisp and Go have been suggested as languages for which the implementation might be difficult; examples would be welcome.
Alternatives
If punctuation.definition is too large a change, we could instead standardize on a consistent set of punctuation.section.* scopes, such as punctuation.section.sequence. This would also allow color schemes to target collection delimiters.