Should null-aware subscripting use `?[` or `?.[` syntax?

stereotype441 commented 5 years ago

The draft proposal currently extends the grammar of selectors to allow null-aware subscripting using the syntax e1?.[e2], however we've had some e-mail discussions about possibly changing this to e1?[e2], which would be more intuitive but might be more difficult to parse unambiguously.

Which syntax do we want to go with?

munificent commented 5 years ago

var wat = { a ? [b] : c };

Is this a set literal containing the result of a conditional expression, or a map literal containing the result of a null-aware subscript?

I think we'll probably want to do ?.[]. Also, the cascade is ..[], so this is arguably consistent with that.
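
To make the two readings concrete, here is a minimal sketch that writes each one out unambiguously, using parentheses and an explicit null check instead of either proposed operator. The names asSet and asMap are made up for illustration, and the code assumes a null-safe Dart SDK.

```dart
// Reading 1: a set literal whose single element is a conditional expression.
Set<Object> asSet(bool a, Object b, Object c) => {(a ? [b] : c)};

// Reading 2: a map literal whose key is a null-guarded subscript of `a`.
// The explicit null check spells out what a null-aware subscript would do.
Map<Object?, Object> asMap(List<Object>? a, int b, Object c) =>
    {(a == null ? null : a[b]): c};

void main() {
  print(asSet(true, 1, 2)); // A set with one element.
  print(asMap(null, 0, 'c')); // A map with one entry.
}
```

Both functions compile from the same token sequence once the parentheses and the null check are removed, which is why the parser needs some rule to pick one.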

leafpetersen commented 5 years ago

I believe the suggestion that @bwilkerson made was that ?[ is parsed as a single token, and ? [ is parsed as two. So for your example:

var set = { a ? [b] : c };  // Set literal
var map = { a?[b] : c}; // Map literal
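
A rough sketch of what such a tokenization rule could look like, as a toy lexer rather than the real Dart scanner. The function name tokenize and the tiny token set (identifiers, single punctuation characters, and ?[) are assumptions made for this illustration.

```dart
// Toy lexer: emits `?[` as one token only when the two characters are
// adjacent; `? [` (with whitespace in between) stays two tokens.
List<String> tokenize(String source) {
  final identifier = RegExp(r'[A-Za-z_][A-Za-z0-9_]*');
  final tokens = <String>[];
  var i = 0;
  while (i < source.length) {
    final ch = source[i];
    if (ch == ' ') {
      i++; // Whitespace separates tokens but is not a token itself.
    } else if (source.startsWith('?[', i)) {
      tokens.add('?[');
      i += 2;
    } else {
      // Either a whole identifier or a single punctuation character.
      final match = identifier.matchAsPrefix(source, i);
      final text = match?.group(0) ?? ch;
      tokens.add(text);
      i += text.length;
    }
  }
  return tokens;
}

void main() {
  print(tokenize('{ a ? [b] : c }')); // `?` and `[` are separate tokens.
  print(tokenize('{ a?[b] : c}')); // `?[` is a single token.
}
```

With this rule the parser never sees the same token stream for the two spellings, at the cost of making the space between ? and [ significant.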

Note that Swift and C# both use ?[]. Swift seems to be able to correctly disambiguate between conditional expressions and null aware subscripts, but doesn't seem to use tokenization to do so.

    var x : Array<String>?;
    var t1 : String? = x?[0]; // Treated as a subscript
    var t2 : String? = x? [0]; // Treated as a subscript
    var t3 : Array<Int>? = x == nil ? [0] : [3];  //Treated as a conditional expression
    var t4 : Array<Int>? = x == nil ?[0] : [3];  //Treated as a conditional expression

lrhn commented 5 years ago

Swift does seem to use tokenization to distinguish; what matters is whether there is a space between the x and the ?, not between the ? and the [.

Whether to trigger on x?, ?[, or even x?[, without whitespace should probably be determined by where we want to break lines.

var v1 = longName?
   [longExpression];
var v2 = longName
   ?[longExpression];
var v3 = longName?[
  longExpression];

I can't see any one to prefer. So, what do we do for ?.?

var v4 = longName
    ?.longName();

That does suggest that we want ?[ to be the trigger, for consistency.

C# does not have the issue because [...] is not a valid expression.

jodinathan commented 5 years ago

I think keeping the question mark always close to the variable, as in x?, makes the subscript easier to read.

munificent commented 5 years ago

Leaf and I spent some time talking about this at the whiteboard. My take on it going in is that both options have some things going for them:

foo?[bar]:

Follows C# and Swift [EDITED: Kotlin doesn't have this operator]
Terse
Mirrors !: foo![bar]

foo?.[bar]:

Mirrors cascade: foo..[bar]
Mirrors other null-aware method syntax: foo?.bar()
Avoids the nasty ambiguity in: { foo?[bar]:baz }

We spent a while trying to come up with ways to avoid the ambiguity with ?[. A couple of them are probably workable, but none feel particularly great to me. In particular, relying on whitespace can really harm the user experience. In theory, it's not a problem in formatted code. But many users write un-formatted Dart code as an input to the formatter. And that input format would thus become more whitespace sensitive and brittle in this corner of the language. So far, those kind of corners are very rare in Dart, which is a nice feature. (The one other corner I recall offhand is that - - a and --a are both valid but mean different things.)
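
The existing whitespace-sensitive corner mentioned above can be seen in a small example. The function names here are hypothetical and exist only to show the two spellings side by side.

```dart
// `- -a` is two unary minus operators: it negates twice, so the value of
// `a` comes back unchanged.
int negateTwice(int a) => - -a;

// `--a` is a single pre-decrement token: it subtracts one from `a`.
int decrement(int a) => --a;

void main() {
  print(negateTwice(5));
  print(decrement(5));
}
```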

We talked about eventually adding null-aware forms for other operators: foo?.+(bar), etc. If we do that, we'll probably want to require the dot, in which case requiring it for subscript is consistent with that future.

Another addition we have discussed for NNBD is a null-aware call syntax. If we don't require a dot there, it has the exact same ambiguity problem:

var wat = { foo?(bar):baz }; // Map or set?

So whatever fix we come up with for the ?[ ambiguity, we'll also have to apply to ?(.

Finally, Leaf wrote up an example of chaining the subscript:

foo()?[4]?[5]

To both of us, that actually doesn't look that good. It scans less like a method chain and more like some combination of infix operators. A little like ??. Compare to:

foo()?.[4]?.[5]

Here, it's more clearly a method chain. Communicating that visually is important too, because users need to quickly understand how much of an expression will get null-short-circuited.

Putting all of that together, it seems like the ?.[ form:

Avoids ambiguity problems. (The lexer already treats ?. as a single "null-aware" token.)
Extends naturally to a null-aware call.
Extends to other null-aware operators.
Leaves Dart a more robust input language to the formatter.
Actually looks pretty OK in a method chain.

So we're both leaning towards ?.[. If users ask why we do a different syntax from C# and Swift, I think it's easy for us to show the ambiguous case and explain that it's to avoid that.

leafpetersen commented 5 years ago

@lrhn I'm going to close this in favor of ?.[ since I think I was the only one still on the fence and I think I've moved into the ?.[ camp now. If you've come around to ?[, feel free to re-open for discussion.

lrhn commented 5 years ago

LGTM. I did not find the whitespace-based disambiguation techniques convincing; they didn't fit well with the current Dart syntax, and I couldn't see any other reasonable way to disambiguate.

DaveShuckerow commented 5 years ago

Question: was any consideration given to the syntax map[?index] ?

This is simpler to remember (IMO) than map?.[index] and appears to avoid the ambiguity problem of wat = {map?[index]:value}.

lrhn commented 5 years ago

The map[?index] notation looks misleading. Reading it, I'd assume that it is checking whether the index is null, not the map.

(On the other hand, that could be a useful functionality by itself: If a function parameter or index operand starts with ?, then if it is null, all further evaluation of that call is skipped and it evaluates to the short-circuiting null value. Since a call or index operation is inside a selector chain, it could have the same reach as ?., and we wouldn't need something new. Probably doesn't work for operators, though.)

eernstg commented 5 years ago

@lrhn wrote:

If a function parameter or index operand starts with ?, then if it is null, all further evaluation of that call is skipped and it evaluates to the short-circuiting null value.

When that idea was discussed previously, the main concern was that it would be hard to read:

var x = ui.window.render((ui.SceneBuilder()
        ..pushClipRect(physicalBounds)
        ..addPicture(ui.Offset.zero, ?picture)
        ..pop())
    .build());

How much of the above should be shorted away when the picture is null? An option which was discussed was to put the test at the front:

var x = let ?thePicture = picture in e;

This would cancel the evaluation of e entirely when thePicture is null. With that, there wouldn't have to be a conflict with map[?index] as a null-shorting invocation of operator [].
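
The effect of the hypothetical let ?thePicture = picture in e form can be approximated with an ordinary helper function. This is only a sketch of the semantics described above: letNonNull is a made-up name, not an existing Dart API, and the code assumes a null-safe Dart SDK.

```dart
// Runs `body` only when `value` is non-null; otherwise the whole
// expression evaluates to null and `body` is never called.
R? letNonNull<T extends Object, R>(T? value, R Function(T) body) =>
    value == null ? null : body(value);

void main() {
  String? picture;
  // `body` is skipped entirely because `picture` is null.
  print(letNonNull<String, int>(picture, (p) => p.length));
  picture = 'scene';
  print(letNonNull<String, int>(picture, (p) => p.length));
}
```

Putting the test at the front like this makes it explicit that everything inside the body is what gets skipped.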

But I agree that a null-shorting semantics for map[?index] would be confusing, and I'd still prefer ?.[.

Marco87Developer commented 5 years ago

Personally, I prefer ?.[.

bean5 commented 5 years ago

Definitely ?.. Be willing to be different than other languages. Be built well from the ground up. If you want to be like the other languages there's no point in having another.

morisk commented 5 years ago

In all human languages (that I am aware of), the question mark already includes a dot, so this is just repetition. Another redundant keystroke was removed.
Typing longer chaining would be an annoyance.
Things can get weird when chaining with ?. and ..
a[index] converted to a?.[index] looks as wrong as a.[index]
Function nullability looks completely wrong with myFunc?.()
Making Swift and C# developers feel at home could be a good goal. (I don't write C# just because of its weird Pascal case notation.) Being different is not necessarily a good thing here.

I am in favor of a?[index] and myFunc?()

spkersten commented 5 years ago

I find the arguments for ?.[] not very convincing.

Mirrors cascade: foo..[bar]

It behaves differently from cascading (types of the expression are type of foo vs type of foo[bar]) so I'd say it should not mirror it.

Mirrors other null-aware method syntax: foo?.bar()

I'd say it doesn't mirror this. Call syntax is foo.bar(). Making it null-aware adds just the question mark after foo, so mirroring this would be foo?[bar].

those kind of corners [operators where white space matters] are very rare in Dart

I found white space matters in Dart for: --, ++, &&, ||, !=, ==. Which are some very common operators, hardly a "corner".

foo()?.[4]?.[5] Here, it's more clearly a method chain.

But it is not a method chain, why should it look like one? bar[1][4] doesn't look like a method chain either.

In my opinion, the syntax should be consistent (adding a single ? to make something null-aware, instead of sometimes a ? and sometimes a ?.). Whether something "looks good" is personal and will probably change once you get used to the syntax.

lrhn commented 5 years ago

(Edit: Kathy said all this better already: https://medium.com/dartlang/dart-nullability-syntax-decision-a-b-or-a-b-d827259e34a3)

The current plan is to go with ?.[e] as null-aware index-operator invocation (and ?.[e]=... for setting, and potentially ?.(args)/?.<types>(args) as null-aware function invocation).

A null aware cascade will be e?..selectors which means that we have e?..[e2] in the language already.

This syntax parses without any ambiguity, whether we require ?. to be one token or two. (We have not decided on that, it might be useful to make it one, but it may also disallow some formattings that others might want to do, like have x? on one line and .foo() on the next).

The alternative proposed here is to use e1?[e2] as null-aware index lookup. I agree that it could be easier on the eye; the arguments against it are mainly concerns about complicating parsing and writing.

This does not parse unambiguously if ? and [ are treated as two tokens because {e1?[e2]:e3} parses as both a set literal and a map literal. So, if we try this, we will need some disambiguation, and it seems very likely that we'll have to treat ?[ as a single token, and ? [ as two tokens. (The other option is to check for space between e and ? in e?[...] vs e ?[...], which is unprecedented in Dart).

If we treat ?[ as a single token, then e ?[ e2 ] is a null-aware index lookup.

Currently you can write text?[1]:[2] and have the formatter convert it to text ? [1] : [2]. With a ?[ token, the formatter couldn't do that. We have other composite operators where inserting a space changes the meaning, but currently the only one where breaking the operator into two is still valid syntax is prefix --, and there is no use for - -x, so that doesn't matter in practice. All other multi-char operators would be invalid code with a space inside them, but both ?[ and ? [ could see serious use, so we raise the risk of accidentally writing something else than what you meant by omitting a space.

The ?[ operator would not work as well with cascades where e?..foo()..[bar]=2 is a null-aware cascade on e. It only checks e once. That makes it e?[foo] for direct access, but e?..[foo] for cascade access, not e..?[foo] as you might expect.
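
A small example of the null-aware cascade form being described, where the receiver is checked once at the start. It assumes a null-safe Dart SDK in which ?.. is available; fillFirst is a hypothetical name used only for this sketch.

```dart
// The receiver is null-checked once, by the leading `?..`; the later
// cascade sections, including the index assignment, use plain `..`.
List<int>? fillFirst(List<int>? xs) => xs?..add(1)..[0] = 2;

void main() {
  print(fillFirst(null)); // Nothing is evaluated on a null receiver.
  print(fillFirst([5])); // The list is extended, then element 0 is replaced.
}
```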

If we use ?[ for indexing, we should also use ?( for null-aware function invocation. That has all the same risks of ambiguity.

So, the argument against ?[ is not that it doesn't look better (whether it does or not), but that the consequences and risk for the language are larger than for ?.[, and the benefits are not deemed large enough to offset that.

spkersten commented 5 years ago

@lrhn For clarity: Maybe too subtle of a difference, but my argument isn't about looking better, but about consistency: If a user knows that foo.bar() can be made null-aware by adding a question mark after the possible-null-expression to make it foo?.bar(), their first try for foo() and foo[4] will be foo?() and foo?[4]. Of course you could say that the rule is "insert a question mark but make sure there is at least one period after it" and maybe that is only slightly less intuitive and good enough, but the article you refer to asked for feedback, so I'm giving it :)

shortercode commented 5 years ago

Just a few semi on topic thoughts...

Comparing to other languages I can think of a couple of sticking points using "?." for optional chaining ( although neither of these cases actually clash with the proposed null aware subscripting operator ):

Implicit member expressions in Swift

let color: UIColor = condition ? .red : .blue

Numbers in JS, without the integer component

let value = condition ? .1 : .2

Referring to the "?[" option, I feel like it would be possible to parse and differentiate from a ternary conditional. Do a speculative expression parse after a detected "?" token and then check if the token following the expression is a ":". It requires that you can rewind the token stream to a point prior to the speculative expression parse, and that if it failed there would be no side effects. I don't know enough about the structure of the Dart scanner/parser to say how feasible that is but it seems like a lot of potential work.
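
A heavily simplified sketch of that speculative check, operating on a plain list of token strings instead of the real Dart scanner output. Both function names are hypothetical, and the "expression" after ? is assumed to be a single balanced [...] group, which a real parser could not assume.

```dart
// Returns the index just past the balanced `[...]` group starting at
// `start`, or -1 if the tokens there do not form such a group.
int skipBracketGroup(List<String> tokens, int start) {
  if (start >= tokens.length || tokens[start] != '[') return -1;
  var depth = 0;
  for (var i = start; i < tokens.length; i++) {
    if (tokens[i] == '[') depth++;
    if (tokens[i] == ']') depth--;
    if (depth == 0) return i + 1;
  }
  return -1; // Unbalanced brackets.
}

// Speculatively treats the tokens after `?` as the "then" branch of a
// conditional; the guess holds only if a `:` follows. The token list is
// not consumed, so the caller can rewind to `questionIndex` and try a
// subscript parse instead.
bool looksLikeConditional(List<String> tokens, int questionIndex) {
  final end = skipBracketGroup(tokens, questionIndex + 1);
  return end != -1 && end < tokens.length && tokens[end] == ':';
}

void main() {
  print(looksLikeConditional(['a', '?', '[', 'b', ']', ':', 'c'], 1));
  print(looksLikeConditional(['a', '?', '[', 'b', ']'], 1));
}
```

Note that inside braces a : follows under both readings of { a?[b]:c }, so a check of this kind would not by itself settle the set-versus-map case.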

In terms of plain personal preference I think I'd prefer ?[. As @spkersten says it's more intuitive. Realistically I think people will live with either, if ?.[ is less ambiguous to parse then go for it.

gazialankus commented 5 years ago

Sorry, my initial reaction is to support ?[. You have spent so much effort on this and are in a much better position to decide of course, but I will share my point of view. Hopefully it could be useful.

I think the decision to go with ?.[ feels too much like a system-centric approach rather than a user(programmer)-centric approach. Why the language is more complete and proper etc. etc. would repeatedly have to be explained to the average programmer who goes "what the heck is that extra dot for, am I not just supposed to add a question mark to protect against dereferencing a null?"

I think the main selling point of ?.[ seems to be the "is this a set or map?" example. I'm sure programmers would be happy memorizing one way or the other, just like they memorized the operator precedence order. They could go "oh you can't just put a question mark like that because it sticks to the nearby nullable type, use parentheses there if you want to make it a ternary operator". And if they are coding somewhat responsibly and are not using dynamic everywhere, the IDE would warn them that it's a Map and not a Set. To have this ambiguous example be a critical bug you have to be coding irresponsibly anyway. Therefore, removing this ambiguity feels more like a theoretical exercise rather than a practical solution.

The second strongest argument, the congruence with cascade syntax is not that convincing to me either, because it's easier to remember "you always add a question mark after the nullable" rather than "you also have to add a dot after the question mark, because it has to look similar to cascade (which is not what we are using here, but it needs to look similar)". The dot feels like it came out of nowhere.

The third, chaining: if I am chaining an operator like this, I am probably already treading lightly that I might be making a mistake somewhere. If my life depends on it, I am probably using a number of final intermediary variables anyway. If not, since chaining already made me careful and nervous, I can probably correctly use the dotless operator with a little more of an effort. If I want to make it look nice, I can add whitespace.

Either case, thank you for introducing non-nullable types! It's a huge step forwards and I won't really mind the final decision here 😄.

bean5 commented 5 years ago

Hmm. I have a feeling that Dart is worried about this because it is used heavily by Flutter which is heavily used with Firebase. NoSQL engines and non-existent fields are so common it isn't even funny. If NoSQL is going to persuade your choice, please make it apparent that it is a key use case. Not saying it is, but if it is, I'd like to know.

I am having second thoughts against ?. now that I look at ?. ..

Here's an idea: build a survey with code snippets paired with potential results. Ask the user what they think they do. Let the results guide you. If it simply gets too confusing to use in any scenario (ex: ?. ..), then let's consider something else.

bean5 commented 5 years ago

Maybe ?/. or /^$/ (not quite correct regex, but understandable by users of regex). Admittedly, that is too cumbersome and long to type. Maybe ?$ or ?^ in memory of it. You could take it a step further and have one assert non-null!

bean5 commented 5 years ago

How about a superset symbol? It implies that the left is a superset of the right. Empty anything isn't really a superset of anything, so it would work, right? :

myVar⊇[index]. Since it isn't on most keyboards, you'd want a 2-character equivalent: myVar=> (a bit confounding with >=). Maybe one of these would work: ?> or ?>=. It's starting to look like garble, and like JavaScript (ex: ===). While I'm here, I might as well try to exhaust the search result space: !?0, ?|, ?%.

munificent commented 5 years ago

I'd say it doesn't mirror this. Call syntax is foo.bar(). Making it null-aware adds just the question mark after foo, so mirroring this would be foo?[bar].

Fair point. What I had in mind is that it mirrors treating [] as another kind of null-aware method call. Null-aware method calls start with ?., so doing ?.[ would match that. We don't currently support calling operators that desugar to method calls using method call syntax like Scala does. In Scala, you can write a + b or a.+(b) and they mean the same thing. We've discussed supporting that in Dart. (Idiomatic code would use the normal infix syntax, but this notation can be handy for things like tear-offs, or embedding an operator in the middle of a method chain.)

The idea here is that if we were to do that, then using a?.[b] for the null-aware subscript call would match a.[b] for the unsugared notation for calling the subscript.

foo()?.[4]?.[5] Here, it's more clearly a method chain.

But it is not a method chain, why should it look like one? bar[1][4] doesn't look like a method chain either.

It is a method chain. The [] operator in Dart is just another kind of method call syntax. This is important because null-aware operators will short-circuit the rest of a method chain, so it's important for a reader to easily be able to tell what the rest of the method chain is so they understand how much code can be short-circuited.
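
The short-circuiting behavior can be observed with the existing ?. operator. This sketch assumes a null-safe Dart SDK; the Node class and the twoSteps function are invented for the illustration.

```dart
class Node {
  Node(this.name);
  final String name;
  static int calls = 0; // Counts how often next() actually runs.

  Node next() {
    calls++;
    return Node('$name>');
  }
}

// If `start` is null, everything after `?.` in the chain is skipped,
// not just the first call.
String? twoSteps(Node? start) => start?.next().next().name;

void main() {
  print(twoSteps(null)); // The chain is skipped as a whole.
  print(Node.calls); // No call to next() has run yet.
  print(twoSteps(Node('a'))); // Both calls run on a non-null receiver.
  print(Node.calls);
}
```

A reader has to see where the chain ends to know how much code a null receiver skips, which is the readability concern raised for subscripts in a chain.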

This syntax parses without any ambiguity, whether we require ?. to be one token or two. (We have not decided on that,

Are you sure about that? If I run:

main() {
  String foo = null;
  print(foo ? . length);
}

I get compile errors.

it might be useful to make it one, but it may also disallow some formattings that others might want to do, like have x? on one line and .foo() on the next).

The formatter already handles splitting on null-aware method chains and it keeps ?. together. (It basically has to since the ?. is a single token in the analyzer AST. I'd have to do a lot of work to allow splitting it.)

If a user knows that foo.bar() can be made null-aware by adding a question mark after the possible-null-expression to make it foo?.bar(), their first try for foo() and foo[4] will be foo?() and foo?[4].

Yeah, unfortunately using foo?.[bar] means we don't have a rule that simple. We're sort of stuck with the history of already having a ternary operator and the ambiguity that that causes. We have to route around that by having a less regular syntax for null-aware subscript operators.

morisk commented 5 years ago

If we must use additional disambiguating characters, maybe we can consider ?? instead of ?.. a = foo??[index] b = myFunc??()

lrhn commented 5 years ago

@morisk The ?? operator already exists in Dart and foo??[index] is already a valid expression (with a list literal as second operand). Using it for null-aware indexing would be a breaking change.
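
A short example of the existing meaning, assuming a null-safe Dart SDK; orSingleton is a hypothetical name used only here.

```dart
// `foo??[index]` already parses as `foo ?? [index]`: the if-null operator
// with a one-element list literal as its fallback operand.
List<int> orSingleton(List<int>? foo, int index) => foo??[index];

void main() {
  print(orSingleton(null, 3)); // Falls back to the list literal.
  print(orSingleton([1, 2], 3)); // Keeps the non-null list.
}
```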

"All the good syntaxes are taken!"

Cat-sushi commented 5 years ago

I've come from the article in Medium, but I'm not convinced by the reasoning in the article, because,

.+ is currently illegal, and the introduction of .+ or ?.+ looks unnatural, so the abrupt introduction of ?.+ doesn't justify the introduction of ?.[ or ?.(, well.
In contrast, ?. and ?.. don't introduce additional dot, so those can't be good reasons to introduce ?.[ or ?.( with additional dot, either.
a[1][2][3] is already a method chain intentionally omitting dots, so a?[1]?[2]?[3] must be acceptable regardless of method chains.

As for how the formatter treats a?[e1]:e2, this doesn't matter when already-formatted non-NNBD code such as a ? [e1] : e2 is migrated, provided that ?[ is space sensitive. In addition, it seems acceptable for new NNBD code to be formatted as a?[e1]: e2, because it is readable enough to be checked by a human, and syntactic and type checking reveal most human errors, whether in braces or not.

Consequently, I prefer ?[, ?( and ?+ much more than ?.[, ?.( and ?.+. To avoid the ambiguity, I think ?[ and ?( should be inseparable tokens, but I'm not sure the correct solution, anyway.

Cat-sushi commented 5 years ago

Self-commenting,

I think ?[ and ?( should be inseparable tokens, but I'm not sure the correct solution

Tokens such as --, ++, &&, ||, !=, == ... are always space sensitive, so treating ?[ as a space-sensitive token would be unexceptional and acceptable. If ?[ is a token, then a?[e1]:e2 is not ambiguous at all. It is just a little confusing for people who have a particular mental model, but the confusion might be mitigated by the formatter.

Having said that, it could be a problem for the formatter when it has to handle both NNBD code and non-NNBD code. Even so, I hope this temporary problem doesn't restrict the future of the language. I mean the formatter should ignore unformatted conditional expressions, especially as members of set literals, which might be broken silently with an NNBD version of Dart.

To assess the impact of ignoring them, I'd like to see statistics on how much code has unformatted conditional expressions as members of set literals. I hope the number is not so big, because set literals are a quite new feature and the formatter is widely used.

n8crwlr commented 5 years ago

I would like to go with e1?.[e2]

It's handy enough, looks more concise to me, and is architecturally more correct than the alternative. It is more technically correct, as there are several good arguments for it. The argument for the alternative is, on the whole, that the arguments for the former are not overriding arguments.

Going for NNBD should give a clear and concise syntax, which is e1?.[e2] in my eyes.

Cat-sushi commented 5 years ago

@n8crwlr What do "concise" and "clear" mean? As a fact, ?[ is shorter than ?.[, and ?[ resembles [ better than ?.[ does.

jodinathan commented 5 years ago

When I glance at my?.['foo']?.['bar'] I automatically think that there is some default function being called after the dot or something like that. I would try hard to keep the ?[ syntax.

n8crwlr commented 5 years ago

@Cat-sushi These are not points for me. As written, ?.[ is handy enough and very concise, (especially in NNBD code) so just saving a . is a belief and not a necessity. We are talking about nullable types in a language that is non-nullable by default.

If I look quickly at my code and see my?.['foo']?.['bar'], yes, I would like to think that's a call on a nullable type. I really do not feel better with my?['not']?['functions'] - that's start of nested ternary? The alternative is OK, but it is not 'yeah, good decision guys!' At least, saying these are functions is very far-fetched for me.

Cat-sushi commented 5 years ago

I've understood the point of @n8crwlr is

I really do not feel better with my?['not']?['functions'] - that's start of nested ternary?

As I said, it is a little confusing for people who have a particular mental model, but the formatter might mitigate the problem, because the formatter always puts spaces around the ? and : of a ternary expression. In addition, the first operand of the ternary operator should be a boolean expression; on the other hand, in most cases, the first operand of a subscript is an identifier whose name stands for a List and is unlikely to be mistaken for a boolean expression.

More importantly, I guess that most people don't have a mental model in which they have to put an additional dot just after the null-aware mark ?, because there already are null-aware operators such as ?. (method invocation) and ...? (spreading), which don't introduce additional dots.

n8crwlr commented 5 years ago

@Cat-sushi I am not talking about the tools. I am talking about me. I like ?.[. It's far easier to read and easier to spot with the eye even in large sources.

I am not sure where your problem is if you do not like it. Dart will change to NNBD, which means that only those types in your code which must be nullable in your implementation are affected.

Cat-sushi commented 5 years ago

I knew ?.[ is easier to read for you, but I guess ?[ is easier to write and read for most persons including me.

I couldn't understand what you want to say with the second paragraph of your last comment. I knew NNBD well.

morisk commented 5 years ago

I might miss it, how does simple nullable work? ex final a = foo?baz if yes, then how foo?baz?.[42]?bar is not looking more confusing or meaningless or language not consistent?

Cat-sushi commented 5 years ago

@morisk Sorry, I couldn't understand your question. What do final a = foo?baz and foo?baz?.[42]?bar mean? Could you write non-NNBD versions of those?

morisk commented 5 years ago

@Cat-sushi

@n8crwlr wrote:

?.[ is handy enough and very concise, (especially in NNBD code)

There is a description in Issue-155 of writing nullable chaining using ?., therefore using ?.[ is indeed more concise with NNBD.

Personally I'd prefer ? chaining like in other languages I use.

Cat-sushi commented 5 years ago

@morisk I think we have reached the consensus that c?.m1()?.m2() is the future of the method chain, where m1 and m2 are regular methods. I also agreed with ?. which doesn't introduce additional dot in comparison with c.m1().m2().

We are now talking about special case of method chain with [] such as a[e1][e2], and the point at issue is which is better a?[e1]?[e2] or a?.[e1]?.[e2].

I prefer a?[e1]?[e2] as you do.

morisk commented 5 years ago

edited

morisk commented 5 years ago

@Cat-sushi

I prefer a?[e1]?[e2] as you do.

Yep.

n8crwlr commented 5 years ago

?.[ is handy enough and very concise, (especially in NNBD code)

I am not a native English speaker. By concise I mean: clear to read / strong visibility.

A dot is one of the most used operators, so it is no problem to type it for some nullable types. As said, it is handy enough for me.

rrousselGit commented 5 years ago

?.[ is rather ugly IMO, but likely unavoidable.

But what about an alternate solution instead:

Currently, we almost always want foo?.bar?.baz instead of foo?.bar.baz. So what about having syntax sugar for the former, which would implicitly solve this issue?

Instead of foo?.bar?.baz, we could write ?.foo.bar.baz. Which means that instead of foo?.[0].baz we'd have ?.foo[0].baz

Cat-sushi commented 5 years ago

@rrousselGit issue-155 discusses that with respect to o?.m1()?.m2(), the second ? is omittable providing that o?.m1() is null if and only if o is null, in other words, return type of m1() is not nullable. Your proposal is independent from issue-155 or this issue.

maksimr commented 5 years ago

I prefer ?[ over ?.[ because:

Consistency. One common syntax for one meaning. foo?.bar and foo?[1]
Familiarity. Programmers familiar with other languages can learn Dart faster

All these talks about "ambiguity" make sense for internal stuff. But I, as a user of language, want to know less about internal decisions which affect public API and concentrate on solving my domain/app problems.

Currently all arguments toward ?.[ look more like excuses. I personally don't think that the case when you write something like { a?[b] : [c]} is so popular that it should be solved on the language level. I would be more tolerant of this syntax if I saw performance problems rather than a problem with set or map.

However, I highly appreciate your openness to the community, your work, and your other technical decisions, so I could totally live with the ?.[ solution :)

Thanks!

munificent commented 5 years ago

All these talks about "ambiguity" make sense for internal stuff. But I, as a user of language, want to know less about internal decisions which affect public API and concentrate on solving my domain/app problems.

This is a fair goal, but the point is that the ambiguity isn't something that is "internal" to the implementation of the Dart language. The grammar is the "public API" of the language, so if it's ambiguous, it's broken.

I can try to explain in terms of an analogy. Imagine you maintain some library package "foo" that has lots of users. You want to add a new public function to it called doStuff(). Unfortunately, "foo" already has a different public function with that same name. You can't simply say "well, the existence of that old function is an internal implementation detail of foo". It's part of the public API. And, very concretely, you can't simply have two functions with the same name. The tools can't handle it.

Grammar ambiguities are the same way. We simply can't say that one single piece of syntax means two different things at the same time. The parser has to do something and whatever it picks changes how the user's program behaves. This is a user-facing choice.

Cat-sushi commented 5 years ago

@munificent Generally speaking, I totally agree with you, but we would like to have a practical discussion. I would like to have statistics which shows how much programs have such code as {a?[e1]:[e2]}. Also, I would like to know options to avoid the ambiguity other than ?.[.

munificent commented 5 years ago

I would like to have statistics which shows how much programs have such code as {a?[e1]:[e2]}.

We don't have numbers, but my hunch is the numbers are very low. But for a grammar ambiguity, that doesn't really matter. We simply can't have an ambiguity. The parser must choose one way or the other, so any ambiguity must be resolved even if never encountered in real world code.

I would like to know options to avoid the ambiguity other than ?.[.

The very top of the issue discusses this some. The other option we looked at was making the syntax whitespace sensitive, such that ?[ and ? [ are treated differently by the tokenizer. That resolves the ambiguity by saying that if you intend {a?[e1]:2} to be a set, you must put a space after the ? and if you want it to be a map, you cannot. But this also means that whitespace would be meaningful everywhere ?[ is used, even in unambiguous cases.

That in turn makes the language more error-prone for users and code generators that are producing Dart without wanting to be careful about whitespace. A very common programming style in Dart is to just pound out a bunch of code without regard to whitespace, and then run dartfmt and let it sort it out. That style becomes riskier every time we make the language more dependent on whitespace for parsing a program.

Likewise, many Dart code generators produce code without worrying about whitespace and let dartfmt fix it up. Those code generators will need to be written more carefully if a space between ? and [ is meaningful.

The alternative syntax, ?.[ doesn't have these problems. The only thing it really has against it is that some users don't like the way it looks. That matters, of course, but given that some users do like the way it looks, that isn't an obvious deciding factor.

maksimr commented 5 years ago

@munificent

The only thing it really has against it is that some users don't like the way it looks.

I don't think the problem is how it looks, but which mental model it creates. When I see foo?.bar and foo?[bar], I treat ? as a separate operator that works in conjunction with the access operators. Here I only have to remember one operator, "?", and the rule that I can use it with access operators. But when I see foo?.bar and foo?.[bar], I have to remember two operators that logically behave identically. Yes, I still have to remember two things, but now these two things are not general and not composable as in the previous example, because in the previous example I had only one operator and a rule that allows using it with any access operator.

So instead of remembering:

? - null-aware
. - object property access
[ - array item access

I have to remember:

. - object property access
?. - null-aware object property access
[ - array item access
?.[ - null-aware array item access

munificent commented 5 years ago

Yes, I still have to remember two things, but now these two things are not general and not composable as in the previous example

Yeah, I agree completely. It's not as simple, regular, and composable as it could be.

At the same time, if we wanted complete regularity, we'd probably eliminate [ ] entirely. It's just another syntax for calling a specially-named method. For example, Scala and Smalltalk don't have any special syntax for subscripting.
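
As a small illustration of the point that [ ] is just another syntax for calling a specially named method, a Dart class can define operator [] itself. This is only a sketch; the Grid class and its contents are hypothetical.

```dart
// Hypothetical class used only to show that subscripting is a method call.
class Grid {
  final List<int> _cells = [10, 20, 30];

  // Writing grid[index] invokes this specially named method.
  int operator [](int index) => _cells[index];
}

void main() {
  final grid = Grid();
  print(grid[1]); // 20
}
```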

When you're designing a syntax, one of the key goals is to minimize the amount of new things a user has to learn before they can be productive. There are two main approaches:

Minimize the number of concepts in the language by making things simple, orthogonal, and composable.
Take advantage of things the user already learned by following an existing language they already know.

One of the primary challenges of language design is that these two points are often in conflict because the languages many users already know are themselves not simple, orthogonal, and composable. Dart's syntax is based on JS and Java, which are in turn based on C++, which is based on C. There's a lot of weird historical baggage and irregularity in there. But it's baggage millions of users have already internalized so Dart is easier for them to learn if we adopt it.

Also, we're shackled to our own history. It's very hard for us to make breaking changes to the language. So new features sometimes don't slot in as gracefully as we'd like because we don't have the luxury of pushing around existing features to make room for them.

The end result is that syntax design is often a matter of making trade-offs. In this case, using ?.[ is a little special and irregular. It's another thing a user has to learn. But, it's probably better than them having to learn that ?[ and ? [ mean two entirely different things to the language. And, if we ever allow other operators to be called with . syntax like Scala does (1.+(2)), then this irregular subscript syntax will become more regular because then it will follow the other operators.

bean5 commented 5 years ago

I agree with a lot of that. I don't think you should copy other languages just to be more learnable, unless you take the good from various languages. Just copying one defeats the purpose of having another language and just makes yet another language.

Also, you mention regularity. Do you mean this in terms of language regularity or being a "regular language" (English is not)? I never learned that a regular language can become more regular, but I suppose Dart is not 100% regular because of [ ]. So it makes sense that it can become more regular. It is a good point. I just always thought of languages as either regular or not rather than on a continuum like entropy is.

Cat-sushi commented 5 years ago

@munificent

That resolves the ambiguity by saying that if you intend {a?[e1]:2} to be a set, you must put a space after the ? and if you want it to be a map, you cannot. But this also means that whitespace would be meaningful everywhere ?[ is used, even in unambiguous cases.

I don't think it's a problem for most people.

That in turn makes the language more error-prone for users and code generators that are producing Dart without wanting to be careful about whitespace.

I don't think so, for three reasons.

{a ? [e1] : [e2]} is a very rare case.
a?[e1]:[e2] outside of a Set literal is marked as an error by the analyzer.
We usually put a space just after the ? of a ternary operator, and the current version of the formatter always does.

A very common programming style in Dart is to just pound out a bunch of code without regard to whitespace, and then run dartfmt and let it sort it out.

Yes, I agree with that.

At the same time, if we wanted complete regularity, we'd probably eliminate [ ] entirely. It's just another syntax for calling a specially-named method. For example, Scala and Smalltalk don't have any special syntax for subscripting.

Isn't that an extreme argument? [ ] has been hugely popular from C through Dart; in the latter it's just syntactic sugar for a method invocation.

There's a lot of weird historical baggage and irregularity in there. But it's baggage millions of users have already internalized so Dart is easier for them to learn if we adopt it.

The end result is that syntax design is often a matter of making trade-offs.

I understand this is the reason why this issue and the article were posted, and I would like to vote for ?[.

dart-lang / language

Should null-aware subscripting use `?[` or `?.[` syntax? #376