Proposal: Disallow apostrophes in names

rtfeldman commented 9 years ago

Proposed Change

Using ' in a variable or function name becomes a syntax error.

(Shockingly, this would be a breaking change.)

Motivation

A common convention in Elm, presumably inherited from Haskell, is that if you want an alternate version of something called foo, a quick way to avoid coming up with a new name is to call it foo'.

Arguments for removing it:

1. It confuses newbies, because ' is not allowed in most languages' variable or function names, so they assume it is a sigil with some special meaning they do not know.

Elm newbies have asked me variations of this StackOverflow question on several occasions. More specifically, every Elm newbie I can recall who saw elm-html's type' attribute has asked me what the ' meant.

If they don't have someone like me to ask, they will sidetrack their introduction to Elm in order to go off and Google an un-Google-friendly term in hopes of getting unstuck.

2. Being about two pixels in size, it is much easier to miss than other single-character suffixes such as _ or 2 (e.g. foo_ or foo2), while being no more concise.

One could argue "if that's a problem for you, don't use it," but in reality we all have to interact with others' code. The fact that it is allowed (and de facto encouraged by convention) means it still impacts those of us who choose not to write it ourselves.

3. The migration from ' to _ (for example) requires little more than running the following regexp: s/([\s^]\s*)([^'\s]+)'/\1\2_/g (and you can run this again to change '' to __ etc), so it's not like it would be terribly time-consuming to convert.
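For readers who want to try the substitution, here is a minimal sketch of one pass of that regexp, assuming the elm/regex package (Regex.fromString, Regex.replace). The module name PrimeMigration and the function names primeToUnderscore and rewriteMatch are made up for illustration; this is not a vetted migration tool.

```elm
module PrimeMigration exposing (primeToUnderscore)

import Regex


{-| One pass of s/([\s^]\s*)([^'\s]+)'/\1\2_/g over some source text.
Each pass rewrites one apostrophe per name, so foo'' needs two passes.
-}
primeToUnderscore : String -> String
primeToUnderscore source =
    case Regex.fromString "([\\s^]\\s*)([^'\\s]+)'" of
        Just pattern ->
            -- Regex.replace rewrites every match, like the /g flag.
            Regex.replace pattern rewriteMatch source

        Nothing ->
            -- The pattern failed to compile; leave the source untouched.
            source


{-| Rebuild a match as \1\2_ (leading whitespace, name, underscore). -}
rewriteMatch : Regex.Match -> String
rewriteMatch match =
    case match.submatches of
        [ Just leading, Just name ] ->
            leading ++ name ++ "_"

        _ ->
            -- Unexpected capture shape; keep the matched text as it was.
            match.match
```

Two caveats follow from the pattern itself. Inside a character class ^ is a literal caret rather than a start-of-input anchor, so a name at the very start of the text with no whitespace before it is skipped. The pattern also knows nothing about strings or comments, so a word like don't preceded by whitespace is rewritten as well. Reviewing the resulting diff by hand seems prudent.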

texastoland commented 9 years ago

Et tu, Brute? :japanese_goblin:

rtfeldman commented 9 years ago

Hey, I've used it in the past too, but I'm coming around to the idea that it's more bad than good.

Curious what other people think, so I figured I'd write it up and find out. :smiley:

Apanatshka commented 9 years ago

:+1: Totally agree. (For a change! I don't mean to be against all your proposals ;P )

rtfeldman commented 9 years ago

@Apanatshka Honestly I 100% appreciate all your comments, for or against. I seek the best outcomes, and you consistently contribute well-reasoned points! :smiley_cat:

texastoland commented 9 years ago

I'm vaguely against this. I agree that they're hard to differentiate at a glance but:

- It's only been mentioned on the mailing list once, to my knowledge.
- ' carries semantic meaning I learned in elementary math (i.e. not category theory). That's actually how I use it: to convey the new value of something resulting from a transformation.
- Superscript draws a stronger visual correlation to the name it's related to.
- I personally hate naming things. It's one of my least favorite things about programming. as, bs, xss, and primes are something I welcomed when exposed to Haskell. If there's an alternative, I'd prefer it match the regex /\W$/.
- A trailing _ had the curious effect of not looking distinct from the next function argument, so the two blend into a single word, e.g. f_ acc.

That said if it's a common pain point reading code or scaring away our target demographic then it may be sensible. Here's the most ridiculous example from real code I wrote last night:

collect : (a -> b -> b) -> b -> List' a -> List' b
collect f acc xs = let
  f' x (y, ys) = let
    y' = f x y
    in (y', y' `append` ys)
  acc' = (acc, (acc `append` empty))
  in reduceLeft f' acc' xs
    |> snd
    |> reverse

rtfeldman commented 9 years ago

Worth noting that TodoMVC has used _ for this purpose going as far back as 2014.

mgold commented 9 years ago

I'm also "vaguely against this", but I'm willing to be persuaded. I mostly agree with Texas's points, and will confirm that I'm comfortable using x' to mean "like x, but slightly different or transformed".

I think a very key use of the prime is with Random.seed, since you're always working with at least two seeds and maybe many more. In fact, the docs use the prime convention. Elm's let does not have Scheme's let* semantics, meaning that you can't rebind names twice in the same let. (I think this is probably a good thing, since it reduces ambiguity and can catch bugs.) So you need some way of distinguishing between all those seeds.

Compare:

if foo'' == foo'''
if foo__ == foo___

It's much easier to count the number of primes than it is to measure the length of the combined underscore.

rtfeldman commented 9 years ago

What's wrong with foo, foo2, foo3 as a substitute? (It's even fewer characters!)

mgold commented 9 years ago

That's true...

evancz commented 9 years ago

Code examples!

What does the Random library look like with this change? Do you have code examples from your codebase that'd change based on this?

One counterargument to this is that "people are gonna find a way to write bad code". I played with the idea of a "syntax tax" where I made certain things ugly to discourage their use. Specifically, import List exposing (..) used to be import open List, which meant things no longer aligned nicely if you had an import that was open. What ended up happening was that people wrote it anyway and Elm was just uglier.

What I'm getting at is, if people are instead writing x1 and x2 or x_ and x__, are we actually in a better world? Perhaps the right thing is to have a culture that says "name your variables for real!"

I think code examples will help us figure this out!

rtfeldman commented 9 years ago

> What I'm getting at is, if people are instead writing x1 and x2 or x_ and x__, are we actually in a better world?

Yes, in that newbies from the JS world know that x1 and x2 and x_ and x__ are just variable names, whereas they often confuse x' and x'' for an unfamiliar special language feature of some sort.

(Which is totally reasonable; in both Ruby and CoffeeScript, for example, foo is a variable, but foo: and @foo do completely different things, which is to say nothing of C's &foo and *foo, Perl's $foo and %foo, various Lisps' 'foo - which has an apostrophe, but does not even involve a variable - CoffeeScript's foo?, or various languages where :foo is a Symbol. Sigils are a sane thing to watch out for when learning a new language!)

rtfeldman commented 9 years ago

Some elm-test code rewritten without ': (before), (after)

Random's example rewritten without ': (before), (after)

Some Dreamwriter code rewritten without ': (before), (after)
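
The before/after links above did not survive in this copy of the thread. As a rough stand-in, here is a minimal sketch of seed threading written without ', assuming the current elm/random API (Random.step, Random.int), which differs from the 2015 API under discussion. The names SeedNaming, threeRolls, die, and seed0 through seed3 are made up for illustration and are not taken from the linked code.

```elm
module SeedNaming exposing (threeRolls)

import Random


{-| Roll a six-sided die three times, threading the seed by hand.

Under the prime convention the intermediate seeds would be named
seed', seed'' and seed'''. Here they are numbered instead.

-}
threeRolls : Random.Seed -> ( List Int, Random.Seed )
threeRolls seed0 =
    let
        die =
            Random.int 1 6

        -- Each step returns a value plus the seed to use next.
        ( a, seed1 ) =
            Random.step die seed0

        ( b, seed2 ) =
            Random.step die seed1

        ( c, seed3 ) =
            Random.step die seed2
    in
    ( [ a, b, c ], seed3 )
```

Calling threeRolls (Random.initialSeed 42) would return three rolls and the seed to pass along to the next caller. Numbered names keep each step distinguishable at a glance, though they say nothing about what each seed is for, which is the trade-off debated in this thread.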

mgold commented 9 years ago

I think the only time this might be a pain is if you're dealing with geometry and x1 and x2 are conceptually distinct, not variations on the same value.

At this point I'm probably slightly in favor, although if you have more examples of confused newbies than that one SO post, I'd love to see 'em.

rtfeldman commented 9 years ago

Yeah they've mostly been in person on this one. For what it's worth, you could always do like x2_ or newX2 in the coordinate cases.

bbugh commented 9 years ago

Strong :+1:. I think Elm will benefit a lot from removing edge case oddities in the syntax, which is one of the biggest sticking points for people when learning.

I think it's important to measure how readable code is from the perspective of a junior developer, not the CTO. Any development team leader has to make decisions to adopt languages based on three things: 1) does it solve my problem better than the other options? 2) can I find people who know it? 3) can I easily train people who don't? All three are important to answer with a resounding "yes" if Elm wants to be broadly adopted, which would be great imho. :sunglasses:

I think Elm will be easier for people to learn if it has consistent, unsurprising syntax. I would also suggest that Elm standardizes around contextually meaningful names (positionSeed, completedItems) over generic terse Haskell-like ones (seed', items_).

rtfeldman commented 9 years ago

@bbugh Thanks for sharing! I'm curious whether you've had a similar experience to mine of seeing newcomers mistaking ' for a special language feature.

mgold commented 9 years ago

Now that you bring it up, I remember some R code (an incomplete program provided by the prof) in college that had a variable name with a dot. I assumed it was a language feature but apparently it was just part of the variable. I only remember this story because my non-CS housemate TA'd the class a year later, and to help her understand the variable, I found my copy of the assignment, in which I had renamed the variable to something sensible. (I urged her to change the code provided to students but she said she couldn't.)

Annnnnyway, weird characters in variable names can be confusing. Much as I dislike losing a nice feature for advanced users, this is the way to go. :+1:

kmarekspartz commented 9 years ago

While . is just part of a variable in R, $ is not! I think we can agree not to look at R as an example of programming language syntax to replicate.

texastoland commented 9 years ago

The reasoning given would be the same for disallowing _ though. Naming is hard and important for API and examples. There needs to be some convention for internal implementation though. Difficult naming has discouraged me from breaking down complex/complicated/convoluted expressions in other languages. I.e. it's trading one problem for another.

evancz commented 9 years ago

I wrote up some guidelines on having productive discussions on GitHub here, with some specific notes about design discussions and, in particular, about syntax.

Besides contributing a coherent set of examples of weird characters used in variable names in a broad collection of languages, I don't think there's much more to say here. This is blocked on me making the final decision, so I think it makes sense to focus on other stuff for now.

rtfeldman commented 9 years ago

:+1: Seems like a good policy!

evancz commented 9 years ago

I'm closing this repo down. This idea is interesting. I will remember it. It is not time to consider changes like this right now, so I will bring it up when I want feedback on it.

jamonholmgren commented 7 years ago

I don't mean to necro this thread, but it's worth noting that I googled elm single quote after variable name and this came up as the second result (after a string interpolation proposal, which shows this can be hard to google). I indeed thought it was some sort of special sigil when I saw it.

mgold commented 7 years ago

For those who find the thread from Google: apostrophes are no longer allowed in variable names in 0.18. You can use model_ if you need to, but we're trying to guide you to newModel or another better name.

zzz6519003 commented 7 years ago

why would one use ' in a variable :beer:

andys8 commented 5 years ago

I would prefer if ticks were still allowed

davidmason commented 4 years ago

My main complaint is it makes the pirate's code less piratey. https://gist.github.com/davidmason/336b17dfd91faee18147

(@zzz6519003 this is the strongest use-case I can come up with)

elm-lang / elm-plans

Proposal: Disallow apostrophes in names #4
