scala / scala-dev

Scala 2 team issues. Not for user-facing bugs or directly actionable user-facing improvements. For build/test/infra and for longer-term planning and idea tracking. Our bug tracker is at https://github.com/scala/bug/issues

Drop Symbol literals #459

Closed odersky closed 4 years ago

odersky commented 6 years ago

In the spirit of dropping things, what do people think of dropping symbol literals? Saving one character between 'myName and "myName" in some rare cases seems not worth the syntactic complexity.

Sciss commented 6 years ago

I think that would be great; I never liked the syntax, especially as the ticks are also used for characters 'x', so it always looks like an unclosed character literal.

densh commented 6 years ago

If symbol literals are dropped, the whole scala.Symbol data type should probably be dropped as well. The only value of symbols is their marginally shorter syntax.

dwijnand commented 6 years ago

Perhaps they can be dropped when http://docs.scala-lang.org/sips/42.type.html lands?

ritschwumm commented 6 years ago

drop them, please. i never needed them in years, and was annoyed more than once by some editor's syntax highlighting becoming confused.

odersky commented 6 years ago

> If symbol literals are dropped, the whole scala.Symbol data type should probably be dropped as well. The only value of symbols is their marginally shorter syntax.

Yes, but for the time being we'd need Symbol as a rewrite target. But it could be deprecated.

SethTisue commented 6 years ago

Yes please, let's drop this!

milessabin commented 6 years ago

I'd be happy to trade them for Byte and Short literals.

sjrd commented 6 years ago

We really need to start thinking about what we gain for all the code out there that we are going to break. To me it seems that in this case we gain only a bit of code in the parser, and I don't think this justifies all the breakage.

Sciss commented 6 years ago

Do we have a statistic about Symbol usage - because I bet that very very few people actually use it. And I thought the idea was anyway to have a rewriting tool. I think the gain is a significant simplification of the language, dropping a totally unnecessary feature; unless you embody symbols with some new power that distinguishes them from strings. To me it's like the proposal (on discourse) to introduce the let keyword - you end up with let versus val with no real gain except having to think twice what to use.

milessabin commented 6 years ago

shapeless's LabelledGeneric uses Symbol and an encoding of their singleton types, but with the benefit of hindsight that was a terrible mistake ... I should just have used String.

In the meantime, I think that first-class syntax should go hand in hand with first class singleton types, so we should have both (which is proposed in SIP-23) or neither (proposed in this issue). I think either of those options is preferable to the status quo.

olafurpg commented 6 years ago

> Do we have a statistic about Symbol usage - because I bet that very very few people actually use it.

Off the top of my head, Play Framework forms use symbol literals quite a bit; see the examples in https://www.playframework.com/documentation/2.6.x/ScalaForms. Ammonite ops is another big user (http://ammonite.io/#grep). I can crunch numbers on my corpus, but I suspect symbol literals are more frequently used than you claim.

> And I thought the idea was anyway to have a rewriting tool.

Play forms symbol literals are likely to appear in external DSL files like Twirl templates, which makes automatic migration trickier. The same applies to Ammonite scripts.

odersky commented 6 years ago

> We really need to start thinking about what we gain for all the code out there that we are going to break. To me it seems that in this case we gain only a bit of code in the parser, and I don't think this justifies all the breakage.

It's actually much more than the parser, since symbol literals are constants, which makes them special in several respects. One aspect is SIP 23, as mentioned by @milessabin. I think it presents a certain awkwardness that all the essential infrastructure for strings has to be duplicated for symbols.

But I believe that's not the most significant cost either. That would be the necessity for us to keep teaching the concept and for programmers the risk of being puzzled when they see it.

smarter commented 6 years ago

To be able to deprecate Symbol we'll need to change @deprecatedName to take a String instead of a Symbol. Unfortunately I don't see how we can do this without breaking existing usages, since annotations cannot have multiple constructors. Does anyone see a migration path here?

sjrd commented 6 years ago

Annotations can have multiple constructors, e.g., https://github.com/scala-js/scala-js/blob/master/library/src/main/scala/scala/scalajs/js/annotation/JSExport.scala
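To make the point about multiple constructors concrete, here is a minimal sketch, assuming Scala 2.13. The annotation name `renamedFrom` is invented for this illustration and is not part of the standard library; the sketch does not claim to show how @deprecatedName itself was or would be changed.

```scala
import scala.annotation.StaticAnnotation

// Hypothetical annotation, invented for this sketch (not a standard library API).
// The primary constructor takes a String ...
class renamedFrom(val name: String) extends StaticAnnotation {
  // ... and an auxiliary constructor still accepts a Symbol,
  // so existing Symbol-based call sites would keep compiling.
  def this(sym: Symbol) = this(sym.name)
}

object RenamedFromDemo {
  @renamedFrom("oldSize") def size: Int = 0            // String form
  @renamedFrom(Symbol("oldCount")) def count: Int = 0  // Symbol form
}
```

Both constructors end up storing the same `name`, which is the property a String/Symbol migration path would rely on.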

kenbot commented 6 years ago

Do it! They sort of make sense as a poor man's enum in Lisps/dynlangs, but don't really have a purpose in Scala other than tickling the imaginations of irresponsible DSL authors.

janekdb commented 6 years ago

I'd miss this when writing ScalaTest assertions where @bvenners has been responsible in his DSL imaginings! I love this style of assertion,

book should have (
  'title ("Programming in Scala"),
  'author (List("Odersky", "Spoon", "Venners")),
  'pubYear (2008)
)

http://www.scalatest.org/user_guide/using_matchers

Sciss commented 6 years ago

@janekdb you could easily use "title" ("Programming in Scala"). Or actually type-safe:

book should have (_.title === "Programming in Scala", _.author === List(...))

etc. I don't see this as a strong case for having to have symbol literals. It just confirms Ken's comment, it's tickling the imaginations of DSL authors...

janekdb commented 6 years ago

Yes, "title" could be used but I would not welcome that. _.title has the disadvantage that the field needs to exist before the test is run, but we digress...

som-snytt commented 6 years ago

By the law of conservation of syntax, now we can have name' identifiers.

def f(x: X) = { val x' = x.next() ; g(x) }

where x prime actually shadows its x zero and warns if it shadows nothing.

Also that implicit val _ = ??? syntax folks crave could be expressed

implicit val _': X = ???
implicit val _': Y = ???

bvenners commented 6 years ago

ScalaTest Matchers has always offered both a type safe way and a dynamic way to check properties with have:

book should have ('title ("Moby Dick")) // dynamic: uses reflection
book should have (title ("Moby Dick"))  // type safe: only works on Books; no reflection used

To take the type safe route you need to write HavePropertyMatchers:

http://doc.scalatest.org/3.0.1/#org.scalatest.matchers.HavePropertyMatcher

Thus the tradeoff was to get the type safety, you need to write more code. In production code I would want the type safety, but in tests I felt it was reasonable to let users take the dynamic route, because if they screw up the symbol name, the enclosing test would promptly fail.

It was a bit of a social experiment to see which way users went, and although I never did an actual scientific survey, my observation has been that by far most of the time users just use the tick mark and don't bother with writing HavePropertyMatchers.

So I suspect dropping Symbol literals would break a lot of existing ScalaTest user code.

I would suggest you not drop the Symbol type, because it does offer something that String does not. It checks at compile time that the String is a valid Scala identifier. I'd suggest if you drop Symbol literals to add a standard sym or ident String interpolator that ensures at compile time the String is a valid Scala identifier. Then the rewrite would be from:

book should have (
  'title ("Programming in Scala"),
  'author (List("Odersky", "Spoon", "Venners")),
  'pubYear (2008)
)

to:

book should have (
  sym"title" ("Programming in Scala"),
  sym"author" (List("Odersky", "Spoon", "Venners")),
  sym"pubYear" (2008)
)

Not quite as pretty, but easy to rewrite automatically. sym"title" would need to produce a Symbol to keep this syntax working, not just a String, because String has an apply method whereas Symbol does not. And this syntax uses apply:

'title ("Programming in Scala")

is

('title).apply("Programming in Scala")

nafg commented 6 years ago

Then you'd have to rewrite the apply call too. Why not just use a Map[String, Any], or a varargs of (String, Any)? I think -> is slightly more idiomatic than using apply this way anyway.

milessabin commented 6 years ago

@bvenners

> It checks at compile time that the String is a valid Scala identifier

Except that it doesn't ... that was the mistake that I also made when I used Symbols to represent labels in shapeless's LabelledGeneric ... many backticked identifiers can't be expressed as literals, eg.,

val `foo bar` = 23

Of course, you can construct such Symbols (new Symbol("foo bar")) but in that case all the syntactic convenience has been lost and you might as well just use strings.
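A small sketch of the mismatch described above, assuming Scala 2.13: a backticked identifier can contain a space, and a Symbol with the same name can only be built by passing a String. The sketch goes through the `Symbol(...)` factory on the companion object; the object and value names are chosen purely for illustration.

```scala
object BacktickDemo {
  // A legal Scala identifier that no symbol literal can spell.
  val `foo bar` = 23

  // The matching Symbol has to be built from a String,
  // so the literal syntax buys nothing here.
  val label: Symbol = Symbol("foo bar")

  def main(args: Array[String]): Unit =
    println(label.name)
}
```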

mblaze commented 6 years ago

I have to protest. One could create an apply extension method for a Symbol. The same cannot be done for a String since it already has an apply method. This would be very unfortunate for DSLs.

With symbol literals you could build a Prolog like DSL with typed logic variables 'X[Int] and Prolog functors 'fun('X[Int], 'Y[String]). This is simply not possible with String for the reason mentioned above.

So the statement from @odersky that the only gain from 'abcd compared to "abcd" is just one character is an oversimplification. If Symbol literals are to go then StringOps.apply should be removed too.
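The difference can be sketched as follows, assuming Scala 2.13. `Functor` and `SymbolDsl` are hypothetical names made up for this example, and `Symbol("fun")` is written out instead of the literal so the snippet does not depend on the syntax under discussion.

```scala
object SymbolApplyDemo {
  // Hypothetical DSL node, introduced only for this illustration.
  final case class Functor(name: String, args: List[Any])

  // Symbol has no apply method of its own, so an extension can add one.
  implicit class SymbolDsl(s: Symbol) {
    def apply(args: Any*): Functor = Functor(s.name, args.toList)
  }

  val fun: Symbol = Symbol("fun")

  // Resolves to SymbolDsl(fun).apply(1, "two").
  val term: Functor = fun(1, "two")

  // On a String, apply already means "character at index",
  // so the same extension would never be picked up.
  val ch: Char = "fun"(1)
}
```

Here `term` is a `Functor`, while `"fun"(1)` simply indexes into the string.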

Ichoran commented 6 years ago

@matiki1231 - String interpolators don't have to return strings. It's not true that 'foo can be replaced by "foo" (one character more), but it could be replaced by y"foo" (two characters more). Whether that is worth it, as opposed to sym"foo" perhaps, depends on how often symbols are needed.
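As a rough illustration of an interpolator that returns a Symbol rather than a String, here is a minimal sketch assuming Scala 2.13. The names `SymInterpolator`, `SymHelper` and `isPlainIdentifier` are hypothetical. The check runs at run time and only approximates Scala's identifier rules (it ignores operator identifiers, for example); the compile-time check suggested earlier in the thread would need a macro, which is not shown.

```scala
object SymInterpolator {
  implicit class SymHelper(sc: StringContext) {
    // sym"title" builds a Symbol, not a String.
    def sym(args: Any*): Symbol = {
      val name = sc.s(args: _*)
      // Runtime-only approximation of "is a plain alphanumeric identifier".
      require(isPlainIdentifier(name), s"not a plain identifier: $name")
      Symbol(name)
    }
  }

  def isPlainIdentifier(s: String): Boolean =
    s.nonEmpty &&
      Character.isJavaIdentifierStart(s.head) &&
      s.tail.forall(c => Character.isJavaIdentifierPart(c))
}
```

With `import SymInterpolator._` in scope, `sym"title"` yields a Symbol named `title`, while `sym"foo bar"` fails the `require` check with an IllegalArgumentException.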

ritschwumm commented 6 years ago

@matiki1231 or y.foo using scala.Dynamic - with one additional character

Ichoran commented 6 years ago

@ritschwumm - Using a common variable name like y is a really bad idea. It's okay with string interpolators because it has to be followed by ". Just having y in scope everywhere--with any method on it valid!--sounds like a recipe for error to me.

You could use scala.Dynamic, but you'd want a more discriminable name (e.g. Sym.foo seems safe enough to me).

mblaze commented 6 years ago

@Ichoran I know about interpolators, but this becomes really ugly. From almost an identifier 'foo we get to a mess like v"foo".

@ritschwumm I'm not so fond of scala.Dynamic as a feature of the language. It can get confusing pretty quickly. Still requires an ugly prefix which you need to give a sensible name.

After all we create DSLs to better describe our domain specific problems. We strive to eliminate syntactical overhead and in many ways Scala is designed to help us achieve that. After all you could write DSLs in Java. The only problem is that the resulting description would be 50% Java specific overhead.

I must still insist that 'fun('X[Int], 'Y[String]) or something alike would get unreadable with the solutions you proposed.

What is more, with symbol literals we do not risk any name clashes as is the case with v.foo or v"foo".

@Ichoran the longer the name of the scala.Dynamic instance, the more noise is introduced.

milessabin commented 6 years ago

> I must still insist that 'fun('X[Int], 'Y[String]) or something alike would get unreadable with the solutions you proposed.

Agreed, but there are other approaches possible than using Symbols here (eg. higher order abstract syntax). Wouldn't it be even better to be able to write, fun(X[Int], Y[String])?

mblaze commented 6 years ago

@milessabin Oh it would! Please bestow on me the knowledge of arcane programming language techniques that allow for such a beauty!

EDIT: Reading about it right now. Not exactly sure how such a beast of a feature could be implemented in Scala.

milessabin commented 6 years ago

Google Scholar is quite helpful here. The first hit is a paper from 2008 with @adriaanm as one of the authors.

I think that the ideas described there should work a lot more smoothly with current Scala than they did with Scala circa 2008, and it'd be great if Dotty^wScala 3 made them smoother still.

mblaze commented 6 years ago

Wouldn't IDE support be difficult?

milessabin commented 6 years ago

> Wouldn't IDE support be difficult?

I don't see why ... the principal idea of HOAS is to reuse the host language's name binding mechanisms. So if your IDE understands Scala name binding correctly then there shouldn't be a problem.

adriaanm commented 6 years ago

Tracking the work to deprecate in 2.13 here: https://github.com/scala/scala-dev/issues/574

Please keep the discussion / voting here. I note the migration burden on ScalaTest users, but it would really simplify things for cross-building of Scala 2 / 3 using TASTY if we could drop them (see Martin's comments above).

SethTisue commented 6 years ago

> Yes please, let's drop this!

I second my past self on this. Even taking the comments since into account.

Yes, there exist some fairly niche use cases where having symbol literals makes something a little nicer. That isn't sufficient justification to bake something into a language's lexical syntax; it doesn't carry its weight.

As for breaking some existing code,

1) symbol literals aren't pervasively used in the ecosystem; they're a relatively obscure corner of the language even after a decade plus
2) for the people who have been using them, the inconvenience level in this case is minor; it's a little search-and-replace or a little Scalafixing
3) deprecating in 2.13 gives people plenty of time to transition before removal in 2.14

And as always with removals, a little pain for existing/past users, it's real, but it needs to be weighed against making the language easier to learn (and parse and make tooling for ...) for every future Scala user in the language's whole long future life.

ashawley commented 6 years ago

It sounds like the change necessary to drop symbol literals in the compiler isn't involved. Could an experimental PR be created that removes the symbol literal syntax in 2.13, but with no intention to merge? It could

  1. show what the new symbol syntax will be, and
  2. be a good discussion point for the change, but also
  3. be a good way to study what deprecation will need to be like.

It might be a lot of changes to the test suite, but I'd argue those would be the interesting bits to learn from. FWIW, the code in the parser/scanner probably isn't changed often, so the branch could probably be easily resuscitated for 2.14.

Removing things is often easier said than done, so it's probably beneficial to experiment with this proposed change rather than just discuss and vote.

SethTisue commented 6 years ago

> Could an experimental PR be created that removes the symbol literal syntax in 2.13, but with no intention to merge?

and/or, the removal could be done, but only under -Xsource:2.14. that form would be mergeable, and then downstream projects would be able to experiment with the effects of the removal, too.

I appreciate the desire for caution here, but my gut feeling is that the experiment you propose isn't really necessary for two reasons:

1) as a community, we already have a decade-plus of experience with symbol literals
2) to the extent an experiment is needed here, the deprecation itself can be the experiment. if there is a big outcry and we decide we were wrong, the deprecation in 2.13.0-RC1 could be reversed in 2.13.0 final or even in a subsequent 2.13.x release

SethTisue commented 6 years ago

ah, I see that @eed3si9n is way ahead of me, his PR (scala/scala#7395) already does the removal under -Xsource:2.14

ashawley commented 6 years ago

> the removal could be done, but only under -Xsource:2.14.

If that's possible, then yes that would also be beneficial for studying the consequences downstream.

> the deprecation itself can be the experiment.

Perhaps, but I imagine that the change for deprecation is often different from actually removing, so the experiment of actual removal may be more telling.

joshlemer commented 6 years ago

I thought I'd just mention some big projects that use Symbols in their public api, just to show that while they aren't pervasive, people do find uses for them:

    pathEnd {
      (put | parameter('method ! "put")) {
        // form extraction from multipart or www-url-encoded forms
        formFields(('email, 'total.as[Money])).as(Order) { order =>
          complete {
            // complete with serialized Future result
            (myDbActor ? Update(order)).mapTo[TransactionResult]
          }
        }
      } ~
      get {
        // debugging helper
        logRequest("GET-ORDER") {
          // use in-scope marshaller to create completer function
          completeWith(instanceOf[Order]) { completer =>
            // custom
            processOrderRequest(orderId, completer)
          }
        }
      }
    } ~
    path("items") {
      get {
        // parameters to case class extraction
        parameters(('size.as[Int], 'color ?, 'dangerous ? "no"))
          .as(OrderItem) { orderItem =>
            // ... route using case class instance created from
            // required and optional query parameters
          }
      }
    }

SethTisue commented 5 years ago

@joshlemer fair, thanks for documenting that. there's also utest:

object HelloTests extends TestSuite{
  val tests = Tests{
    'test1 - {
      throw new Exception("test1")
    }
    ...

ashawley commented 5 years ago

Support for string parameter in deprecatedName was made in 2.13.0-M5: https://github.com/scala/scala/pull/6916

It probably needs to be ported to Scala.js.

lihaoyi commented 5 years ago

I use Symbol literals pretty heavily in uTest (as above) and OS-Lib (to construct filesystem paths wd / 'segment / 'nested). Both of those use cases can take strings, and while it would be a lot of code to change if we get rid of symbol literals, it would just be a regex replace and we're done.

+1 for me for killing the literals
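A deliberately naive sketch of what such a regex replace might look like, assuming Scala 2.13 and assuming the call sites accept plain Strings. The object name and the pattern are illustrative only. The pattern skips character literals such as 'x', but it knows nothing about string literals or comments, so an apostrophe inside a string (as in "don't stop") would be rewritten incorrectly; a syntax-aware tool such as Scalafix is the safer route for real code.

```scala
import scala.util.matching.Regex

object NaiveSymbolRewrite {
  // A tick, then an alphanumeric identifier, not followed by another
  // identifier character or a closing tick (which would mean a char literal).
  private val SymbolLiteral: Regex = """'([A-Za-z_]\w*)(?![\w'])""".r

  // Rewrites 'name to "name"; everything else is left untouched.
  def rewrite(source: String): String =
    SymbolLiteral.replaceAllIn(
      source,
      m => Regex.quoteReplacement("\"" + m.group(1) + "\"")
    )
}
```

For example, `NaiveSymbolRewrite.rewrite("wd / 'segment / 'nested")` turns both path segments into string literals and leaves `val c = 'x'` unchanged.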

jeffrey-aguilera commented 5 years ago

Other than "I don't like/don't use", I haven't seen one valid justification for removing symbol literals. Do they make the grammar ambiguous? Do they slow down compilation? If you don't use this feature, then continue to not use it ... but don't force that opinion of utility on others. This is just a terrible idea.

godenji commented 5 years ago

Have been using them extensively in Play web apps for years -- fortunately most of the affected code is generated.

I guess if there's significant gain wrt simplifying the language then it's worth it, although I haven't seen anything in this thread that indicates the language feature is that burdensome.

At any rate, looks like the ship has sailed: https://github.com/lampepfl/dotty/pull/5681

som-snytt commented 5 years ago

Removing a feature is easier done than said. My attempt to summarize the long thread:

Symbol literals don't pull their weight because they are a thing to learn, but aren't really a thing. They're just interned Strings, so not a concept. Literal types are more interesting than comparing strings with eq. Also, they're broken because they're lacking a quote. 'hello world.

People like to try to make Strings using single quotes. But 'abc' is a symbol literal followed by an unterminated char literal. That's unfriendly.

They do succeed as lightweight syntax. Replacing them is an area of research. Fortunately, EPFL is a research institution.

Partisans of symbol literal syntax should not despair. The XMLers rallied to keep angle bracket syntax. Scala Days 2019 is an ideal opportunity to organize some sit-ins and occupy EPFL. That would be a symbolic gesture.

SethTisue commented 5 years ago

here in Scala 2 land, https://github.com/scala/scala/pull/7495 is merged and https://github.com/scala/scala/pull/7395 will be merged soon

apologies to those who are displeased by this decision, and thanks to everyone who contributed to the discussion.

SethTisue commented 5 years ago

it's not clear to me if there was a consensus on deprecating scala.Symbol itself – although nobody actually objected (except about the technical difficulty around @deprecatedName), the discussion was mostly focused on the literal-syntax aspect.

if we do intend to deprecate scala.Symbol, but it doesn't happen for 2.13, then it could be annoying to library authors to do two rounds of changes, one to switch from 'foo to sym"foo" and then another, later, to ditch Symbol entirely. otoh, time left for further changes in 2.13 is short. I'm okay with delaying any decision until 2.14, but if someone wants to PR the deprecation for 2.13, maybe that PR would have a chance.

jeffrey-aguilera commented 5 years ago

If the tick syntax is no longer supported, why didn't this go through the normal deprecation policy?

SethTisue commented 5 years ago

> If the tick syntax is no longer supported, why didn't this go through the normal deprecation policy?

the tick syntax will still work in 2.13, it'll just be deprecated (see https://github.com/scala/scala/pull/7395). the actual removal will come in 2.14

nafg commented 5 years ago

I'm usually more in favor of more incremental changes than "if we're going to make changes let's get them all in" (like dotty seems to be doing...)
