scala / slip

obsolete — archival use only

What should go into the Scala Standard Library - discussion #31

Closed dickwall closed 7 years ago

dickwall commented 8 years ago

PLEASE READ Martin's brand new proposal first before commenting: https://docs.google.com/document/d/1XBUnQ4kM4QtxIXGerP1fLbVBcVrlx0YYpybeQpmVY0w/edit#heading=h.b78f5xct55cm

Comments are invited on the scala/slip gitter channel or in comments here. Please be professional and respectful.

dickwall commented 8 years ago

Possible application of package aliases to try experimental implementations?

geggo98 commented 8 years ago

My suggestion: Make it slim. Put in the absolute core, and then coursier (a dependency fetcher in pure Scala, similar to Ivy) and a good REPL (e.g. Ammonite). The user experience will be fine, because users can work in the REPL and let the tool fetch dependencies as they need them (e.g. by entering @load("org.scala-lang" % "scala-reflect" % "2.10.3")).
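The @load command above is hypothetical, but for comparison the Ammonite REPL already supports something close to it through its magic imports. A sketch of a session, using Ammonite's actual import $ivy syntax:

```scala
// In an Ammonite session: fetch an artifact from Maven Central and add it
// to the running REPL's classpath. A single colon means plain Java
// coordinates; a double colon would append the Scala binary version.
import $ivy.`org.scala-lang:scala-reflect:2.10.3`

// The dependency is usable immediately in the same session:
import scala.reflect.runtime.universe._
```

This only works inside Ammonite, not in plain scalac or the stock Scala REPL.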

My reasoning: With Project Jigsaw, Oracle is trying to slim down the Java VM and reduce startup times. This will hopefully come with Java 9 (e.g. with JEP 220). I think Scala should follow. Ideally we would get a very slim distribution containing the Java VM plus the Scala core. It could be so fast that the user never notices any delay, while staying frugal with memory. This won't work with a fat Scala distribution containing everything the user could ever need.

Additionally, with such an approach it would be easy to leave fast-evolving parts out of the standard library and inside the ecosystem. The Scala core could then evolve at a glacial speed, similar to Debian stable, and users could just use the newest hot stuff from the ecosystem.

Being fast, stable and having a vital ecosystem could bring Scala to new niches, currently filled by languages like Python or Go. Scala could even be used as a kind of scripting language, e.g. for DevOps automation.

Of course this all depends on Project Jigsaw being successful.

martijnhoekstra commented 8 years ago

Looking at

"I would take the pre-installed python package, or if it's not there install it"

A major problem here is that the Scala ecosystem has no easy way to do that. There are multiple ways to address it.

One of the ways is to create an easy way to do that. This is possibly/probably hard. Python has no proper way for that either AFAIK. You have to muck about with PIP (which ironically is a terrible pain to install on Windows), and when you want to have python3 side by side with python 2 everything(tm) breaks in my experience.

Another possible solution to that is to give the REPL a "command mode" where you can load the libraries (for example the @load option of geggo98 above).

A third option is to put everything in the standard library to avoid the issue.

A fourth option is to concede that to write something that isn't entirely trivial, you need to learn to leverage a build tool like SBT (somewhat analogous to the PIP/virtualenv situation in Python).

All of those have downsides and upsides. That a large standard library solves this problem doesn't mean it's the only or best way to solve it, and without considering the other possible solutions it isn't much of an argument for a large standard library.
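For the fourth option, the barrier in question is small in absolute terms: pulling in a library through sbt is a one-line edit to the build. A minimal sketch (the dependency chosen here is just an example):

```scala
// build.sbt -- a minimal build definition.
scalaVersion := "2.11.8"

// The %% operator appends the Scala binary version to the artifact
// name (so this resolves scala-xml_2.11 under the hood).
libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.0.5"
```

The hurdle is less the line itself than knowing that this file, this operator, and the resolution machinery behind it exist.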

SethTisue commented 8 years ago

@InTheNow did some thinking on this previously at https://github.com/scala/slip/pull/25

brianwawok commented 8 years ago

@martijnhoekstra Python multiple environments is dealt with via virtualenv https://virtualenv.readthedocs.org/en/latest/

It's not an internal-to-Python thing, but an external wrapper that mucks with the interpreter's search paths.

BrianLondon commented 8 years ago

@load("org.scala-lang" % "scala-reflect" % "2.10.3")

As much as I like the idea of selective core module fetching/loading, everything in the platform should be of a unified version. The Python experience of easy download/install of libraries is great for simple projects (and exists in Scala too, with SBT fulfilling the function of pip among its other roles). Those pip-installed libraries can become a nasty web of dependencies pretty easily on a large project, necessitating tools like venv to provide a completely separate environment for each project you're developing. Even then, venv doesn't solve the problem of conflicting dependencies.

Ichoran commented 8 years ago

@geggo98 - The proposal is to make the core library increasingly slim, but also to have a well-defined "platform" that provides a more comprehensive set of tools. If you want slim, depend on the core. Are you suggesting anything different?

@martijnhoekstra - It's not clear to me that "command mode" as described by geggo is enough easier than SBT to be worth anything. You have lengthy and cryptic additional commands that require you to know exactly what you want but don't help you find it. So it seems like three options: (1) do something that is difficult and we're not sure we can do; (2) make a Scala Platform library in addition to the core; (3) require a newer user to have more expertise before getting off the ground to do anything interesting. This seems to argue pretty clearly that (2) is the way to go, no?

martijnhoekstra commented 8 years ago

@Ichoran If I understand this proposal correctly, when you install "the scala platform" you get scalac and the scala REPL on your path, and the scala core libs and scala platform libs on your classpath without having to fiddle with them, under the assumption that this is the most friendly way to get started and on your way with Scala (without disturbing the non-newcomers).

I don't think that's true. At some point almost everybody will transition to building with sbt, which is currently the de facto standard build tool. I think that point comes sooner rather than later, before a significant part of the platform libraries gets used. Even with a larger platform library, that knowledge will be required so soon regardless that not having to learn it up front is a non-argument to me.

I see much more value in a "Scala platform" that includes sbt, plus something like a "default template" that includes the dependency on a Scala platform meta-package, and preferably the default resolvers as well. The advantage I see there is that it makes the "magic" explicit. You run the thing, you get the "magic" of batteries-included, and you also have an immediate auto-generated example of how to include a library dependency and resolver in sbt. This could be as simple as a shell script that copies the default template to the working directory and launches sbt console.
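A sketch of what such a default template's build.sbt might contain. The "scala-platform" meta-package name and version are assumptions here, not published artifacts:

```scala
// build.sbt of the hypothetical default template.
scalaVersion := "2.11.8"

// The batteries-included platform meta-package: one explicit, visible
// line that a newcomer can copy, change, or remove. The coordinates
// below are an assumption for illustration only.
libraryDependencies += "org.scala-lang" %% "scala-platform" % "2016.04"

// Spelling out a resolver makes adding another repository discoverable.
resolvers += Resolver.sonatypeRepo("releases")
```

The point is that the newcomer sees, rather than inherits invisibly, the mechanism by which dependencies arrive on the classpath.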

This will put people immediately on a path where the next step is discoverable. A difference between "batteries included" and "batteries pre-installed" if you want to go with that analogy. You get the batteries in both cases, but in one of them you are shown (automatically) how to put them in, and in the other, they are already in, but you have far less of an idea how to replace them if needed.

Having a broad standard platform seems like a great idea to me. Automatically adding it to the classpath without making it discoverable how to add other dependencies that are not in the platform less so.

Ichoran commented 8 years ago

@martijnhoekstra - That's just a generalization of the idea of a platform--it's a platform factory that just happens to come with one platform already installed. That's nice too, but it is more work to generalize, and I don't see any argument for not shipping a platform distribution while waiting for a platform factory (built on SBT, I guess). If we get that too, even better: no reason not to migrate to it immediately if it is actually as easy to use as the REPL.

Anything that doesn't give you an extended library available after typing one thing on the command-line is needlessly complicating matters, though, and requiring you to run with a pre-set directory structure is just out. It's fine for development of a project, but not for script-style usage.

godenji commented 8 years ago

Please, lean and mean. There's a lot of emphasis on what those new to the language will expect out of the box; by all means we should definitely cater to that audience: map, flatMap, fold, zip, etc. without having to pull in an external library.

However, once we get beyond "core" library data types and functions, then the standard lib starts to balloon, and we wind up with the present day standard lib. Perhaps a small core with a set of blessed high performance external libraries would complete the picture.

Certainly smaller is better on the mobile applications front (read: Scala.js), where we can't get away with supporting present day standard lib -- not without pain at any rate (i.e. large generated binaries).

Ichoran commented 8 years ago

@godenji - That is pretty much what the proposal calls for? I can't tell whether you are agreeing with it, arguing for something different, or expressing an independent opinion that happens to pretty much align with the proposal?

godenji commented 8 years ago

@Ichoran sorry, thought this discussion was a continuation of Martin's original call for proposals (i.e. the one where things devolved into the no-standard-library vs. stay-the-course-trim-where-we-can camps).

Though now that I review @dickwall 's initial comment, we're meant to read something for context, doh ;-)

EDIT

Awesome, looks sensible.

geggo98 commented 8 years ago

@Ichoran Slim is great! I was a little bit concerned that, with the recent push to put a JSON standard into the libraries (see SLIP Bug #28), Scala would get a little bit out of shape. But if you want to slim it down, that's perfectly fine with me.

Regarding the "@load" command and the REPL: sbt is quite hard for new users (it can even be hard for experienced users). When a user has to download sbt, create an sbt file, add some magic to it and then start the sbt tool, we will probably lose that user. For Java, it is planned to integrate a REPL in Java 9 (JEP 222). Many other languages come with a usable REPL (irb for Ruby, ghci for Haskell, clojure.main for Clojure). So it would be nice if a new user could just launch a bundled Scala shell and start exploring the environment. And from my point of view it would be logical if loading modules were just a command in that shell.

From my point of view, a stripped down Ammonite REPL would be sufficient for that, probably with an additional module that then provides the full shell functionality.

SethTisue commented 8 years ago

now that I review @dickwall 's initial comment, we're meant to read something for context, doh ;-)

I just edited Dick's comment to make that harder to miss.

Ichoran commented 8 years ago

@geggo98 - Martin's proposal is, as I understand it, to have a "scala core" that is slimmed down, but to ship a "scala platform" which contains those things that one needs for a good and flexible out-of-the-box experience. The JSON AST and parser would go in there. If you didn't want it, you could use SBT to depend on scala-core (or whatever) instead of scala-platform (or whatever).

shawjef3 commented 8 years ago

@Ichoran +1

There's no reason that a batteries-included distribution and a slimmer core distribution can't coexist.

martijnhoekstra commented 8 years ago

Is this indeed the proposal? Where you can put libraryDependencies += scala-core, and you won't get the rest of the platform?

Ichoran commented 8 years ago

@martijnhoekstra - I'm not sure what the "scala core" would be if you couldn't depend on it in some fashion. The distinction would be meaningless. The proposal doesn't spell it out in so many words, but I can't see any other reasonable interpretation.
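Under that reading, the distinction would show up directly in the build. A sketch, with the caveat that the artifact names and versions are hypothetical; the proposal does not fix them:

```scala
// build.sbt -- coordinates below are assumptions, for illustration only.
// Depend on just the slimmed-down core:
libraryDependencies += "org.scala-lang" % "scala-core" % "1.0.0"

// ...or swap in the batteries-included platform instead:
// libraryDependencies += "org.scala-lang" % "scala-platform" % "1.0.0"
```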

erwan commented 8 years ago

@geggo98 slim is nice, but the problem when you're working on a lib is that you can conflict with your user's version.

Example: I'm writing a lib where I need json parsing. There is no standard json lib, so I pick one - let's say Play-Json. Now if I use the version from Play 2.3, people using my lib in Play 2.4 and Play 2.2 will get a conflict with my json dependency. Same goes when you want to depend on Akka on a lib (like ReactiveMongo does), you need to pick a version and risk a conflict with your users' own dependencies.

As long as we don't make this problem worse by having even more widely used libraries, I'm all for a slim Scala core.

mdedetrich commented 8 years ago

I don't think many people disagree that the actual core should be minimal. It means maintaining the distribution of Scala/scalac is easier, and it also means that we can reduce the runtime footprint, which matters.

The real question is how we are going to deal with the platform. Although a comparison with Haskell is made, last I checked the Haskell Platform was pretty much just a stable snapshot of Haskell plus a set of preinstalled packages through Cabal. You would still have to declare your dependencies in SBT/Cabal (which means the example of using JSON without putting it as a dependency in SBT isn't going to work).

Prepackaging the distribution as a giant jar makes it hard to modularize the runtime, and I think there is a good argument for trying to reduce Scala's runtime: it makes it much easier to target different platforms.

I like the idea of incorporating https://github.com/lihaoyi/Ammonite into the SBT REPL, as that provides a really easy way to download dependencies in the REPL (which is the starting point for a lot of people). That, plus having an official list of Scala's dependencies (i.e. ones that are namespaced under scala/typesafe). It's actually not that hard to add something like

"org.scala" %% "json" % "1.0.0"

either via SBT or directly in the REPL.

Ichoran commented 8 years ago

I'd flip this around and ask: when would you ever not want a fat jar unless you're using a build system like SBT? This argues to me that the standard distribution could just be a single fat jar, and all the subcomponents would be published and you could depend on them instead if you were using SBT.

mdedetrich commented 8 years ago

If you are doing stuff like targeting Android and you want a minimal runtime?

Look at it this way: Clojure has a ginormous jar as its runtime (mainly because it's a full-blown LISP), and because of this Android development is not ideal: your applications take a long time to load (since you ship a huge jar).

I mean, assuming that most people use SBT these days, you could just include the platform in a default libraryDependencies, which you can then tweak if you want to remove parts of it.

I guess my point is that I don't think very many people use scalac on the command line when doing standard Scala development. It's either done by an IDE or in an SBT session, so we should take advantage of that to the fullest extent possible.

geggo98 commented 8 years ago

@erwan The problem with conflicting versions can be handled with two different approaches. The first is to throw technology at it (the OSGi approach). The Java VM is perfectly capable of running the same class file in two different versions, as long as they are loaded from two different class loaders. This feature is used in the OSGi framework, e.g. when loading two Eclipse plugins with conflicting dependencies. This is great from a technological standpoint, but adds more bloat. See for example Apache Felix for what a "minimal" implementation looks like.

The second approach is building a platform, like Stackage does for Haskell. With the idea of the "@load" command, this becomes quite easy: just create a well-known meta-package that only brings dependencies with it, e.g. @load("org.scala-lang", "scala-platform", "2016-04_LTS"). With a good platform that has the most useful libraries in it, developers have a good and reliable base for avoiding version conflicts. And if something doesn't work out, it is quite easy to fork the platform and build your own. After all, it is just a jar file with a text file listing dependencies in its manifest directory.

This will probably be fine for most developers. It's different for enterprises. You can't just load semi-random stuff from the internet and run it in production when you manage the phone connections for a whole country, an assembly line for cars, or retirement funds worth more than the gross national product of most countries. There, the OSGi approach with additional digital signatures would be needed. But here I smell a market opportunity for Typesafe (or whatever they are called now). Enterprises usually pay very well for a stable platform with commercial support.

scalasolist commented 8 years ago

@mdedetrich Speaking of Android, there is another problem besides the size of a single application: the total size of all Scala applications. Imagine that the average user installs a few dozen Scala applications. That's the compelling dream world where Scala wins over ugly Java. In that case the user would be forced to download and store dozens of identical Scala runtimes. Some complaints about growing Android application size: https://www.quora.com/What-is-a-reasonable-size-for-iOS-and-Android-apps

Google sidesteps the issue by preinstalling its own platform once for all Android applications. The Scala platform needs a similar solution: a local service that centrally manages all dependencies for installed Scala applications (retrieving them from a repository, checking signatures, storing and updating them), plus a special classloader that obtains bytecode from the service.

SethTisue commented 8 years ago

I'm uncomfortable with this passage in @odersky's proposal:

"class modules at the beginning of their lifetime are in incubator status. They are typically modules that we would like to have in the Scala platform but that have not yet reached the level of maturity and stability to be released under the same schedule as the rest. We put these already in the scala namespace so that users do not need to edit their imports once the module becomes a stable member of scala"

The issue of needing to edit imports is not, in my view, of sufficient importance to justify this.

I think access to the Scala namespace should be via the SIP/SLIP process and shouldn't just happen because of vague future expectations about how something might eventually evolve.

Ichoran commented 8 years ago

@SethTisue - I interpreted this as meaning "we are close to 100% committed to making this work and go into the library, we just want to change it on a faster timescale", not "oh, maybe this'll be included someday, I dunno, wouldn't that be cool?"

In the former case, keeping it out of the namespace creates meaningless busywork for people; it just adds import-editing to the list of other edits we might require of them as part of the faster release process.

soc commented 8 years ago

The issue of needing to edit imports is not, in my view, of sufficient importance to justify this.

+1. Additionally, the specific naming choices (what's the package under scala. called?) are at this time not in the control of the core team/community. So we could end up with stuff we can't/shouldn't change, despite it having a bad name, because "it's already in scala". E.g. having scala.async despite scala.concurrent existing.

Commenting on the proposal as a whole, I think the only distribution we should have is the 1MB SBT download. People need SBT sooner or later anyway, and the confusion about "why does SBT load Scala again? I thought I already downloaded it manually????!" is just not worth it.

That's more or less what will be adopted for get-scala.org.

SethTisue commented 7 years ago

This discussion can continue at https://contributors.scala-lang.org as part of the new Scala Platform Process.