Closed SethTisue closed 7 years ago
I would really like for fastparse to be part of the standard distribution (even if in a module and not the standard library). It's the kind of tool--often needed, hard to get right yourself, saves a ton of time if it's there--that is really a win for a standard library. I ended up creating my own parsing library focused on extreme performance (comparable to hand-rolled in many cases). Even though it's a bit fiddly to use, I use it all over the place. I'd use fastparse instead probably 80% of the time if there was no extra dependency to pull in and a more-certain maintenance future.
a significant past discussion on this (started by @soc in November 2014): https://groups.google.com/d/msg/scala-internals/4N-uK5YOtKI/teWdFXR290oJ
The standard distribution should not include a parser combinator library.
Could have both the old one and a new one in parallel for while.
Currently parser-combinators are already a separate module in the core library, also they are primarily community maintained.
I've written up a section on Debugging Parsers in the FastParse documentation. If you actually care about parser combinators you should check it out, because this is one facet where FastParse is probably best-in-industry AFAICT.
I'd prefer dropping scala-parser-combinators, and writing some migration document pointing to fastparse. (Pretty much similar to what happened with scala-actors vs. akka.)
@soc :+1:
I'd prefer dropping scala-parser-combinators, and writing some migration document pointing to fastparse. (Pretty much similar to what happened with scala-actors vs. akka.)
I'd happy to give someone from The Powers That Be™ collaborator access to FastParse, if there's someone I trust actually cares about things.
So far, I've seen zero interest: no questions, no issues raised, no discussions on gitter, no PRs sent. There are people using it, but as far as I can tell nobody involved in this discussion has even downloaded the thing at all or read any documentation (not that there's any shortage of it).
It's probably a bit premature to talk about how we can FastParse the Official™ parser-combinator library =D
Once I finish wading through writing least squares fitting routines I'll need to parse stuff, and I'll use fastparse. I will probably complain bitterly that ~! is nothing like ! and thus is poorly chosen notation. Besides, it's a cut. Don't you want scissors notation? :<
?
I will probably complain bitterly that ~! is nothing like ! and thus is poorly chosen notation.
I just copied that from scala-parser-combinators! I had no active decision in picking it, I also haven't really seen Cut used much on the internet (despite how important I think they are) and I can't find any other examples of it used except in scala-parser-combinators and MacroPEG (which is again my own). After I managed to convince @sirthias to put cuts into Parboiled2, he chose ~!~
which IMHO is not much different.
As far as the name blame game goes, it seems to go to 2008 @adriaanm
(github doesn't do recursive blame AFAIK, so sorry @adriaanm if you are being wrongly blamed)
I don't think scissors have enough cuttiness in them; perhaps a sword would be more appropriate
p1 =||====> p2 | p3 =||====> p4
I still think the notation could stand re-examination. I am not sure I want to sever my parsing so severely as the sword suggests. I guess ~!~
is okay. It's pretty easy to type, at least. I'd have just ~~
'ed if it were me. Maybe not obvious enough? I'll play with it a bit once I start parsing things again.
Here's my talk at SF Scala about FastParse
I've been using FastParse for most of the day, which is the longest contiguous block of time and most intensively I've used it. I can confirm my initial hunch that it's way better than parser combinators.
I hate the ~!
notation for cuts. It's too hard to glance at code and know what's a !
and what's a !p
and what's a ~!
. Worse yet, you can typo ~ !"foo"
as ~! "foo"
and get no error while exactly flipping what your parsing is supposed to be. Argh!
Other than that, everything's delightful. (Well. Pretty much. I'd make a dozen or so changes to suit my preferences, but none of those are a big deal.)
Chiming in: I've been using FastParse in production for a couple of months now (eg, for this MySQL parser) and would say that, while I suspect that we'll find ways to enhance it over time, it's already a good deal more pleasant to use than parser combinators -- generally easier to use, a bit easier to debug (although I'm pondering how we might improve that), and bat-out-of-hell fast.
That said, I don't actually feel any crying need for it to be in the standard library. This is one of those areas where letting the proverbial thousand flowers bloom is useful. Putting a pointer to it from somewhere in the core documentation, sure, so it's easy to find, but I don't think it requires a particularly privileged position.
In general, I like the way the Scala.js community has handled this sort of thing: useful libraries get linked from the homepage, but they aren't especially privileged. The community gets to figure out what works best, and the best rises to the top through GitHub stars, links from tutorials, and like that.
It might be enough to add a paragraph to the scala-parser-combinators README, and perhaps to relevant locations on the Scala website and/or Scala doc website, that basically says:
book authors (e.g. of the Odersky, Venners & Spoon book) could make their own decisions about what to say about this, and which library to present, when they publish their next editions, and/or in their online errata
additionally, we could consider unbundling scala-parser-combinators from 2.12 or 2.13 (presumably unbundling scala-swing as well). I didn't participate in the discussions at the time they were modularized, but I assume that the primary reason they are still bundled is that they are used in the Odersky, Venners & Spoon book.
unbundling wouldn't affect sbt users (who already must explicitly depend on Scala modules such as scala-xml and scala-parser-combinators), but would affect users using command-line scala
and scalac
Yeah, unbundling parser-combinators seems appropriate at this point -- it doesn't seem to be getting much love, and I don't think it really needs the privileged position either. (Mind, I'm still using it on both the JVM and JS sides. But my new code is all FastParse, and I'm eventually going to rewrite the remaining parser-combinator parser that way.)
@ichoran syntax is trivial; you could even change it yourself in your code without touching the library, though I won't claim the status quo is perfect. Semantic versioning says during 0.2.x I can replace the library with a turnip and still be compliant
@SethTisue I'd be happy with that outcome.
Does anyone feel like taking point on this issue to keep it moving forward?
@SethTisue I am completely in favor of unbundling parser-combinators as a mean to say "scala-parser-combinators is not the preferred parsing library in scala".
But there is another reason why it is bundled with scala: it is used by scaladoc. The json parser is used to generate the index of packages and classes for the scaladoc left pane. And since there are no standard json library (wink wink nudge nudge...), there isn't much of a choice here...
And the standard distribution doesn't differentiate between artifacts that are "just dependencies" of stuff and those that are "officially part of the beginner experience" (like akka-actors were): both end up on the REPL classpath (actually, everything ends on the classpath: the bash runner indiscriminately adds all jars it can find; I tried to tackled that once but gave up in the end...).
The second part of the modularization effort would solve the problem I think:
@dickwall - I'll take point on this. At the same time, I'm planning to propose that we adopt jawn-parser for the standard JSON parser and use it to generate a standard parse into json4s-ast. That should cover @gourlaysama's point about scaladoc needing parsing.
@gourlaysama thanks for the reminder about Scaladoc's dependency on util.parsing
. I had indeed forgotten.
@Ichoran but it appears to me that Scaladoc isn't parsing JSON, just generating it. see https://github.com/scala/scala/blob/2.12.x/src/scaladoc/scala/tools/nsc/doc/html/page/IndexScript.scala . so unless I'm missing something, cutting the dependency looks quite easy
@SethTisue - That makes it easier to put off doing anything with jawn-parser. (Not sure that is a good thing?)
@Ichoran well even if we had jawn-parser as a standard module, we would still want to avoid depending on it from scaladoc, since any Scala dependency that scaladoc has complicates bootstrapping
@SethTisue - What does Scaladoc have to do with bootstrapping? It doesn't generate anything required during execution of the compiler, does it? Seems like you just do whatever you do, rebuilding whatever else you need to rebuild, and then run Scaladoc at the end. If you mean, "but the standard module breaking would break the build", well, yeah? Isn't that what we'd want anyway if the standard module broke? Fix that first. Then Scaladoc is up again.
Scaladoc is not directly involved during bootstrap, as defined by and strap.done
(all you need is lib/reflect/comp), but its dependency list should shrink -- not grow. It doesn't look like it needs to parse JSON -- it's just generating it (as well as html, which it uses scala-xml for). Pragmatically, I think we should considering using string interpolation to cut the cord in both instances.
In a larger sense, scaladoc is of course part of the bootstrap (i.e., release) process.
thanks @adriaanm for clearing up my poor wording.
(this stuff about the dependency of scalac/scaladoc on scala-xml and scala-parser-combinators is now https://issues.scala-lang.org/browse/SI-9560, let's have any further discussion there since it's somewhat tangential to the main purpose of this SLIP)
I think someone else would have to take the lead on getting FastParse "in", whatever that means. I'm happy for @Ichoran to do it, or if anyone else wants to pick up this cup. It feels to me like if FastParse isn't good enough that someone non-me is excited about it, it shouldn't go in the standard anyway.
So I'll just make FastParse the best I can and you guys can decide whether it's worth your time to do something about it. I'll certainly co-operate if something is asked of me, but I'm pretty happy to not be the one driving things =)
The scaladoc -> parser-combinators dependency is on its way out: https://github.com/scala/scala/pull/4851
On Wed, Nov 18, 2015 at 8:27 AM, Li Haoyi notifications@github.com wrote:
I think someone else would have to take the lead on getting FastParse "in", whatever that means. I'm happy for @Ichoran https://github.com/Ichoran to do it, or if anyone else wants to pick up this cup. It feels to me like if FastParse isn't good enough that someone non-me is excited about it, it shouldn't go in the standard anyway.
So I'll just make FastParse the best I can and you guys can decide whether it's worth your time to do something about it. I'll certainly co-operate if something is asked of me, but I'm pretty happy to not be the one driving things =)
I completely understand, and don't want to convince you to change your mind. But you see, that's the general problem we are facing. We do not have people in the wings to pick up a great prototype and stabilize it and maintain it. If something should improve it has to come from the community. So, if someone wants to pick this up and move it forward it that would be great! If not, things will simply stay as they are.
—
Reply to this email directly or view it on GitHub https://github.com/scala/slip/issues/24#issuecomment-157631042.
Martin Odersky EPFL and Typesafe
I will pick it up (while wearing my "member of the community" hat).
This discussion could be revived under the new Scala Platform Process (http://www.scala-lang.org/blog/2016/11/28/spp.html).
the following exchange between @odersky and @lihaoyi occurred on scala-debate recently:
some further discussion on Gitter: