rorygraves / scalac_perf

The Scala programming language
http://www.scala-lang.org/

parallel IO and parse for source #16

Open mkeskells opened 7 years ago

mkeskells commented 7 years ago

Currently the source files are read one by one, then split into lines, and then parsed. We should be able to do some or all of that in parallel, and use NIO to help.
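As a rough illustration of the idea (not the scalac change itself), here is a minimal Java sketch that reads and line-splits a list of source files in parallel with NIO. The class name `ParallelSourceReader` and its methods are hypothetical, UTF-8 is an assumed encoding, and the compiler's own parse step is not shown; it would slot in as a further `map` stage.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only; not part of scalac.
final class ParallelSourceReader {

    /** Reads one file with NIO and splits it into lines (UTF-8 is an assumption). */
    static List<String> readLines(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Reads all files on the common fork-join pool.
     * The result list is in the same order as the input list.
     */
    static List<List<String>> readAll(List<Path> files) {
        return files.parallelStream()
                .map(ParallelSourceReader::readLines) // a parse stage could follow here
                .collect(Collectors.toList());
    }
}
```

Whether this helps in practice depends on the disk, the OS cache, and how much of the time is IO rather than parsing, so it would need measuring.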

mkeskells commented 6 years ago

Initial work is in https://github.com/scala/scala/compare/2.12.x...rorygraves:mike/2.12.x_source-reader?expand=1 - it is slightly faster with NIO.

I had a dig around SBT. After PRing a delete fix (https://github.com/sbt/io/pull/133), Jason found the read/scan operations:

https://github.com/sbt/io/blob/77d3ba2fd47aa24be7cff3666223253024dea7ff/io/src/main/scala/sbt/io/Path.scala#L363-L370
https://github.com/sbt/io/blob/77d3ba2fd47aa24be7cff3666223253024dea7ff/io/src/main/scala/sbt/io/Path.scala#L326-L327
https://github.com/sbt/io/blob/77d3ba2fd47aa24be7cff3666223253024dea7ff/io/src/main/scala/sbt/io/Path.scala#L432-L446

These are all perfect candidates for NIO file walkers. Currently it seems to walk twice, once to find includes and again to find excludes, and then it subtracts the two sets. This should be fused into a single walk.
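A fused walk could look roughly like the sketch below, which applies the include and exclude filters to each file during one traversal instead of building two sets and subtracting them. `FusedWalk.find` and its glob parameters are hypothetical names for illustration; this is not the sbt/io API, and sbt's own filter types are replaced here by `java.nio.file.PathMatcher` globs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not the sbt/io API.
final class FusedWalk {

    /** Walks root once and keeps regular files that match include and do not match exclude. */
    static List<Path> find(Path root, String includeGlob, String excludeGlob) {
        PathMatcher include = FileSystems.getDefault().getPathMatcher("glob:" + includeGlob);
        PathMatcher exclude = FileSystems.getDefault().getPathMatcher("glob:" + excludeGlob);
        List<Path> result = new ArrayList<>();
        try {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    // Globs are matched against the path relative to root.
                    Path rel = root.relativize(file);
                    if (attrs.isRegularFile() && include.matches(rel) && !exclude.matches(rel)) {
                        result.add(file);
                    }
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

For example, `FusedWalk.find(root, "**.scala", "generated/**")` would return every `.scala` file under `root` except those below `root/generated`. The order of the results follows the traversal order, which is not specified.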

Not sure what gradle and zinc use to scan the directory, but they should use file walkers where possible rather than java IO. Java IO makes a stat call that hits the OS for each operation (length, etc.), whereas the NIO file walker uses 1 stat call per directory!
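To make the difference concrete, the sketch below takes the size and modification time from the `BasicFileAttributes` that the walker already passes to `visitFile`, rather than calling `File.length()` or `File.lastModified()` per file afterwards. `SnapshotWalk` and `Stat` are hypothetical names, and how many OS calls this actually saves is platform dependent, so treat it as an illustration rather than a measurement.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class SnapshotWalk {

    /** Size and modification time captured during the walk. */
    static final class Stat {
        final long size;
        final long modifiedMillis;

        Stat(long size, long modifiedMillis) {
            this.size = size;
            this.modifiedMillis = modifiedMillis;
        }
    }

    /** Walks root once and records the attributes handed to the visitor. */
    static Map<Path, Stat> snapshot(Path root) {
        Map<Path, Stat> out = new HashMap<>();
        try {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    // attrs was read by the walker; no separate length()/lastModified() call.
                    out.put(file, new Stat(attrs.size(), attrs.lastModifiedTime().toMillis()));
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```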

There should be some improvement possible in zinc and gradle if the IO can start before the compile (it's a snapshot file), but the cost is small, so the benefit is small. The bigger advantage would be in gradle/zinc itself (happy to claim that too, though).

https://github.com/sbt/io/issues/37 also seems related, but has not been started.

Other pre-typer phases should be able to run in parallel with some rework. This probably needs a POC and a round trip with Jason/Lucas as we work out the benefit.