scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org
232 stars 21 forks source link

Slow, exponential compile times for HLists > 200 elements #8478

Open scabug opened 10 years ago

scabug commented 10 years ago

This is a blocker for longer HLists of sizes of a few hundred elements, which is not uncommon for database table column numbers, e.g. http://stackoverflow.com/questions/22842486/eclipse-scala-ide-slow-and-crashes-caused-by-slick-generated-hcon-hlist

Hopefully there is room for speedup in Scalac for this.

I measured the following compile times on my machine:

Clearly the compile times are exponential in the length of the HList.

Reproduce code:

trait Tables {
  type Column[T]
  type HList
  type HNil <: HList
  type HCons[+Value, +Tail <: HList] <: HList
  type :: [+Value, +Tail <: HList] = HCons[Value, Tail]

  trait ShapeLevel
  object ShapeLevel{
    trait Nested extends ShapeLevel
    trait Flat extends Nested
    trait Columns extends Flat
  }

  abstract class Shape[Level <: ShapeLevel, -Mixed_, Unpacked_, Packed_]
  abstract class HListShape[Level <: ShapeLevel, M <: HList, U <: HList, P <: HList] extends Shape[Level, M, U, P]

  implicit def primitiveShape[Level <: ShapeLevel]: Shape[Level, String, String, Column[String]]
  implicit def hnilShape[Level <: ShapeLevel]: HListShape[Level, HNil, HNil, HNil]
  implicit def hconsShape[Level <: ShapeLevel, L1 <: Level, L2 <: Level, M1, M2 <: HList, U1, U2 <: HList, P1, P2 <: HList](implicit s1: Shape[L1, M1, U1, P1], s2: HListShape[L2, M2, U2, P2]) : HListShape[Level, M1 :: M2, U1 :: U2, P1 :: P2]

  type HList10[T <: HList] = HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,T]]]]]]]]]]
  type HList100[T <: HList] = HList10[HList10[HList10[HList10[HList10[HList10[HList10[HList10[HList10[HList10[T]]]]]]]]]]

  type JobRow = HList10[HList10[HList10[HList10[HList10[HList10[HList100[HList100[HNil]]]]]]]] // size 260

  implicitly[Shape[_ <: ShapeLevel.Flat, JobRow, JobRow, _]] 
}
scabug commented 10 years ago

Imported From: https://issues.scala-lang.org/browse/SI-8478?orig=1 Reporter: @cvogt Affected Versions: 2.10.4

scabug commented 10 years ago

@retronym said: Scala version?

scabug commented 10 years ago

@retronym said: Looks like the problem is in the detection of implicit divergence detection.

Applying this patch to master:

~/code/scala git diff
diff --git a/src/compiler/scala/tools/nsc/typechecker/Implicits.scala b/src/compiler/scala/tools/nsc/typechecker/Implicits.scala
index d87090f..d1d6384 100644
--- a/src/compiler/scala/tools/nsc/typechecker/Implicits.scala
+++ b/src/compiler/scala/tools/nsc/typechecker/Implicits.scala
@@ -386,38 +386,7 @@ trait Implicits {
      *  Two types overlap if they have the same type symbol, or
      *  if one or both are intersection types with a pair of overlapping parent types.
      */
-    private def dominates(dtor: Type, dted: Type): Boolean = {
-      def core(tp: Type): Type = tp.dealiasWiden match {
-        case RefinedType(parents, defs)         => intersectionType(parents map core, tp.typeSymbol.owner)
-        case AnnotatedType(annots, tp)          => core(tp)
-        case ExistentialType(tparams, result)   => core(result).subst(tparams, tparams map (t => core(t.info.bounds.hi)))
-        case PolyType(tparams, result)          => core(result).subst(tparams, tparams map (t => core(t.info.bounds.hi)))
-        case _                                  => tp
-      }
-      def stripped(tp: Type): Type = {
-        // `t.typeSymbol` returns the symbol of the normalized type. If that normalized type
-        // is a `PolyType`, the symbol of the result type is collected. This is precisely
-        // what we require for SI-5318.
-        val syms = for (t <- tp; if t.typeSymbol.isTypeParameter) yield t.typeSymbol
-        deriveTypeWithWildcards(syms.distinct)(tp)
-      }
-      def complexity(tp: Type): Int = tp.dealias match {
-        case NoPrefix                => 0
-        case SingleType(pre, sym)    => if (sym.hasPackageFlag) 0 else complexity(tp.dealiasWiden)
-        case ThisType(sym)           => if (sym.hasPackageFlag) 0 else 1
-        case TypeRef(pre, sym, args) => complexity(pre) + (args map complexity).sum + 1
-        case RefinedType(parents, _) => (parents map complexity).sum + 1
-        case _                       => 1
-      }
-      def overlaps(tp1: Type, tp2: Type): Boolean = (tp1, tp2) match {
-        case (RefinedType(parents, _), _) => parents exists (overlaps(_, tp2))
-        case (_, RefinedType(parents, _)) => parents exists (overlaps(tp1, _))
-        case _                            => tp1.typeSymbol == tp2.typeSymbol
-      }
-      val dtor1 = stripped(core(dtor))
-      val dted1 = stripped(core(dted))
-      overlaps(dtor1, dted1) && (dtor1 =:= dted1 || complexity(dtor1) > complexity(dted1))
-    }
+    private def dominates(dtor: Type, dted: Type): Boolean = false

     /** The expected type with all undetermined type parameters replaced with wildcards. */
     def approximate(tp: Type) = deriveTypeWithWildcards(undetParams)(tp)

Takes compilation down for me from 1:18 to 0:16.

This suggests one workaround: you could make the hlist implicit a macro, which disables divergence checking.

Another possibility, which I'll try out soon, is to add an "fast track" implicit for HLists of, say, size 8, and arrange for this to be prioritized over the hlist of size 1 implicit.

scabug commented 10 years ago

@cvogt said: 2.10.4

scabug commented 10 years ago

@cvogt said: Is there going to be a 2.10.5?

scabug commented 10 years ago

@retronym said: Yep, one is planned. But I'm not sure of a way to fix this in the compiler.

scabug commented 10 years ago

@cvogt said: How does an implicit macro disable divergence checking? Does this happen automatically and if so why?

scabug commented 10 years ago

@retronym said: https://github.com/scala/scala/commit/8168f118c9

scabug commented 10 years ago

@retronym said: Yes, all implicit macros are excluded from divergence checking automatically.

scabug commented 10 years ago

@retronym said: Indeed flattening out the implicit search also drastically helps with performance (down to 00:06!):

trait Tables {
  type Column[T]
  type HList
  type HNil <: HList
  type HCons[+Value, +Tail <: HList] <: HList
  type :: [+Value, +Tail <: HList] = HCons[Value, Tail]

  trait ShapeLevel
  object ShapeLevel{
    trait Nested extends ShapeLevel
    trait Flat extends Nested
    trait Columns extends Flat
  }

  abstract class Shape[Level <: ShapeLevel, -Mixed_, Unpacked_, Packed_]
  abstract class HListShape[Level <: ShapeLevel, M <: HList, U <: HList, P <: HList] extends Shape[Level, M, U, P]

  trait LowPriorityImplicits2 {
    implicit def hnilShape[Level <: ShapeLevel]: HListShape[Level, HNil, HNil, HNil] = ???
    implicit def hconsShape[Level <: ShapeLevel, L1 <: Level, L2 <: Level, M1, M2 <: HList, U1, U2 <: HList, P1, P2 <: HList](implicit s1: Shape[L1, M1, U1, P1], s2: HListShape[L2, M2, U2, P2]) : HListShape[Level, M1 :: M2, U1 :: U2, P1 :: P2] = ???
  }
  trait LowPriorityImplicits1 extends LowPriorityImplicits2 {
    // This implicit is redundant (hconsShape would be sufficient) but it serves to
    // limit the depth of implicit search through `hconsShape`, which has
    // exponential performance (albeit at a low constant factor) shows up with hlists
    // in the order of 100 elements. See SI-8478.
    implicit def hconsShape8[Level <: ShapeLevel, 
         L1 <: Level, L2 <: Level, L3 <: Level, L4 <: Level, L5 <: Level, L6 <: Level, L7 <: Level, L8 <: Level, 
         M1, M2, M3, M4, M5, M6, M7, M8 <: HList, 
         U1, U2, U3, U4, U5, U6, U7, U8 <: HList, 
         P1, P2, P3, P4, P5, P6, P7, P8 <: HList](
          implicit 
          s1: Shape[L1, M1, U1, P1], 
          s2: Shape[L2, M2, U2, P2], 
          s3: Shape[L3, M3, U3, P3], 
          s4: Shape[L4, M4, U4, P4], 
          s5: Shape[L5, M5, U5, P5], 
          s6: Shape[L6, M6, U6, P6], 
          s7: Shape[L7, M7, U7, P7], 
          s8: HListShape[L8, M8, U8, P8]
     ) : HListShape[Level, M1 :: M2 :: M3 :: M4 :: M5 :: M6 :: M7 :: M8,
                           U1 :: U2 :: U3 :: U4 :: U5 :: U6 :: U7 :: U8,
                           P1 :: P2 :: P3 :: P4 :: P5 :: P6 :: P7 :: P8] = ???
  }
  object Implicits extends LowPriorityImplicits1 {
    implicit def primitiveShape[Level <: ShapeLevel]: Shape[Level, String, String, Column[String]] = ???    
  }

  import Implicits._

  type HList10[T <: HList] = HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,T]]]]]]]]]]
  type HList100[T <: HList] = HList10[HList10[HList10[HList10[HList10[HList10[HList10[HList10[HList10[HList10[T]]]]]]]]]]

  type JobRow = HList10[HList10[HList10[HList10[HList10[HList10[HList100[HList100[HNil]]]]]]]] // size 260

  implicitly[Shape[_ <: ShapeLevel.Flat, JobRow, JobRow, _]] 
}
scabug commented 10 years ago

@cvogt said: Awesome man! Added 10 and 100 versions to Slick now reaching about 10 seconds for size 500, 30 seconds for size 1000. That should be enough for our use case. https://github.com/slick/slick/pull/749 Thx

scabug commented 10 years ago

@szeiger said: I don't even want to think about what people are doing with projections of 1000 elements.

scabug commented 10 years ago

@retronym said: Here's a WIP branch that shows where we could find some more efficiency in the compiler: https://github.com/retronym/scala/tree/ticket/8478

scabug commented 10 years ago

@cvogt said (edited on Apr 7, 2014 11:08:37 PM UTC): I am trying to track down the performance problem in the following, which was introduced by the supposed fix (https://github.com/slick/slick/pull/749), which actually made it worse. I got the code down to this:

import scala.reflect.ClassTag
import scala.slick.ast.{TypedType}
import scala.slick.lifted.FlatShapeLevel
import scala.slick.lifted.Shape
import scala.slick.collection.heterogenous._
trait Tables {
  implicit def stringColumnType: TypedType[String] = ???
  type BranchRow = HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HNil.type]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]

  implicitly[Shape[_ <: FlatShapeLevel, BranchRow, BranchRow, _]]
  def x = implicitly[Shape[_ <: FlatShapeLevel, HCons[String,HNil.type], _, _]]
  val i = x
}

This takes like 20 seconds on my machine (add 6 elements to the list for 60 seconds). The odd thing is, if you remove the line with the val, it compiles in 1 second.

scabug commented 10 years ago

@retronym said: Can you please make an SBT project with that code and the right dependency on Slick? I can't compile that in my checkout:

commit 11c58a468d3041d1320c7d749dd9a29caf4564da (HEAD, origin/master, origin/HEAD, master)
Merge: 6c7d503 e2ece2f
Author: Stefan Zeiger <szeiger@novocode.com>
Date:   Wed Jan 22 07:20:55 2014 -0800

    Merge pull request #617 from slick/tmp/consistent-doc-versions

    Pass Slick version properly from sbt into Sphinx.

  ~/code/slick sbt slick/test:console
...
scala> import scala.reflect.ClassTag
import scala.reflect.ClassTag

scala> import scala.slick.ast.{TypedType}
import scala.slick.ast.TypedType

scala> import scala.slick.lifted.FlatShapeLevel
<console>:9: error: object FlatShapeLevel is not a member of package scala.slick.lifted
       import scala.slick.lifted.FlatShapeLevel
              ^

scala> import scala.slick.lifted.Shape
import scala.slick.lifted.Shape

scala> import scala.slick.collection.heterogenous._
import scala.slick.collection.heterogenous._

scala> trait Tables {
     |   implicit def stringColumnType: TypedType[String] = ???
     |   type BranchRow = HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HNil.type]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
     |
     |   implicitly[Shape[_ <: FlatShapeLevel, BranchRow, BranchRow, _]]
     |   def x = implicitly[Shape[_ <: FlatShapeLevel, HCons[String,HNil.type], _, _]]
     |   val i = x
     | }
<console>:18: error: not found: type FlatShapeLevel
         def x = implicitly[Shape[_ <: FlatShapeLevel, HCons[String,HNil.type], _, _]]
                                       ^
<console>:17: error: not found: type FlatShapeLevel
         implicitly[Shape[_ <: FlatShapeLevel, BranchRow, BranchRow, _]]
                               ^
scabug commented 10 years ago

@cvogt said: repo: git@github.com:cvogt/slick-presentation.git branch: tmp/hlistSpeed

https://github.com/cvogt/slick-presentation/tree/tmp/hlistSpeed

scabug commented 10 years ago

@retronym said (edited on Apr 22, 2014 11:16:39 AM UTC):

Jason: Just to refresh my memory: this project shows code that is fast with 2.10.4 but slow with 2.11.x

cvogt: no. Slow with 2.10.4
removing this line makes it fast: https://github.com/cvogt/slick-presentation/blob/tmp/hlistSpeed/src/main/scala/SlickPresentation.scala#L12
which is odd
that's the problem.
why does this line make it slow?

Jason: Okay, then maybe there are two problems that this exposes...
I'm bisecting the problem that it is massively slow in 2.11

It got slow after this "refactoring": https://github.com/scala/scala/commit/9c09c170998f74fba03990977b285e3121db32a6#diff-7c03397456d3b987549fd039d6b639c6L667
But I remember that line, and it was since reverted as part of SI-6966
so I need to see if it sped up again at that point...
sigh...

cvogt: thanks man :-\

Jason: The reversion, https://github.com/scala/scala/commit/58bfa19, doesn't restore performance. Maybe that line wasn't the culprit.
scabug commented 10 years ago

@retronym said: I checked out 9c09c170998f74fba03990977b285e3121db32a6 and reverted the change to typedImplicit (essentially cherry-picked 58bfa19). This did restore performance!

scabug commented 10 years ago

@retronym said: Rebased onto that change, things get slow again in https://github.com/scala/scala/commit/ea93654a6e9674efa9ab98328462d3b5f59ce91f, Paul's refactoring of variance checking. Deja vu.

scabug commented 10 years ago

@retronym said: [~cvogt] Adding the apparently unrelated val has the effect of changing the order in which the two implicit searches are executed.

More directly, this takes 23 seconds to typecheck:

import scala.reflect.ClassTag
import scala.slick.ast.{TypedType}
import scala.slick.lifted.FlatShapeLevel
import scala.slick.lifted.Shape
import scala.slick.collection.heterogenous._
trait Tables {
  implicit def stringColumnType: TypedType[String] = ???
  type BranchRow = HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HCons[String,HNil.type]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]

  implicitly[Shape[_ <: FlatShapeLevel, HCons[String,HNil.type], _, _]]
  implicitly[Shape[_ <: FlatShapeLevel, BranchRow, BranchRow, _]]
}

As opposed to 3 seconds with:

  implicitly[Shape[_ <: FlatShapeLevel, BranchRow, BranchRow, _]]
  implicitly[Shape[_ <: FlatShapeLevel, HCons[String,HNil.type], _, _]]

If, using the patch below, we remove the "use count" based pre-sorting of implicits (which implments the strategy to try popular implicits before less popular ones of the same specificity), we return to deterministic speed. This happens to be fast, but I suspect that this is by chance, and depends on the declaration order of some implicits.

git diff -U1
diff --git a/src/compiler/scala/tools/nsc/typechecker/Implicits.scala b/src/compiler/scala/tools/nsc/typechecker/Implicits.scala
index d87090f..fcc7d75 100644
--- a/src/compiler/scala/tools/nsc/typechecker/Implicits.scala
+++ b/src/compiler/scala/tools/nsc/typechecker/Implicits.scala
@@ -881,4 +881,3 @@ trait Implicits {

-        // most frequent one first
-        matches sortBy (x => if (isView) -x.useCountView else -x.useCountArg)
+        matches
       }
scabug commented 10 years ago

@retronym said: I've fixed the 2.10.4 -> 2.11.0 performance regression: https://github.com/scala/scala/pull/3694

SethTisue commented 6 years ago

@milessabin care to update the status on this?

milessabin commented 6 years ago

The inductive implicits PR interacts with byname implicits, so it's stuck in a queue behind that.

adriaanm commented 6 years ago

Is this fixed by https://github.com/scala/scala/pull/6580?

milessabin commented 6 years ago

@adriaanm I see a very marginal speedup (a couple of %, consistently better, but small enough to be counted as noise) with scala/scala#6580 on the original case. I think the conclusion above was that the original case was somewhat pathological.

I'd give the revised example a spin, but it isn't standalone and we don't have a 2.13.x Slick build.