juanrh / sscheck

ScalaCheck for Spark
Apache License 2.0

Version incompatibilities with spark-testing-base #36

Closed juanrh closed 8 years ago

juanrh commented 8 years ago

Some versions of spark-testing-base are not compatible with sscheck, which results in not-serializable exceptions in tests that use checkpointing. The problem is that some versions of spark-testing-base use a version of "io.github.nicolasstucki" %% "multisets" that defines an implicit value config, which causes this serialization problem (see e.g. https://github.com/holdenk/spark-testing-base/blob/7d4077efacd5ed454aededcf9278ddd7953dad80/src/main/1.3/scala/com/holdenkarau/spark/testing/JavaStreamingSuitebase.scala). This is odd, because we only use com.holdenkarau.spark.testing.TestInputStream as a transient val; maybe it has something to do with the fact that config is implicit. Anyway, a simple solution is to copy the classes into this project, which removes the dependency and effectively freezes it.

juanrh commented 8 years ago

It looks like the culprit is actually https://github.com/juanrh/sscheck/blob/6b6020ec9f2dfbe54462a545564dbd153143fceb/src/main/scala/es/ucm/fdi/sscheck/gen/Batch.scala instead.

import scala.collection.immutable.{HashBag=>Bag}
...
  implicit val config = Bag.configuration.compact[A]

  override def toSeq : Seq[A] = points
  def toBag : Bag[A] = Bag(points:_*) 
...

That config value should be transient. The multisets library has known problems with serialization (https://github.com/nicolasstucki/multisets/issues/9). Note that Batch.toBag and PDStream.toBagSeq are not used anyway, so I'll remove them too.
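For illustration, here is a minimal sketch of why marking such a field transient avoids the exception. It does not use the multisets library: BagConfig, LeakyBatch, SafeBatch and TransientDemo are hypothetical names invented for this example, and BagConfig is only assumed to behave like a non-serializable configuration object.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a library configuration object that does not
// implement Serializable (an assumption made for this sketch).
class BagConfig

// Hypothetical analogue of Batch: config is a regular field, so Java
// serialization tries to write it together with the instance.
case class LeakyBatch[A](points: Seq[A]) {
  implicit val config: BagConfig = new BagConfig
}

// Same shape, but the field is @transient and lazy: it is skipped when the
// instance is serialized and rebuilt on first access after deserialization.
case class SafeBatch[A](points: Seq[A]) {
  @transient implicit lazy val config: BagConfig = new BagConfig
}

object TransientDemo {
  // Returns true if obj can be written with plain Java serialization.
  def serializes(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(obj); true }
    catch { case _: NotSerializableException => false }
  }

  def main(args: Array[String]): Unit = {
    println(serializes(LeakyBatch(List(1, 2, 3)))) // false: config is not serializable
    println(serializes(SafeBatch(List(1, 2, 3))))  // true: config is skipped
  }
}
```

Since toBag is unused here, removing the field altogether is simpler than making it transient; the sketch only shows what the transient route would look like.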

juanrh commented 8 years ago

Released new version 0.2.3 without those dependencies; all tests are passing in Jenkins.