heuermh closed 3 years ago
After further investigation, this might be a Java-called-from-Scala problem, where a Java collection field ends up backed by a Scala collection wrapper that Kryo then serializes recursively, one list cell at a time:
com.esotericsoftware.kryo.KryoException: Max depth exceeded: 32
Serialization trace:
head (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
tl (scala.collection.immutable.$colon$colon)
underlying (scala.collection.convert.Wrappers$SeqWrapper)
mSequences (htsjdk.samtools.SAMSequenceDictionary)
mSequenceDictionary (htsjdk.samtools.SAMFileHeader)
at com.esotericsoftware.kryo.Kryo.beginObject(Kryo.java:1012)
at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:568)
at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:79)
at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:508)
...
at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:575)
at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:79)
at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:508)
...
at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:651)
at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:241)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$blockifyObject$2.apply(TorrentBroadcast.scala:291)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$blockifyObject$2.apply(TorrentBroadcast.scala:291)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:292)
at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:127)
at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:88)
at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1489)
at org.apache.spark.api.java.JavaSparkContext.broadcast(JavaSparkContext.scala:650)
at org.disq_bio.disq.impl.formats.bam.BamSink.save(BamSink.java:78)
at org.disq_bio.disq.HtsjdkReadsRddStorage.write(HtsjdkReadsRddStorage.java:227)
... 65 elided
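The serialization trace above ends in a `Wrappers$SeqWrapper` whose `underlying` is a chain of `::` cells, which suggests that a Scala `List` was handed to htsjdk as a `java.util.List`. The sketch below only illustrates that mechanism with the standard library; the object and method names are hypothetical, the wrapper class name assumes Scala 2.11 or 2.12, and copying into a `java.util.ArrayList` is an assumption about one possible way to avoid the nested cons cells, not something confirmed in this thread.

```scala
import java.util.{ArrayList => JArrayList, List => JList}
import scala.collection.JavaConverters._

// Hypothetical names, for illustration only.
object WrapperSketch {

  // Exposes a Scala List as a java.util.List without copying.
  // On Scala 2.11/2.12 the result is a view (a SeqWrapper) whose
  // `underlying` field still points at the original :: cells.
  def wrapAsJava(names: List[String]): JList[String] = names.asJava

  // Copies the elements into a plain java.util.ArrayList, so the result
  // no longer references the Scala list.
  def copyToArrayList(names: JList[String]): JList[String] =
    new JArrayList[String](names)

  def main(args: Array[String]): Unit = {
    // More elements than the depth limit of 32 reported in the exception.
    val names = (1 to 40).map(i => s"chr$i").toList

    val wrapped = wrapAsJava(names)
    println(wrapped.getClass.getName)

    val copied = copyToArrayList(wrapped)
    println(copied.getClass.getName)
  }
}
```

A field-by-field serializer that walks the wrapped list descends one level per `tl` link, so a list with more elements than the configured depth limit would plausibly fail in the way the trace shows. This is an illustration of the suspected cause only, separate from the workaround that follows.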
As a temporary workaround, register SAMFileHeader to use Java serialization instead of Kryo's default field serialization:
import com.esotericsoftware.kryo.serializers.JavaSerializer
...
kryo.register(classOf[htsjdk.samtools.SAMFileHeader], new JavaSerializer())
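For context, the registration above needs to run inside a Kryo registrator that Spark is configured to use. The following is a minimal sketch of how that might be wired up, assuming Spark, Kryo, and htsjdk are on the classpath; the class name `HeaderJavaSerializationRegistrator` and the object `RegistratorUsage` are hypothetical and do not come from Disq or ADAM.

```scala
import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.serializers.JavaSerializer
import htsjdk.samtools.SAMFileHeader
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator

// Hypothetical registrator: falls back to Java serialization for
// SAMFileHeader only, leaving every other class on Kryo.
class HeaderJavaSerializationRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[SAMFileHeader], new JavaSerializer())
  }
}

object RegistratorUsage {
  // Builds a SparkConf that enables Kryo and points it at the registrator.
  def conf(): SparkConf =
    new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator",
        classOf[HeaderJavaSerializationRegistrator].getName)
}
```

If another registrator is already in use, note that `spark.kryo.registrator` accepts a comma-separated list of class names, so the existing registrator may need to be listed alongside this one. How the two registrations interact is not established in this thread.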
Closing as unable to reproduce
For benchmarking I've built a fat jar with ADAM and Disq, and there seems to be a problem with SAMRecord or SAMFileHeader serialization.
convert_parquet_alignments_disq_adam.scala
See also the similar issue https://github.com/bigdatagenomics/adam/issues/2186, reported against ADAM when saving to BAM format from ADAM code. I fear a conflict in serialization registration, or some other incompatibility, between the two libraries' use of htsjdk.