Open lbergelson opened 8 years ago
We have multiple reports of NegativeArraySizeException being thrown during kryo serialization while running ReadsSparkPipeline. According to collaborators this is caused by a bug running kryo on JDK8, and can be fixed either by setting the JVM option -XX:hashCode=0 or by upgrading to a Kryo version that includes the fix.
I'll update this when I get more information.
The Kryo issue tracking this is here: https://github.com/EsotericSoftware/kryo/issues/382
The temporary fix is to run with the following options set:
spark.executor.extraJavaOptions -XX:hashCode=0
spark.driver.extraJavaOptions -XX:hashCode=0
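For anyone launching through spark-submit directly, a sketch of the equivalent flags (the jar and arguments are placeholders; note that in client mode the driver JVM is already running by the time SparkConf is read, so the driver option has to be passed via --driver-java-options rather than --conf):

spark-submit \
  --conf "spark.executor.extraJavaOptions=-XX:hashCode=0" \
  --driver-java-options "-XX:hashCode=0" \
  <application-jar> [application-args]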
@lbergelson is this still an issue? We're on Kryo 3.0.3 now
It's still an issue, unfortunately. The changes we need are merged, but 3.0.3 was released nearly a year ago. I've asked for an official new release (https://github.com/EsotericSoftware/kryo/issues/431); we'll see if they respond. We could build and publish our own version, but that seems like an unfortunate thing to have to do, and I assume Cloudera wouldn't incorporate it in their distribution.
Kryo 4.0.0 has been released, including the changes we need. Now we need to figure out how to get it into our clusters.
@lbergelson Are we using a version of Kryo with the fix?
We're not using the kryo version with the fix. We can open a ticket against spark to see if someone will update it, but I suspect it will be a long slog to get it in.
I happened to see the same problem and created an issue with spark here https://issues.apache.org/jira/browse/SPARK-20389
What is the status of resolving this bug? Is it fixed in GATK4.beta.2?
Not fixed yet, targeted for 4.0 general release.
Ok. Thanks for that.
FYI, Spark 2.4.0 upgraded to Kryo 4.0.0 in 3e033035. There does not appear to be a backport to older Spark versions.
Did y'all happen to also try this:
Kryo kryo = new Kryo();
kryo.setReferences(false); // turn off reference tracking so the reference resolver is never used
I'd like to avoid asking Hail users to set JVM options, so this approach is appealing to me. Curious if you all had experience trying it out.
Oh, that's great that it's fixed in 2.4.0! I think we'd sort of given up on this since we had a workaround. I think setReferences(false) should work, but it could potentially badly inflate the sizes of your serialized objects if you're using kryo's field serializer to serialize repetitive data.
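If it helps, here's a minimal sketch of how that could be wired into Spark cluster-wide via a custom KryoRegistrator (the NoReferencesRegistrator class name is made up for illustration):

import com.esotericsoftware.kryo.Kryo;
import org.apache.spark.serializer.KryoRegistrator;

// Hypothetical registrator: disables reference tracking on every Kryo
// instance Spark creates, so the buggy reference-resolver path is never
// exercised. Note this will fail on circular object graphs and can
// inflate output when the same object is serialized repeatedly.
public class NoReferencesRegistrator implements KryoRegistrator {
    @Override
    public void registerClasses(Kryo kryo) {
        kryo.setReferences(false);
    }
}

You'd enable it with spark.serializer=org.apache.spark.serializer.KryoSerializer and spark.kryo.registrator=NoReferencesRegistrator.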
We should see about upgrading to 2.4.0... It looks like there's a dataproc preview with it that we could use. @tomwhite Any thoughts about upgrading or not to 2.4.0?
We'd need to test GATK on a 2.4 cluster, but I don't see why we wouldn't upgrade. I would like to wait until 2.4.1 (should be out soon) as it enables testing for Java 11 too. See #5782
This should be fixed now that we're using Spark 2.4. Is anyone able to check?