akka / akka

Build highly concurrent, distributed, and resilient message-driven applications on the JVM
https://akka.io

FAILED java.lang.InternalError: a fault occurred in a recent unsafe memory access #21484

Closed johanandren closed 8 years ago

johanandren commented 8 years ago

https://jenkins.akka.io:8498/job/akka-artery-cluster-tests/63/consoleFull

[JVM-2] [ERROR] [09/16/2016 18:06:16.872] [UnreachableNodeJoinsAgainSpec-akka.actor.default-dispatcher-9] [akka.remote.artery.ArteryTransport(akka://UnreachableNodeJoinsAgainSpec)] Aeron error: 1 observations from 2016-09-16 18:06:16.204+0200 to 2016-09-16 18:06:16.204+0200 for:
[JVM-2] java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
[JVM-2]     at io.aeron.driver.buffer.MappedRawLog.<init>(MappedRawLog.java:68)
[JVM-2]     at io.aeron.driver.buffer.RawLogFactory.newInstance(RawLogFactory.java:114)
[JVM-2]     at io.aeron.driver.buffer.RawLogFactory.newNetworkedImage(RawLogFactory.java:84)
[JVM-2]     at io.aeron.driver.DriverConductor.newPublicationImageLog(DriverConductor.java:665)
[JVM-2]     at io.aeron.driver.DriverConductor.onCreatePublicationImage(DriverConductor.java:228)
[JVM-2]     at io.aeron.driver.cmd.CreatePublicationImageCmd.execute(CreatePublicationImageCmd.java:62)
[JVM-2]     at io.aeron.driver.DriverConductor.onDriverConductorCmd(DriverConductor.java:924)
[JVM-2]     at io.aeron.driver.DriverConductor$$Lambda$14/738677855.accept(Unknown Source)
[JVM-2]     at org.agrona.concurrent.OneToOneConcurrentArrayQueue.drain(OneToOneConcurrentArrayQueue.java:106)
[JVM-2]     at io.aeron.driver.DriverConductor.doWork(DriverConductor.java:174)
[JVM-2]     at org.agrona.concurrent.AgentRunner.run(AgentRunner.java:122)
[JVM-2]     at java.lang.Thread.run(Thread.java:745)

Additionally, /run/shm filled up on a04; I think the Aeron files are not being removed.

johanandren commented 8 years ago

I just realized that @patriknw wrote about the same thing in the Gitter channel before I posted this.

johanandren commented 8 years ago

The problem isn't this exception; it is simply what happens in Aeron when the disk is full. The real problem is that the mapped files are not getting deleted in the first place.
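
For anyone hitting the same thing: as far as I understand it, mapping a file on a full tmpfs such as /run/shm can succeed, and the fault only shows up when the pages are first touched, which is why it surfaces as an `InternalError` rather than an `IOException`. A minimal sketch of a pre-flight check, assuming plain JDK APIs only. The class name `MappedSpaceCheck`, the method `hasRoomFor`, and the 64 MiB figure are made up for illustration and are not part of Aeron or Akka:

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Aeron or Akka.
public final class MappedSpaceCheck {
    private MappedSpaceCheck() {}

    /**
     * Returns true if the file store backing {@code dir} reports at least
     * {@code requiredBytes} of usable space. The answer is only a hint:
     * another process can still fill the store right after the check.
     */
    public static boolean hasRoomFor(Path dir, long requiredBytes) throws IOException {
        if (requiredBytes < 0) {
            throw new IllegalArgumentException("requiredBytes must be >= 0");
        }
        FileStore store = Files.getFileStore(dir);
        return store.getUsableSpace() >= requiredBytes;
    }

    public static void main(String[] args) throws IOException {
        // Directory to inspect, e.g. /run/shm; defaults to the JVM temp dir.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        long required = 64L * 1024 * 1024; // assumed size, for illustration only
        System.out.println(dir + " has room for " + required + " bytes: " + hasRoomFor(dir, required));
    }
}
```

A check like this would only make the failure easier to read in the CI logs; it does nothing about the leak itself.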

johanandren commented 8 years ago

I'll use this ticket for tracking the leak.

johanandren commented 8 years ago

JVM2 in akka.remote.ArteryRemoteNodeDeathWatchFast leaked the file in this build; there is nothing suspicious in the logs: https://jenkins.akka.io:8498/job/pr-validator-per-commit-jenkins/7449/consoleFull

drewhk commented 8 years ago

Isn't that one of the tests that exit the JVM?

johanandren commented 8 years ago

Closing, as the fix has been merged and there are no more file leaks (except on JVM crashes, which hopefully is also solved and will be merged soon).
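
For reference, a rough sketch of the general technique of removing a directory of mapped files when the JVM exits, using only the JDK. This is not the change that was merged for this ticket; `MappedDirCleanup`, `deleteRecursively`, and `deleteOnExit` are hypothetical names chosen for the example:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, not the code merged for this ticket.
public final class MappedDirCleanup {
    private MappedDirCleanup() {}

    /** Deletes {@code root} and everything below it; does nothing if it is absent. */
    public static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order visits children before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    /** Registers a shutdown hook that removes {@code root} on orderly JVM exit. */
    public static Thread deleteOnExit(Path root) {
        Thread hook = new Thread(() -> {
            try {
                deleteRecursively(root);
            } catch (IOException | UncheckedIOException e) {
                System.err.println("Could not delete " + root + ": " + e);
            }
        }, "mapped-dir-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Shutdown hooks run on `System.exit` and normal termination, but not on `Runtime.halt` or when the process is killed hard, so a hook like this would still leave files behind in the JVM crash case mentioned above.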