RestComm / Restcomm-Connect

The Open Source Cloud Communications Platform
http://www.restcomm.com/
GNU Affero General Public License v3.0

Concurrency issues on Gather scenario #1728

Closed: hrosa closed this issue 7 years ago

hrosa commented 7 years ago

Running load tests for the Gather scenario reveals NullPointerExceptions (NPEs) in the media group actor:

11:38:05,550 ERROR [org.restcomm.connect.mscontrol.mms.MgcpMediaGroup] (RestComm-akka.actor.default-dispatcher-32) null: java.lang.NullPointerException
    at org.restcomm.connect.mscontrol.mms.MgcpMediaGroup.notification(MgcpMediaGroup.java:214) [restcomm-connect.mscontrol.mms-8.0.0.issue1526-rms6-load-test.jar:8.0.0.issue1526-rms6-load-test]
    at org.restcomm.connect.mscontrol.mms.MgcpMediaGroup.onReceive(MgcpMediaGroup.java:313) [restcomm-connect.mscontrol.mms-8.0.0.issue1526-rms6-load-test.jar:8.0.0.issue1526-rms6-load-test]
    at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:159) [akka-actor_2.10-2.1.2.jar:]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:425) [akka-actor_2.10-2.1.2.jar:]
    at akka.actor.ActorCell.invoke(ActorCell.scala:386) [akka-actor_2.10-2.1.2.jar:]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:230) [akka-actor_2.10-2.1.2.jar:]
    at akka.dispatch.Mailbox.run(Mailbox.scala:212) [akka-actor_2.10-2.1.2.jar:]
    at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:506) [akka-actor_2.10-2.1.2.jar:]
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:262) [scala-library-2.10.1.jar:]
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975) [scala-library-2.10.1.jar:]
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1478) [scala-library-2.10.1.jar:]
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104) [scala-library-2.10.1.jar:]

https://github.com/RestComm/Restcomm-Connect/blob/issue1526-rms6/restcomm/restcomm.mscontrol.mms/src/main/java/org/restcomm/connect/mscontrol/mms/MgcpMediaGroup.java#L214

It seems the media group is stopped (and its originator reference cleared) before the response arrives from RMS, so the notification handler dereferences a null originator.
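To make the suspected ordering concrete, here is a minimal sketch in plain Java. It does not use Akka and is not the actual MgcpMediaGroup code: the class MediaGroupSketch, its methods, and the event strings are all hypothetical names chosen for illustration. It assumes the failure is a late notification reaching a group whose originator was already cleared, and shows one possible guard (drop the late event instead of dereferencing null). Whether the real fix takes this form is not confirmed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for the media group actor; NOT the Restcomm class. */
final class MediaGroupSketch {
    // Whoever requested the Gather; becomes null once the group is stopped.
    private Consumer<String> originator;
    // Late events that arrived after stop(), kept only so the sketch can be inspected.
    private final List<String> dropped = new ArrayList<>();

    /** Start an operation and remember who should receive the result. */
    void play(Consumer<String> requester) {
        this.originator = requester;
    }

    /** Stop the group and clear the originator, as described in the report. */
    void stop() {
        this.originator = null;
    }

    /** Handle a media server notification, tolerating one that arrives after stop(). */
    void notification(String event) {
        final Consumer<String> target = this.originator;
        if (target == null) {
            // Without this check, target.accept(event) would throw the NPE.
            dropped.add(event);
            return;
        }
        target.accept(event);
    }

    List<String> dropped() {
        return dropped;
    }

    public static void main(String[] args) {
        MediaGroupSketch group = new MediaGroupSketch();
        List<String> delivered = new ArrayList<>();

        group.play(delivered::add);
        group.notification("dtmf:1"); // normal case: originator still set
        group.stop();                 // group stopped, originator cleared
        group.notification("dtmf:2"); // late response from the media server

        System.out.println("delivered=" + delivered + " dropped=" + group.dropped());
    }
}
```

A null check like this only hides the symptom. If the late notification carries a result the caller still needs, the stop sequence itself would have to wait for the media server response.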

Also, one call failed during the test, and it seems its resources were not properly cleaned up on the RMS side: netstat still shows one open UDP connection.

Finally, netstat reveals a lot(!) of TCP connections in the TIME_WAIT state.

hrosa commented 7 years ago

Seems to have been fixed by #1614