bensari / mediaserver

Automatically exported from code.google.com/p/mediaserver

JSR-309 driver leaks memory #133

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Use a SIP servlet application to play announcements via the JSR-309 driver.
2. Make some calls.
3. The MediaSession object and all its descendants remain in memory and are 
not garbage collected.

What is the expected output? What do you see instead?
MediaSessions that have been released should be de-referenced and eventually 
garbage collected.
They are held by references in two places:
  1) Through the Player and Recorder FSM threads - which are not cancelled.
  2) Through the Driver requestListeners hash map - which is not cleared up.

What version of the product are you using? On what operating system?
Seen with 3.0.0.Final. Reproduced and patched in 3.0.1-SNAPSHOT from git.

Please provide any additional information below.

I've done a local patch, which seems to fix the problem for me. This involves:
  1) RecorderImpl 
    Add a RELEASE signal and an INVALID state.
    Add a release method to send the RELEASE signal.

    PlayerImpl
    Add a release method to send the RELEASE signal.

    MediaGroupImpl
    Call player.release() and recorder.release() from release method

  2) DriverImpl
     In the processMgcpCommandEvent method, if a match is found in requestListeners, remove it once the handler has been called.

I don't know if this patch will have wider implications. I will prepare a patch 
file and attach it.
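
The DriverImpl change described in 2) can be sketched roughly as follows. This is only an illustration of the remove-after-dispatch idea, not the actual patch: `ResponseHandler`, `ListenerRegistry`, and the string request ids are hypothetical stand-ins, and the real driver keys its map on MGCP request identifiers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the driver's per-request callback. */
interface ResponseHandler {
    void onResponse(String payload);
}

/** Hypothetical registry showing removal of a listener once it has been called. */
class ListenerRegistry {
    private final Map<String, ResponseHandler> requestListeners = new ConcurrentHashMap<>();

    void register(String requestId, ResponseHandler handler) {
        requestListeners.put(requestId, handler);
    }

    /** Invokes the matching handler, then drops it from the map. Returns false if none matched. */
    boolean dispatch(String requestId, String payload) {
        ResponseHandler handler = requestListeners.get(requestId);
        if (handler == null) {
            return false;
        }
        try {
            handler.onResponse(payload);
        } finally {
            // Without this removal the map keeps the handler, and everything it references, alive.
            requestListeners.remove(requestId);
        }
        return true;
    }

    int pending() {
        return requestListeners.size();
    }
}

class ListenerRegistryDemo {
    public static void main(String[] args) {
        ListenerRegistry registry = new ListenerRegistry();
        registry.register("req-1", payload -> System.out.println("handled " + payload));
        registry.dispatch("req-1", "200 OK");
        System.out.println("pending listeners: " + registry.pending());
    }
}
```

The `finally` block means the entry is removed even if the handler throws, which is a design choice of this sketch rather than something stated in the patch description.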

Original issue reported on code.google.com by andrew.m...@acision.com on 29 Nov 2013 at 4:21

GoogleCodeExporter commented 9 years ago
Patch 001 is a small improvement to diags that I found useful.
Patch 002 is the fix to cancel the FSM timers.
Patch 003 is the fix to clean requestListeners.
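
For readers without the attachments, the idea behind cancelling a timer on release can be sketched as below. This is an assumption-laden illustration, not the content of Patch 002: `TimedComponent` is a hypothetical class, and the real FSM uses its own scheduling rather than a `ScheduledThreadPoolExecutor`.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical component whose periodic task keeps it reachable until cancelled. */
class TimedComponent {
    private final ScheduledFuture<?> tick;

    TimedComponent(ScheduledThreadPoolExecutor scheduler) {
        // The method reference captures `this`, so the scheduler's queue holds a
        // reference to this object for as long as the task stays scheduled.
        this.tick = scheduler.scheduleAtFixedRate(this::onTick, 1, 1, TimeUnit.HOURS);
    }

    private void onTick() {
        // Periodic state-machine work would go here.
    }

    /** Cancels the periodic task so the scheduler no longer references this object. */
    void release() {
        tick.cancel(false);
    }
}

class TimerCancelDemo {
    static ScheduledThreadPoolExecutor newScheduler() {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        // Remove cancelled tasks from the queue immediately instead of when their delay elapses.
        scheduler.setRemoveOnCancelPolicy(true);
        return scheduler;
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor scheduler = newScheduler();
        TimedComponent component = new TimedComponent(scheduler);
        System.out.println("queued before release: " + scheduler.getQueue().size());
        component.release();
        System.out.println("queued after release: " + scheduler.getQueue().size());
        scheduler.shutdownNow();
    }
}
```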

Original comment by andrew.m...@acision.com on 29 Nov 2013 at 4:33

Attachments:

GoogleCodeExporter commented 9 years ago
Hi Andrew,
My problem is exactly what you said.
I used your patch files, but the problem hasn't gone away; memory usage remains high.

Do you have any suggestions? 
Thank you!

Original comment by ltbinh12...@gmail.com on 15 Apr 2014 at 4:55

GoogleCodeExporter commented 9 years ago
Hi, we have exactly the same problem too, on 3.0.0.FINAL.
I think Andrew's patches fix some of these problems, but not all of 
them.

Original comment by alpercos...@gmail.com on 15 Apr 2014 at 6:29

GoogleCodeExporter commented 9 years ago
I'm guessing that you are using components that I didn't. I had just a Player 
and a Recorder.

I'm not currently working on the Media Server, as that project is currently not 
proceeding; so I won't be able to do any further work on this at the moment.

As a tip, I found these leaks by connecting jConsole to the running MSS, and 
seeing what objects remained when the calls had ended. Then you just need to 
track down where these are created, and work out where they need to be released.
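
As a rough programmatic counterpart to that jConsole check, the sketch below asks the running JVM for a class histogram and filters it by class name. It assumes a HotSpot JVM on JDK 8 or later, where the `DiagnosticCommand` MBean is available; `HeapHistogram` and its method names are made up for this example. On older JDKs, `jmap -histo:live <pid>` gives a similar listing from the command line.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical helper that lists live-instance counts for matching classes. */
class HeapHistogram {

    /** Returns the histogram lines whose text contains the given class-name filter. */
    static List<String> liveInstances(String classNameFilter) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // HotSpot-specific MBean; this lookup is the main assumption of the sketch.
        ObjectName diagnostics = new ObjectName("com.sun.management:type=DiagnosticCommand");
        String histogram = (String) server.invoke(
                diagnostics,
                "gcClassHistogram",
                new Object[] { new String[0] },
                new String[] { String[].class.getName() });

        List<String> matches = new ArrayList<>();
        for (String line : histogram.split("\n")) {
            if (line.contains(classNameFilter)) {
                matches.add(line.trim());
            }
        }
        return matches;
    }

    public static void main(String[] args) throws Exception {
        // For the leaks in this thread, a filter such as "org.mobicents.fsm" would be the one to watch.
        String filter = args.length > 0 ? args[0] : "java.lang.String";
        for (String line : liveInstances(filter)) {
            System.out.println(line);
        }
    }
}
```

Running this before and after a batch of test calls, from inside the same JVM as the driver, shows which classes keep growing.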

I would imagine that the other leaks follow a similar pattern to the ones I 
plugged.

Good luck !

Original comment by andrew.m...@acision.com on 15 Apr 2014 at 10:01

GoogleCodeExporter commented 9 years ago
Hi Andrew,
Yes, in mobicents-jsr309-example they use SignalDetector to play the media file. 
I applied your third patch (in the processMgcpCommandEvent method, if a match 
is found in requestListeners, remove it once the handler has been called), but 
processMgcpCommandEvent is never called, so the request is never removed.

I've made some changes and am testing them. If they work, I'll report back.

Original comment by ltbinh12...@gmail.com on 15 Apr 2014 at 10:46

GoogleCodeExporter commented 9 years ago
Hi, I am facing the same issue. I applied Andrew's patches and fixed 
additional memory and thread leaks.

Basically, I try to clean up everything on release, including some threads that 
would otherwise run forever.
Some of the code might not be relevant, but I profiled memory and threads before 
and after the changes, and I am happy with the memory and thread load at the end.

Patch:
1. Apply Andrew's patch, add release methods to clear 
resources, and add a timeout on thread await(). (Some threads didn't terminate 
properly.)

2. Change ConcurrentCyclicFIFO.java to use await(time, unit), to 
fix thread leak problems.

3. Add release transition.

4. Code refactor.

5. Fixed thread leak issue.

6. Fixed thread leak issue, to prevent unnecessary threads from being created.

7-14. Code refactor.
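
The timed-wait idea from items 1 and 2 can be illustrated with the sketch below. It is not the ConcurrentCyclicFIFO change itself: `TimedFifo` and `FifoWorker` are hypothetical classes, and the sketch uses `awaitNanos` in a loop (the usual idiom for handling spurious wakeups) where the patch description mentions `await(time, unit)`. The point is that a consumer that wakes up periodically can notice a stop request, whereas one blocked in an untimed `await()` stays parked until someone signals it.

```java
import java.util.ArrayDeque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical FIFO whose consumers never block indefinitely. */
class TimedFifo<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final ArrayDeque<T> items = new ArrayDeque<>();

    void offer(T item) {
        lock.lock();
        try {
            items.addLast(item);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    /** Waits up to the given timeout for an item; returns null if none arrived. */
    T poll(long timeout, TimeUnit unit) throws InterruptedException {
        long nanos = unit.toNanos(timeout);
        lock.lock();
        try {
            while (items.isEmpty()) {
                if (nanos <= 0L) {
                    return null;
                }
                // awaitNanos returns the remaining wait time, so spurious wakeups do not extend the wait.
                nanos = notEmpty.awaitNanos(nanos);
            }
            return items.removeFirst();
        } finally {
            lock.unlock();
        }
    }
}

/** Hypothetical worker that re-checks its running flag after every timed wait. */
class FifoWorker implements Runnable {
    private final TimedFifo<Runnable> queue;
    private volatile boolean running = true;

    FifoWorker(TimedFifo<Runnable> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        try {
            while (running) {
                Runnable task = queue.poll(50, TimeUnit.MILLISECONDS);
                if (task != null) {
                    task.run();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Asks the worker to exit; it does so within roughly one poll timeout. */
    void stop() {
        running = false;
    }
}
```

A caller would start `new Thread(worker)`, and calling `worker.stop()` from a release method lets the thread end without needing an extra signal.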

Original comment by waileong...@gmail.com on 16 Apr 2014 at 2:28

Attachments:

GoogleCodeExporter commented 9 years ago
Hi Waileong,

Thank you.
My changes don't seem to have much effect. I'll try yours.

Original comment by ltbinh12...@gmail.com on 16 Apr 2014 at 7:56

GoogleCodeExporter commented 9 years ago
Hi Waileong,
Thank you
We applied the patch; our observations are below.
1) There is an average delay of ~10-12 seconds after calling this API: 
NC.getSdpPortManager().processSdpOffer(sdpOffer);
Scenario: a call is initiated from a softphone to the SIP servlet app, and the SIP servlet app 
answers the call.
Note: we saw the same kind of delay for calls initiated by the SIP servlet 
app.
2) An MSControl exception occurred when we tried to play.

Background: at present we see 
org.mobicents.fsm Transition/State/FSM objects 
being retained in the heap, with around 150 instances of Transition in the 
heap for our sample call scenario.
We applied your patch and Andrew's.
#1) After applying your patch (which includes Andrew's), the Transition 
objects are cleared, but State (3 objects) and FSM (a few objects) remain in the 
heap. However, the play functionality is affected in our test call scenario.

Are we missing something here? Do you have any comments?
Thanks & Regards

Original comment by msip123...@gmail.com on 15 Sep 2014 at 6:54

GoogleCodeExporter commented 9 years ago
Hi,

   I fixed some bugs and refactored code after the patches, but I'm not sure whether this is related to your issues. 
   I'll just post my git diff since the last patches so you can check and verify. Hope this helps.

Original comment by waileong...@gmail.com on 17 Sep 2014 at 3:38

Attachments:

GoogleCodeExporter commented 9 years ago
Hi Waileong,
Thank you
We applied this patch; here is our analysis:
1. Make a call from the softphone to the SIP servlet app; INVITE/200 OK/ACK ... BYE/200 OK 
completes.
2. Start jvisualvm before making the test call and filter for Transition objects.
3. The NetworkConnectionImpl objects are still in the heap.
4. If the call played any prompt, 80 objects remain in the heap.
5. In the SIP servlet app, the MediaSession, NetworkConnection, and session 
attributes are all released.
6. The NC state transition flow is 
NULL/OPENING/HALF_OPEN/MODIFYING/OPEN/..../CLOSING/CLOSING/INVALID
7. The Transition objects are still not released from memory.
8. I have been facing this issue for the past few weeks and wanted to share my 
analysis and findings to get some insights.
Regards & Thanks,

Original comment by msip123...@gmail.com on 17 Sep 2014 at 8:38

GoogleCodeExporter commented 9 years ago
Hi,
 Did you release the media group and media session upon SIP session invalidation (javax.servlet.sip.SipSessionBindingListener#valueUnbound)?

javax.media.mscontrol.mediagroup.MediaGroup#stop()
javax.media.mscontrol.MediaSession#release()

Original comment by waileong...@gmail.com on 18 Sep 2014 at 2:35

GoogleCodeExporter commented 9 years ago
Hi Waileong,
Yes, we release the media session before sending the 200 OK for the BYE.
1) To identify the memory leak, we made a test call and dropped it from the phone without any 
play.
The MediaGroup is not created in this scenario.
At the end of the call, after performing a GC, the following objects were retained in the heap:
-org.mobicents.fsm.Transition=24 instances
-org.mobicents.fsm.State=9 instances
-org.mobicents.fsm.FSM=1 instance

2) With your patch (#6, APR'15 2014), the above call scenario 
shows:
-org.mobicents.fsm.State=3 instances
-org.mobicents.fsm.FSM=1 instance
--we observed a delay before sending the 200 OK and got an exception while playing.
3) Coming back to point 1): if we make 10 calls, then
-org.mobicents.fsm.Transition=240 instances
-org.mobicents.fsm.State=90 instances
-org.mobicents.fsm.FSM=10 instances
After performing a GC, Transition comes down to 24, State to 9, and FSM to 1.

The FSM Transition objects created in NetworkConnectionImpl still remain in the 
heap.
Do you see this behavior for this test call? We also tried the jsr309 
sample PlayerServlet.java and found the same behavior.
4) How do we check SipSessionBindingListener?

Regards & Thanks

Original comment by msip123...@gmail.com on 18 Sep 2014 at 8:08

GoogleCodeExporter commented 9 years ago
Hi Waileong,
Can you please tell me how to register for 
javax.servlet.sip.SipSessionBindingListener#valueUnbound and valueBound?

Thanks & Regards

Original comment by msip123...@gmail.com on 22 Sep 2014 at 8:18

GoogleCodeExporter commented 9 years ago
Example:

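sipSession.setAttribute("MediaSessionReleaseCallback", new SipSessionBindingListener() {
    @Override
    public void valueBound(SipSessionBindingEvent event) {
        // Nothing to do when the attribute is bound.
    }

    @Override
    public void valueUnbound(SipSessionBindingEvent event) {
        // Release the media session when the attribute is unbound from the SIP session.
        mediaSession.release();
    }
});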

Original comment by waileong...@gmail.com on 24 Sep 2014 at 3:33

GoogleCodeExporter commented 9 years ago
Hi,
I'm working with MSS 3.0.0 Final & MMS 3.0.0 Final over JBoss 7 and jdk 7.

Sometimes, under heavy traffic, the following warning appears:

2015-01-09 08:36:39,294 [org.mobicents.servlet.sip.core.session.SipApplicationSessionImpl.acquire] [1303] WARN - Failed to acquire session semaphore java.util.concurrent.Semaphore@2ee80265[Permits = 0] for 30 secs. We will unlock the semaphore no matter what because the transaction is about to timeout. THIS MIGHT ALSO BE CONCURRENCY CONTROL RISK. app Session is1c59e57c-a4f8-4e21-8dd1-c855ccbb2766;Click2CallApplication

At this point, the number of SIPClientTransactionImpl and SIPServerTransactionImpl objects 
starts to grow, reaching up to 20000 instances in a day.

It seems that some threads block with the following trace (from JConsole):

Stack trace: 
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
org.mobicents.javax.media.mscontrol.mediagroup.PlayerImpl.play(PlayerImpl.java:208)
org.mobicents.javax.media.mscontrol.mediagroup.PlayerImpl.play(PlayerImpl.java:218)
enel.indexbook.adapter.sip.Click2CallSipServlet.doProvisionalResponse(Click2CallSipServlet.java:414)
javax.servlet.sip.SipServlet.doResponse(SipServlet.java:271)
javax.servlet.sip.SipServlet.service(SipServlet.java:334)
org.mobicents.servlet.sip.core.dispatchers.MessageDispatcher.callServlet(MessageDispatcher.java:458)
org.mobicents.servlet.sip.core.dispatchers.ResponseDispatcher$1.dispatch(ResponseDispatcher.java:479)
org.mobicents.servlet.sip.core.dispatchers.DispatchTask.dispatchAndHandleExceptions(DispatchTask.java:61)
org.mobicents.servlet.sip.core.dispatchers.ResponseDispatcher.dispatchMessage(ResponseDispatcher.java:512)
org.mobicents.servlet.sip.core.SipApplicationDispatcherImpl.processResponse(SipApplicationDispatcherImpl.java:1005)
gov.nist.javax.sip.EventScanner.deliverEvent(EventScanner.java:296)
gov.nist.javax.sip.SipProviderImpl.handleEvent(SipProviderImpl.java:185)
gov.nist.javax.sip.DialogFilter.processResponse(DialogFilter.java:1501)
gov.nist.javax.sip.stack.SIPClientTransactionImpl.inviteClientTransaction(SIPClientTransactionImpl.java:889)
gov.nist.javax.sip.stack.SIPClientTransactionImpl.processResponse(SIPClientTransactionImpl.java:532)
   - locked gov.nist.javax.sip.stack.SIPClientTransactionImpl@607466db
gov.nist.javax.sip.stack.SIPClientTransactionImpl.processResponse(SIPClientTransactionImpl.java:1604)
gov.nist.javax.sip.stack.UDPMessageChannel.processMessage(UDPMessageChannel.java:603)
gov.nist.javax.sip.stack.UDPMessageChannel.processIncomingDataPacket(UDPMessageChannel.java:512)
gov.nist.javax.sip.stack.UDPMessageChannel.run(UDPMessageChannel.java:317)
java.lang.Thread.run(Thread.java:745)

The snippet of the code is the following (line 414 is the play statement):

MediaGroup mediaGroup = (MediaGroup) linkedSipSession.getAttribute("MEDIA_GROUP");
try {
    Parameters options = mediaGroup.createParameters();
    options.put(Player.REPEAT_COUNT, 4);
    mediaGroup.getPlayer().play(URI.create("http://" + System.getProperty("jboss.bind.address", "127.0.0.1") + ":8080/IndexBook/ring.wav"), null, options);
} catch (MsControlException e) {
    // Click2CallSipServlet.logger.info("MsControlException: " + e.getMessage());
    e.printStackTrace();
}

The first invite (is a click2call application) is sent from an http servlet 
asynchronously with:

((SipSessionsUtilExt) this.sipSessionsUtil).scheduleAsynchronousWork(sipApplicationSession.getId(), new SipApplicationSessionAsynchronousWork() { ...

I suppose the problem is within PlayerImpl, and perhaps linked to the one 
described in this thread.
Do you have any ideas about the cause of the problem, or is there a patch that 
can be used?

Thanks in advance & Regards

Original comment by frankcaz...@gmail.com on 9 Jan 2015 at 10:55

GoogleCodeExporter commented 9 years ago
Following up on the above posts:

After several attempts, we have made some changes that fixed the memory leak 
issues for our call flows.
All these changes were made in jsr-309-driver-3.0.1-SNAPSHOT.jar only; 
they are listed below (and also attached for reference):

1) MediaGroupImpl.java
    Called player.release(), recorder.release() and detector.release() from the release method

2) PlayerImpl.java
    Nullified the fsm object and cleared listeners and playList in the release method

3) RecorderImpl.java
    Nullified the fsm object and cleared listeners and triggers in the release method

4) ContainerImpl.java
    Commented out the unused list object in the unjoin method

5) SignalDetectorImpl.java
    Cleared listeners and triggers in the release method

6) DeleteConnectionResponseHandler.java
    Added null checks for the connection and fsm objects to the processMgcpResponseEvent method

7) NetworkConnectionImpl.java
    Nullified the fsm object in the release method

8) MediaSessionImpl.java
    Nullified the mixers, groups, connections and attributes objects in the release method

9) DriverImpl: when a JSR-309 API is called, the RequestIdentifier is added 
to the ConcurrentHashMap but is not removed once the response is received.
path:[sip-app]---(jsr309-api)------[jsr309.jar]-----[mgcp]----[mgcp..MMS]
   ==>Removed the RequestIdentifier from requestListeners once the handler has been called inside the processMgcpCommandEvent method.
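
The release pattern shared by items 1) to 8) can be sketched as follows. This is a simplified illustration under assumed names, not the attached code: `Releasable`, `SketchPlayer`, and `SketchMediaGroup` are hypothetical, and a plain `Object` stands in for the FSM.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical common interface for anything that holds releasable state. */
interface Releasable {
    void release();
}

/** Hypothetical player: drops its listeners and state machine on release. */
class SketchPlayer implements Releasable {
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();
    private Object fsm = new Object(); // stand-in for the real state machine

    void addListener(Runnable listener) {
        listeners.add(listener);
    }

    int listenerCount() {
        return listeners.size();
    }

    boolean isReleased() {
        return fsm == null;
    }

    @Override
    public void release() {
        // Clearing listeners breaks references back to application objects.
        listeners.clear();
        // Dropping the state machine lets its states and transitions be collected.
        fsm = null;
    }
}

/** Hypothetical media group: forwards release to each child exactly once. */
class SketchMediaGroup implements Releasable {
    private final List<Releasable> children;
    private boolean released;

    SketchMediaGroup(Releasable... children) {
        this.children = Arrays.asList(children);
    }

    @Override
    public void release() {
        if (released) {
            return; // a repeated release is a no-op
        }
        released = true;
        for (Releasable child : children) {
            child.release();
        }
    }
}

class ReleaseCascadeDemo {
    public static void main(String[] args) {
        SketchPlayer player = new SketchPlayer();
        player.addListener(() -> System.out.println("event"));
        SketchMediaGroup group = new SketchMediaGroup(player);
        group.release();
        System.out.println("player released: " + player.isReleased());
    }
}
```

Making release idempotent is a choice of this sketch; it keeps a second call, for example from both a BYE handler and a session listener, from failing on already-nullified fields.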

Key points:
-----------
1. The above fixes may or may not work for the type of call scenario that you are 
running.
2. Keep the references above at hand, and use tools like jvisualvm to identify and 
fix memory leaks.
3. Break the call flow down into smaller steps and identify at which step the 
memory leak occurs.
4. Once you have a fix, do a small trial run, check the performance, and 
increase the load step by step.
5. While fixing, make sure other functionality is not affected.
6. If you run load tests with too little CPU/RAM, the error response 
"No response from server during 5000 ms" will sometimes be thrown. To 
avoid this, increase the CPU/RAM and allocate more memory to 
MSS/JBoss.
7. Set the log level to ERROR for MSS, MMS, and the SIP application.

Original comment by msip123...@gmail.com on 20 Jul 2015 at 11:47

Attachments:

GoogleCodeExporter commented 9 years ago
Hi, 

It looks like the Media Server has moved to GitHub, ahead of the scheduled 
decommissioning of Google Code.

Looking at https://github.com/Mobicents/mediaserver I don't think this issue 
has been addressed in the current tip, nor can I find an issue for it.

I'm not working on this now, but if this is still an active issue for others, 
I'd suggest you raise an issue on 
https://github.com/Mobicents/mediaserver/issues

Original comment by andrew.m...@acision.com on 20 Jul 2015 at 1:25

GoogleCodeExporter commented 9 years ago
Andrew is right. Thanks all for reporting this. The related issue is located at 
https://github.com/Mobicents/mediaserver/issues/21

Original comment by jean.deruelle on 20 Jul 2015 at 1:36

GoogleCodeExporter commented 9 years ago
Hello, 

Can you please open an issue on https://github.com/Mobicents/mediaserver and do 
a pull request so I can review your work?

Thank you

Original comment by henrique...@telestax.com on 20 Jul 2015 at 2:14