weiyirong / imsdroid

Automatically exported from code.google.com/p/imsdroid

IMSDroid/doubango architectural limitations #35

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
This is not a defect report, but rather a request for comments on the
IMSDroid/doubango architecture.
First, I will explain the problem I am facing, then offer a couple of possible
solutions, and finally ask the questions I already have.
Problem: I need to know the practical limits of the IMSDroid application on
several devices currently on the market, in particular how many simultaneous
conference conversations they can handle. We can probably handle one
encoder/decoder pair for H.263 video and one for PCMU audio, but what if the
user makes one more connection to another SIP server and places a video call
there? Will there be enough CPU, memory, etc. to handle that one as well?
Solution: There are two possible ways to check this. One is to change the core
itself, creating dummy encoder/decoder pairs for audio and video that perform
calculations just to consume CPU time and memory, and see how that affects
overall performance. The other is to establish two SIP connections, place two
video calls, and check the performance.
Questions: Since the code does not compile in the given version (see issue 30),
I was only able to try the second way. Everything went smoothly at the
beginning: I created another MyAvSession object, gave it different registration
information, and was able to register and make a call with the given interface
without any problems, but every attempt to set up two simultaneous video
send/receive streams failed. It seemed that there could not be two instances of
the VideoProducer class: as soon as I made MyAvSession's VideoProducer
non-static (required for two separate sessions), the program stopped working.
Further study of the code revealed the cause: the ProxyProducer.cxx code inside
the bindings/common library, which is used to build the libtinyWRAP component.
There, the setActive method, which is called during
Proxy(Video/Audio)(Producer/Consumer) initialization, uses a static variable
named instance. This means we cannot have two video producers or two video
consumers with the given architecture, am I right? Does that mean we cannot
process and show two separate video streams from different sources at the same
time unless we rewrite the library itself, or am I just confused and missing
something in the program's structure? Please shed some light. Thank you in
advance.

Original issue reported on code.google.com by volkov.r...@gmail.com on 18 Aug 2010 at 9:54

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Regarding issue 30, I cannot see how a link problem is possible unless you are
mixing static and shared libraries in the "output" folder.

Doubango: There is no limit on the number of consumers, producers and sessions
you can have at the same time.
tinyWRAP: It was developed in "quick-and-dirty" mode. The goal was to have basic
proxy (JNI<->native) functions to manage the doubango libraries from IMSDroid.
If you want full access to all features, you should use the doubango API
directly or update the proxy functions. However, in your case the solution is
easy.
You should:

===1===
Remove the static instances of the producers and consumers, which are:
ProxyAudioProducer::instance, ProxyVideoProducer::instance,
ProxyAudioConsumer::instance and ProxyVideoConsumer::instance.

===2===
Create two new classes: MediaManager and MediaManagerCallback. MediaManager 
instance will replace the static instances you have removed in step 1. You can 
have something like this:
class MediaManager
{
public:
    MediaManager(MediaManagerCallback callback);
    virtual ~MediaManager();

public:
#if !defined(SWIG)
    static uint64_t requestNewId();

    void addProducer(ProxyProducer*);
    void addConsumer(ProxyConsumer*);
    void removeProducer(uint64_t id);
    void removeConsumer(uint64_t id);

    static MediaManager* instance;
    static MediaManager* getInstance();
#endif

private:
    MediaManagerCallback callback;

    tsk_list_t* consumers;
    tsk_list_t* producers;
};

MediaManagerCallback will inform us when the media layer (tinyMEDIA) creates
or destroys consumers or producers. You can have something like this:
class MediaManagerCallback
{
public:
    MediaManagerCallback() {  }
    virtual ~MediaManagerCallback() {}

    // New consumer object will be created unless you return a non-zero code
    virtual int OnConsumerCreated(uint64_t id, twrap_media_type_t type) { return -1; }
    // New producer object will be created unless you return a non-zero code
    virtual int OnProducerCreated(uint64_t id, twrap_media_type_t type) { return -1; }

    virtual int OnConsumerDestroyed(uint64_t id) { return -1; }
    virtual int OnProducerDestroyed(uint64_t id) { return -1; }
};

===3===
Change twrap_producer_proxy_audio_t, twrap_producer_proxy_video_t, 
twrap_consumer_proxy_audio_t and twrap_consumer_proxy_video_t structures by 
adding an id field. Example:

typedef struct twrap_producer_proxy_audio_s
{
    TDAV_DECLARE_PRODUCER_AUDIO;

    tsk_bool_t started;
    uint64_t id;
}
twrap_producer_proxy_audio_t;

===4===
In the Java code, remove all static declarations of the audio/video
consumers/producers and create a static MediaManager object with its callback.
For example, each time the media layer (tinyMEDIA) creates a new audio producer,
the "twrap_producer_proxy_audio_ctor" function is called, and
"twrap_producer_proxy_audio_dtor" is called when it is destroyed. This is true
for both consumers and producers. In each constructor, use
"MediaManager::requestNewId()" to get a unique id and call
"MediaManagerCallback::OnConsumerCreated()" or
"MediaManagerCallback::OnProducerCreated()" to alert the Java world that a new
object has been created. You should also call "MediaManager::addProducer()" or
"MediaManager::addConsumer()" to take a reference to the object.
For example, when an audio producer is started, the
"twrap_producer_proxy_audio_start(tmedia_producer_t* self)" function is called.
Cast the parameter to "twrap_producer_proxy_audio_t" to get the id of the
producer, then look in the MediaManager for the proxy producer with the same
id. Once you have the ProxyAudioProducer object, call the
ProxyAudioProducer::start() callback to alert the Java world that the producer
has been started. Do the same for all other consumers and producers.
If you really want support for multiple audio/video sessions, I can change the
code.
Of course, only one session will be able to use the camera at a time.
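
The four steps can be sketched roughly as follows. This is only an illustration of the id-based registry idea, not the tinyWRAP implementation: MediaType, ProxyProducer, RecordingCallback, createProducer, findProducer, startProducer and producerCount are hypothetical stand-ins, std::map replaces tsk_list_t, requestNewId is non-static here for brevity, and only producers are shown (consumers would mirror them).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for twrap_media_type_t.
enum MediaType { kAudio, kVideo };

// Callback base class, following the shape proposed in step 2.
class MediaManagerCallback {
public:
    virtual ~MediaManagerCallback() {}
    virtual int OnProducerCreated(uint64_t id, MediaType type) { (void)id; (void)type; return -1; }
    virtual int OnProducerDestroyed(uint64_t id) { (void)id; return -1; }
};

// Hypothetical subclass playing the role of the Java-side override.
class RecordingCallback : public MediaManagerCallback {
public:
    std::vector<uint64_t> created;
    int OnProducerCreated(uint64_t id, MediaType) override { created.push_back(id); return 0; }
};

// Hypothetical proxy record; the real proxy wraps a native producer structure.
struct ProxyProducer {
    uint64_t id;
    MediaType type;
    bool started;
};

class MediaManager {
public:
    // The callback is held by pointer so that overrides are dispatched virtually.
    explicit MediaManager(MediaManagerCallback* callback) : callback_(callback), nextId_(0) {}

    uint64_t requestNewId() { return ++nextId_; }

    // What a native "ctor" hook would do: get an id, register, notify.
    uint64_t createProducer(MediaType type) {
        const uint64_t id = requestNewId();
        assert(producers_.find(id) == producers_.end()); // ids are never reused
        ProxyProducer producer = { id, type, false };
        producers_[id] = producer;
        if (callback_) callback_->OnProducerCreated(id, type);
        return id;
    }

    // Look up a producer by the id stored in the native structure.
    ProxyProducer* findProducer(uint64_t id) {
        std::map<uint64_t, ProxyProducer>::iterator it = producers_.find(id);
        return it == producers_.end() ? nullptr : &it->second;
    }

    // What a native "start" hook would do: find the matching proxy and start it.
    bool startProducer(uint64_t id) {
        ProxyProducer* producer = findProducer(id);
        if (!producer) return false;
        producer->started = true;
        return true;
    }

    // What a native "dtor" hook would do: unregister and notify.
    void removeProducer(uint64_t id) {
        if (producers_.erase(id) > 0 && callback_) callback_->OnProducerDestroyed(id);
    }

    std::size_t producerCount() const { return producers_.size(); }

private:
    MediaManagerCallback* callback_;
    uint64_t nextId_;
    std::map<uint64_t, ProxyProducer> producers_;
};
```

With this shape, two video producers can coexist: each gets its own id from createProducer, and starting one by id leaves the other untouched.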

One fullscreen video call using H263 + PCMA/PCMU takes at least 48% of the CPU
on an old Android 1.5 device (528MHz). At least 20% of the CPU is used by the
yuv2rgb functions from libswscale. This is very high because this library has
no ARM optimization, unlike libavcodec and libavutil. This cost can be divided
by 3 if we use the ARM assembler code from theorarm
(http://wss.co.uk/pinknoise/theorarm/). I'm working on the issue for an
IPTV/VoD solution. Of course the final implementation will be part of IMSDroid
;)
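
As a back-of-the-envelope check of these figures, the small helper below (cpuAfterSpeedup is a hypothetical name) estimates the total CPU share if only the color conversion becomes three times cheaper and everything else stays the same. Since both percentages are quoted as "at least", the result is a rough estimate, not a measurement.

```cpp
#include <cassert>

// Estimated total CPU share (in percent) after one component is made
// `speedup` times faster, assuming the rest of the workload is unchanged.
double cpuAfterSpeedup(double totalPct, double componentPct, double speedup) {
    assert(speedup > 0.0 && componentPct <= totalPct);
    return totalPct - componentPct + componentPct / speedup;
}
```

For the numbers quoted here, cpuAfterSpeedup(48.0, 20.0, 3.0) gives 48 - 20 + 20/3, or roughly 35% of the CPU for one call.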

Original comment by boss...@yahoo.fr on 19 Aug 2010 at 12:44

GoogleCodeExporter commented 9 years ago
Thanks for the information. No need to do it yourself; save your precious time
and spend it on something more useful for everyone. I'll implement these
changes myself in the near future, as soon as I have the promised prototype
device to test everything on. I will keep you informed of the progress.

Original comment by volkov.r...@gmail.com on 19 Aug 2010 at 2:16

GoogleCodeExporter commented 9 years ago
First of all, terminology: an asterisk (*) means either producer or consumer.
Now, the problem.
I have implemented some of the features listed here, but I still need your help on the subject. Attached is the code I've implemented, in case you need more information. Let's consider the following scenario: I am using the new architecture. I have wrapped the MediaManager and MediaManagerCallback classes with SWIG (attached, along with their C++ parts), and I override the callback class methods in a child class, MyMediaManagerCallback (attached). Whenever a twrap_proxy_*_ctor method is called, I call the On*Created method of the callback, but I always end up seeing the TSK_DEBUG_ERROR message from the original C++ class, as if my child class is never called, only the parent. This is problem number one.
Another problem is in the twrap_proxy_*_prepare methods in the Proxy*.cxx code (they are incorrect at the moment). When we had a singleton Proxy*::instance, we just checked whether it was NULL, and if it was not, we made the take* call, which attached the producer or consumer to its Proxy* instance. Now we have several instances of the Proxy* classes and several calls to twrap_proxy_*_prepare, so we do not know which particular Proxy* we need to attach our consumer or producer structure to. We could probably have known that if the callbacks worked; did you create them for exactly that purpose? If so, how can I make them work? Please help. I hope you have some time to check the code; there isn't much of it anyway.

Original comment by volkov.r...@gmail.com on 16 Sep 2010 at 4:10

Attachments:

GoogleCodeExporter commented 9 years ago
As you haven't published the SWIG interfaces, I cannot figure out why the
callback is not called.
Just make sure that you have registered your callback with the SWIG director
feature (see SipStack.i for an example). Also don't forget to change
"autogen.sh" so that it automatically patches the generated file for Android.
Please add "%include <stdint.i>" to your interface file to avoid creating SWIG
opaque types (SWIGTYPE_p_uint64_t).

Original comment by boss...@yahoo.fr on 17 Sep 2010 at 12:04

GoogleCodeExporter commented 9 years ago
I am sending you the requested SWIG interface and autogen file. Everything has
been patched in the same fashion as your code, and, as you can see in the
message above, the Java interfaces of MediaManager and MediaManagerCallback are
created successfully. The %include will be added, but it cannot be the reason
the callback is not called, right?

Original comment by volkov.r...@gmail.com on 17 Sep 2010 at 3:49

Attachments:

GoogleCodeExporter commented 9 years ago
You MUST use pointers to pass callback function to the native code from java.
You should have:
{{{
MediaManager(MediaManagerCallback* callback);
}}}
Instead of:
{{{
MediaManager(MediaManagerCallback callback);
}}}
Attached is the common directory I use for my tests. Everything works as
expected. I'm using C#.NET, but Java should work too.
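
The difference between the two signatures can be reproduced in plain C++. The sketch below shows object slicing: when a derived callback is passed by value, it is copied into a base-class object and the override is lost. The name() method and the two free functions are hypothetical and exist only for this demonstration; that the SWIG-generated director subclass was lost in the same way is an assumption consistent with the symptom described above, not something verified against the generated code.

```cpp
#include <cassert>
#include <string>

class MediaManagerCallback {
public:
    virtual ~MediaManagerCallback() {}
    virtual std::string name() const { return "base"; }
};

// Plays the role of the subclass that overrides the callback methods.
class MyMediaManagerCallback : public MediaManagerCallback {
public:
    std::string name() const override { return "derived"; }
};

// By value: the argument is copied into a plain MediaManagerCallback,
// so the derived part (and its override) is sliced away.
std::string callByValue(MediaManagerCallback callback) {
    return callback.name();
}

// By pointer: the dynamic type is preserved and the override is called.
std::string callByPointer(const MediaManagerCallback* callback) {
    assert(callback != nullptr);
    return callback->name();
}
```

Given a MyMediaManagerCallback object cb, callByValue(cb) returns "base" while callByPointer(&cb) returns "derived", which matches the "only the parent is called" behavior reported earlier in the thread.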

Original comment by boss...@yahoo.fr on 18 Sep 2010 at 12:01

Attachments:

GoogleCodeExporter commented 9 years ago
I was surprised to see MediaManagerCallback instead of its pointer in your 
original code, but thought it must be some special case and just implemented 
what you said. I should have asked when I had the suspicions, and yes, current 
code you gave me makes callbacks work, that means I can continue the 
implementation of my testing stuff. Thanks for help! You saved a lot of time, 
again.

Original comment by volkov.r...@gmail.com on 20 Sep 2010 at 9:19

GoogleCodeExporter commented 9 years ago
The code in comment 2 doesn't use SWIG and it's written for Windows Phone 7 
(using C# delegates). WP7 mandates unicode for marshaling.
Do you know if there is a simple way to add support for unicode in SWIG?

Original comment by boss...@yahoo.fr on 21 Sep 2010 at 5:15

GoogleCodeExporter commented 9 years ago
No, unfortunately this is the first project I've worked on that uses SWIG, so
I am definitely less informed than you on the subject. BTW, I added the include
you suggested to get rid of the SWIGTYPE_p_uint64_t usage, and those types were
replaced with java.math.BigInteger. Unfortunately, the library stopped working
right after that, throwing java.lang.NoSuchMethodError. It took me many hours
to find the cause, since I had also added some code myself and thought it was
my problem. As soon as I commented out that include, the library started
working again. Some signature misinterpretation or incompatibility, I think.

Original comment by volkov.r...@gmail.com on 21 Sep 2010 at 2:06

GoogleCodeExporter commented 9 years ago
MsrpMessage::getByteRange(int64_t,int64_t,int64_t) works fine. Perhaps there is
a problem with the unsigned type.
int64_t should be marshaled as long and uint64_t as BigInteger. Very strange.
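
The reason the two types are marshaled differently is range: a Java long is a signed 64-bit integer, so it cannot hold the upper half of the uint64_t range. A minimal sketch of that boundary follows; fitsInSigned64 and toSigned64 are hypothetical helper names, not part of SWIG or tinyWRAP.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when an unsigned 64-bit value also fits in a signed 64-bit integer,
// which is the range of a Java long.
bool fitsInSigned64(uint64_t value) {
    return value <= static_cast<uint64_t>(std::numeric_limits<int64_t>::max());
}

// Converts only values that are known to fit; larger values need a wider
// representation on the Java side.
int64_t toSigned64(uint64_t value) {
    assert(fitsInSigned64(value));
    return static_cast<int64_t>(value);
}
```

Ids handed out by a simple counter stay far below that boundary, so if the BigInteger mapping keeps causing trouble, one option (an assumption, not something tried in this thread) is to declare the id as int64_t in the wrapped interface.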

Original comment by boss...@yahoo.fr on 21 Sep 2010 at 2:18

GoogleCodeExporter commented 9 years ago
Continuing the research, I have found one more possible limitation/singleton.
I created two SipService instances and made them work in parallel. They both
managed to register fine, but as soon as I placed simultaneous SIP calls,
problems appeared. I started receiving the following message:

09-22 15:56:53.851: WARN/tinyWRAP(4822): **WARN: function: "tsk_fsm_act()" 
09-22 15:56:53.851: WARN/tinyWRAP(4822): file: "src/tsk_fsm.c" 
09-22 15:56:53.851: WARN/tinyWRAP(4822): line: "190" 
09-22 15:56:53.851: WARN/tinyWRAP(4822): MSG: State machine: No matching state 
found.

One theory was that this is due to a single state machine being used by both
SipService instances, and the debug info in tsip_dialog_invite_client_init
showed that this function is called only once, even if I make two calls.
If you know why, please shed some light to save me a lot of research time. A
hint on how to fix this would also be greatly appreciated.
Thanks in advance.

Original comment by volkov.r...@gmail.com on 22 Sep 2010 at 4:09

GoogleCodeExporter commented 9 years ago
There is no relation between the warning message and the SipService. The
current code allows simultaneous SIP calls. The only issue is the Audio/Video
producer/consumer singletons.
You don't need two SipService instances to implement video conferencing. If you
want to register more than one identity, just create as many
MyRegistrationSession objects as you need.
The state machine instances (tsk_fsm_t) are per SIP dialog. In most cases you
can safely ignore this warning.

Original comment by boss...@yahoo.fr on 22 Sep 2010 at 4:50

GoogleCodeExporter commented 9 years ago
I am sorry, one of us might be missing something. I am not a guru of the
imsdroid code, so let's suppose it's me who's wrong, but here are the questions
that I have after your response:
1. You say I don't need two SipService classes to implement more than one 
identity, but how do you imagine the work of your library when two simultaneous 
calls are made? Am I supposed to call CallSession_callAudioVideo twice with the 
same state machine? I am confused.
2. Yes, a SIP dialog is not closely related to SipService, but I have two of 
those, I make two different CallSession_callAudioVideo calls from each one, and 
somehow only one Dialog init function is being called. And I see that there is 
an unexpected state in a state machine. From those two facts I suppose that 
there might be only one state machine no matter how many dialogs are created, 
is this possible? Because with one client it is not noticeable - you have no 
simultaneous work of the dialogs, you register, THEN make a call, THEN hang up 
etc. So I asked you if that is possible, two SipService instances are just in 
case, it's safer that way.

Original comment by volkov.r...@gmail.com on 23 Sep 2010 at 7:24

GoogleCodeExporter commented 9 years ago
I re-checked all information and it appears this was my mistake - I mean 
2.(dialog not being created), as for 1.(One SipSession for two calls) I am 
still confused, thanks for your time.

Original comment by volkov.r...@gmail.com on 23 Sep 2010 at 10:58

GoogleCodeExporter commented 9 years ago
Of course you cannot (it is a SIP violation and doesn't follow any IETF/3GPP
specification).
- To call "bob" then "alice": 1)create a sip session 2)call bob 3)hangup 
4)reuse the same session to call "alice"
- To call "bob" and "alice" at the same time: 1) create two sessions and call 
them

To implement video conferencing you MUST use a single sip session and the 
participants must be embedded in the INVITE body using "multipart/mixed" (or 
related) content type. To add/remove a participant just send a REFER message.
If you look in the doubango source code you will see that there is an empty 
file named "tsip_dialog_invite.conf.c". This file will contain MMTel CONF 
implementation as per 3GPP TS 24.147.

An example of SIP INVITE for video conference:

Content-Type: multipart/mixed;boundary="boundary1"
Content-Length: (…)

--boundary1
Content-Type: application/sdp

v=0
o=- 2987933615 2987933615 IN IP6 5555::aaa:bbb:ccc:ddd
s=-
c=IN IP6 5555::aaa:bbb:ccc:ddd
t=0 0
m=audio 3456 RTP/AVP 0
m=video 6060 RTP/AVP 126
a=rtpmap:126 theora/90000
--boundary1
Content-Type: application/resource-lists+xml
Content-Disposition: recipient-list

<?xml version="1.0" encoding="UTF-8"?>
<resource-lists xmlns="urn:ietf:params:xml:ns:resource-lists"
xmlns:cp="urn:ietf:params:xml:ns:copycontrol">
<list>
<entry uri="sip:alice@doubango.org" />
<entry uri="sip:bob@doubango.org"/>
</list>
</resource-lists>
--boundary1--
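
A body like the one above can be assembled mechanically. The following is a minimal sketch of a multipart/mixed builder; BodyPart and buildMultipartMixed are hypothetical names and not part of the doubango API. It only concatenates the parts with CRLF line endings and does not validate the boundary or the content.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One part of a multipart body. contentDisposition may be left empty.
struct BodyPart {
    std::string contentType;
    std::string contentDisposition;
    std::string content;
};

// Joins the parts into a multipart/mixed body using the given boundary.
std::string buildMultipartMixed(const std::vector<BodyPart>& parts,
                                const std::string& boundary) {
    assert(!boundary.empty());
    const std::string crlf = "\r\n";
    std::string body;
    for (const BodyPart& part : parts) {
        body += "--" + boundary + crlf;
        body += "Content-Type: " + part.contentType + crlf;
        if (!part.contentDisposition.empty()) {
            body += "Content-Disposition: " + part.contentDisposition + crlf;
        }
        body += crlf;                  // blank line ends the part headers
        body += part.content + crlf;
    }
    body += "--" + boundary + "--" + crlf; // closing delimiter
    return body;
}
```

The SDP offer and the resource-lists XML would be passed as two BodyPart entries, and the size of the returned string is what goes into Content-Length.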

Original comment by boss...@yahoo.fr on 23 Sep 2010 at 12:38

GoogleCodeExporter commented 9 years ago
To monitor the conference session, SUBSCRIBE to "conference" event package as 
per RFC 4575(http://www.ietf.org/rfc/rfc4575.txt).

Original comment by boss...@yahoo.fr on 23 Sep 2010 at 12:42

GoogleCodeExporter commented 9 years ago
Thanks for the information, but my initial goal was not to implement a video
conference itself. It was to measure CPU, memory and other load during
decoding/encoding, to understand whether implementing such a thing is
economically justified on current-generation devices. To do that, I decided to
slightly modify your code to make two registrations instead of one, place two
simultaneous calls, and see how the program/library manages to decode two
streams at once. I have had some success: I implemented your
multi-producer/consumer architecture and fixed all the bugs, and can share the
results. At the moment it cannot decode two streams at once, unfortunately; I
am getting a progressively growing delay and a CPU load of around 92%-102%
according to top. I am currently researching ways to shorten the decode time;
as you said, the first thing to look at is optimizing the YUV conversion. Did
you have any success porting theorarm?

Original comment by volkov.r...@gmail.com on 4 Oct 2010 at 8:46

GoogleCodeExporter commented 9 years ago
And again my mistake: after going through the profiler output I found some
architecture mistakes in the consumers. After fixing them, the 2-screen video
worked OK with 30-38% CPU load on a Galaxy S, a pretty good result I must say.

Original comment by volkov.r...@gmail.com on 4 Oct 2010 at 1:22

GoogleCodeExporter commented 9 years ago
Hi,
About the CPU usage, you can compare "doubango" with other mobile videophones
(e.g. fring), but you will end up with the same result ;) There is an issue
(delay) with CIF video, but this is caused by the Android media system.
For video conferencing, I have worked with many companies (Cisco, Colibria,
...) and they all use an Ad-Hoc conference factory. This means that even if
you are holding a conference with 10 participants (or more), the mobile endpoint
will only encode/decode one video stream. All ten participants will
send/receive RTP packets to/from the server. It's up to the server to mix the
video streams (mosaic).

Even on a 4GHz PC you cannot hold a video conference with 10 participants (or
more), and this is why these systems are based on an Ad-Hoc factory.
As both the Colibria and Cisco implementations are not open, you can check
CONFIANCE (http://confiance.sourceforge.net/).

Original comment by boss...@yahoo.fr on 4 Oct 2010 at 1:51

GoogleCodeExporter commented 9 years ago
Hi Volkov.r..,
   Would you mind sharing the final code, with which you got the performance numbers? Thx

Original comment by sand...@yahoo.com on 4 Mar 2011 at 11:22

GoogleCodeExporter commented 9 years ago
Would you mind pointing out the architectural mistakes in consumers, as in my 
testing on a 600MHz processor, from IMSDroid, I find H263 videos @ CIF going 
out at less than 8 fps, and H264 videos going out at less than 4 fps, when 
running without a decoder. And when I run a 3-party call by placing the first 
receiver on hold, & making a second call to 2nd receiver, the performance was 
even shoddier, even though there was only 1-encode going on at any given time.
    I think the whole community could benefit from the performance improvements you've identified.

Original comment by sand...@yahoo.com on 7 Mar 2011 at 6:55

GoogleCodeExporter commented 9 years ago
@sand...@yahoo.com
It's nearly impossible to have smooth CIF (>15fps) video on an ARMv5TE device.
Only QCIF video will work without delay. For CIF or higher you should use an
ARMv7-a device (Samsung Galaxy S, HTC Hero, ...).
Here is the patch from "volkov.r...@gmail.com": 
http://code.google.com/p/doubango/issues/detail?id=21

Original comment by boss...@yahoo.fr on 7 Mar 2011 at 7:52

GoogleCodeExporter commented 9 years ago
I installed the Open IMS Core server on my Linux machine and the IMSDroid
application on my phone. Unfortunately, I have not been able to establish a
connection between IMSDroid and the Open IMS Core server. What can I do to
connect? Thank you.

Original comment by sghaier....@gmail.com on 7 Feb 2013 at 5:42