sipcapture / homer-app

HOMER 7.x Front-End and API Server
GNU Affero General Public License v3.0

QoS tabsence for all call flows (with media) - how to associate RTCP with calls? #342

Closed systemcrash closed 4 years ago

systemcrash commented 4 years ago

Pulled in yesterday's build of the webapp. Reviewing any call flow produces this web browser console error:

error Object { data: "", message: "no agent subscription object found for type [cdr]" }

More specifically:

ngOnInit http://thishost/main.5be002036effe68df150.js:1
    invoke http://thishost/polyfills.635d703039731e52383d.js:1
    onInvoke http://thishost/main.5be002036effe68df150.js:1
    invoke http://thishost/polyfills.635d703039731e52383d.js:1
    run http://thishost/polyfills.635d703039731e52383d.js:1
    O http://thishost/polyfills.635d703039731e52383d.js:1
    invokeTask http://thishost/polyfills.635d703039731e52383d.js:1
    onInvokeTask http://thishost/main.5be002036effe68df150.js:1
    invokeTask http://thishost/polyfills.635d703039731e52383d.js:1
    runTask http://thishost/polyfills.635d703039731e52383d.js:1
    y http://thishost/polyfills.635d703039731e52383d.js:1
    invokeTask http://thishost/polyfills.635d703039731e52383d.js:1
    f http://thishost/polyfills.635d703039731e52383d.js:1
    p http://thishost/polyfills.635d703039731e52383d.js:1

But if I look in Settings -> Agentsub, sure enough, there is a GUID row, albeit one which has expired.

I can click the red garbage icon, but I'm concerned that I will not be able to undo anything. There is no indication I can regain an Agentsub - whatever that is. Does this happen automatically at some connect or startup?

adubovikov commented 4 years ago

this is HEPSUB external subscription. If you don't have it - ignore it :)

systemcrash commented 4 years ago

Ah, because clicking delete shows, BIG FAT RED WARNING:

Are you sure?

Your will not be able to recover this!

systemcrash commented 4 years ago

So... what (verifiable) "settings" are necessary so I get a QoS tab on call flows? ( Ours seems to have disappeared )

lmangani commented 4 years ago

Disappeared? Is this confirmed from calls with known statistics and/or are there any client side errors while opening those sessions?

systemcrash commented 4 years ago

Looking in Grafana, I see the Jitter graph fill nicely - but I open that call in the graph in Homer. No tab. What client side (web browser?) errors should I look out for, aside from the ones I mentioned at the start?

adubovikov commented 4 years ago

do you have rtcp messages in your PG DB ?

systemcrash commented 4 years ago

How does one confirm this?

adubovikov commented 4 years ago

select * from hep_proto_5_default

systemcrash commented 4 years ago

So, after performing a slightly more complex command of:

# docker-compose exec db psql -h localhost -U root
\c homer_data
You are now connected to database "homer_data" as user "root".
(367 rows)

select * from hep_proto_5_default;
 17526 | 662d7428-51a08047bcf-381e3f81 | 2020-05-13 09:04:33.350507+00 | {"dstIp": "2001:*************", "srcIp": "2001:***********", "dstPort": 58081, "srcPort": 61343, "protocol": 17, "captureId": "1", "payloadType": 5, "timeSeconds": 1589360673, "timeUseconds": 350507, "correlation_id": "662d7428-51a08047bcf-381e3f81", "protocolFamily": 10} | {"node": "1", "proto": "rtcp"} | {"report_count":1,"type":200,"ssrc":2824075690,"sender_information":{"ntp_timestamp_sec":20050,"ntp_timestamp_usec":657129472,"rtp_timestamp":1591042469,"packets":94802,"octets":15168320},"report_blocks":[{"source_ssrc":3033880513,"fraction_lost":0,"packets_lost":0,"highest_seq_no":116705,"ia_jitter":11,"lsr":916446682,"dlsr":1189831704}]}
(1209 rows)

I confirm that there are plenty of RTCP messages in the DB. What next?
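
For reference, the last column of the row above is the RTCP report itself, stored as JSON. A minimal sketch of pulling the loss and jitter figures out of such a payload follows. The helper name `summarizeRtcp` is made up for illustration and is not part of Homer. The fraction-lost scaling assumes the usual RFC 3550 encoding (an 8-bit fixed-point value), and jitter is left in RTP timestamp units.

```javascript
// Hypothetical helper (not part of Homer): summarize one RTCP report
// as stored in the last column of hep_proto_5_default.
function summarizeRtcp(rawJson) {
  var report = JSON.parse(rawJson);
  return (report.report_blocks || []).map(function (block) {
    return {
      reporterSsrc: report.ssrc,
      sourceSsrc: block.source_ssrc,
      packetsLost: block.packets_lost,
      // Assumption: RFC 3550 fixed-point encoding, i.e. value / 256.
      fractionLost: block.fraction_lost / 256,
      // Interarrival jitter in RTP timestamp units, not milliseconds.
      jitter: block.ia_jitter
    };
  });
}

// Payload copied from the row shown above.
var sample = '{"report_count":1,"type":200,"ssrc":2824075690,"sender_information":{"ntp_timestamp_sec":20050,"ntp_timestamp_usec":657129472,"rtp_timestamp":1591042469,"packets":94802,"octets":15168320},"report_blocks":[{"source_ssrc":3033880513,"fraction_lost":0,"packets_lost":0,"highest_seq_no":116705,"ia_jitter":11,"lsr":916446682,"dlsr":1189831704}]}';

console.log(summarizeRtcp(sample));
```

This only reads the payload; it says nothing about how the row is tied to a call, which is what the correlation_id in the fourth column is for.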

I need to point out that I am also affected by the Homer bug #332 that I cannot specify a custom time-range.

lmangani commented 4 years ago

Thanks - so great, now please try to select a session with the Call-ID you're looking at and expecting results from. If there are results in the DB but not in the UI, we have a path for investigation to follow.

systemcrash commented 4 years ago

I pull up the call, which shows up when searching by Correlation-ID, from the RTCP table column 2. I have results in the DB and in the UI. I click on the session-ID, to open the graph, and the QoS tab quickly disappears (probably as the script realises the tab is empty)

The request to http://myhost/api/v3/call/report/qos looks like


I notice the discrepancy between what the RTCP table has: 662d7428-51a08047bcf-381e3f81 and the QoS request has: 662d7428-51a08047bcf-381e3f81@sipgt-696e2e71 - important?



What now?

This graph is a mix of IPv4 and IPv6...

systemcrash commented 4 years ago

> I notice the discrepancy between what the RTCP table has: 662d7428-51a08047bcf-381e3f81 and the QoS request has: 662d7428-51a08047bcf-381e3f81@sipgt-696e2e71 - important?


It turns out, boys and girls, that this is very important. I did a copy -> cURL request and modified the cURL request locally. When I removed the @sipgt-696e2e71 portion, I got a shit-ton of results back. Kind of obvious now that I think about it.

lmangani commented 4 years ago

Thanks for the details - they sure help and you're spot on - The correlation ID must match in order to return results. The real question is rather why the one you have in your RTCP reports is truncated. Could you perhaps confirm by locating any sessions without @ or other special characters in the Call-ID returning RTCP statistics in the UI?

lmangani commented 4 years ago

@systemcrash could you confirm what agent is pushing data into the setup?

systemcrash commented 4 years ago

Our own agent. We largely followed the C and other source implementations of yours.

So here's the thing - calls traverse the b2bua. As does media. The first leg might lack the @blah portion. The second leg (after traversal) will have it.

I can search by the first shorter Call-ID (no @blah portion) and I only get that first call leg graphed, although no QoS. When I compare with the HEP3 packets, I am searching using the Correlation-ID which the RTCP packets had at HEP3 ingress. So it seems there should be a QoS tab for that. When I check in the DB, there are definitely RTCP entries for that Correlation-ID...

systemcrash commented 4 years ago

Tried the cURL trick for that ID, but nothing comes back.

lmangani commented 4 years ago

Those are literally two different sessions. In theory if the B2BUA correlation is configured right in Homer the backend should query both A and B leg call-ids for statistics, logs, etc.

Are the calls correlated in the UI?

systemcrash commented 4 years ago

> Those are literally two different sessions. In theory if the B2BUA correlation is configured right in Homer the backend should query both A and B leg call-ids for statistics, logs, etc.
>
> Are the calls correlated in the UI?

Yes - the two legs show - but only if I search using the shorter ID via the correlation-ID field. It seems both calls correlate using the shorter ID only.

Then if I click the longer (more specific) ID, I get both legs in the graph. If I click on the shorter, only the first leg in the graph. Neither show QoS.

I guess that's what you mean by 'correlated'

adubovikov commented 4 years ago

you can "fix" it using a correlation array in the UI settings. In this case you can write a simple javascript function that will cut-off the right part of callid after @ and add this as a new callid and the request will be for two callids: 662d7428-51a08047bcf-381e3f81@sipgt-696e2e71 and 662d7428-51a08047bcf-381e3f81
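
A minimal sketch of such a function, written as a standalone helper so it can be tried outside Homer. It assumes the correlation script receives an array of Call-ID strings and that whatever array it returns is the list of IDs queried; the name `expandCallIds` is hypothetical.

```javascript
// Hypothetical helper: for each Call-ID, keep the original and, if it
// contains an '@', also add the part before the '@'.
function expandCallIds(data) {
  var returnData = [];
  for (var i = 0; i < data.length; i++) {
    var callid = data[i];
    if (returnData.indexOf(callid) === -1) {
      returnData.push(callid);
    }
    var position = callid.indexOf('@');
    if (position !== -1) {
      var shortId = callid.substring(0, position);
      if (returnData.indexOf(shortId) === -1) {
        returnData.push(shortId);
      }
    }
  }
  return returnData;
}

console.log(expandCallIds(['662d7428-51a08047bcf-381e3f81@sipgt-696e2e71']));
// [ '662d7428-51a08047bcf-381e3f81@sipgt-696e2e71',
//   '662d7428-51a08047bcf-381e3f81' ]
```

Call-IDs without an '@' pass through unchanged, so the same function can be applied to every call.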

systemcrash commented 4 years ago

> you can "fix" it using a correlation array in the UI settings. In this case you can write a simple javascript function that will cut-off the right part of callid after @ and add this as a new callid and the request will be for two callids: 662d7428-51a08047bcf-381e3f81@sipgt-696e2e71 and 662d7428-51a08047bcf-381e3f81

Thanks - this could be a workaround for the call I mentioned. But this last call I make, I get nothing back when I use the correlation-ID from the RTCP DB entries.

adubovikov commented 4 years ago

Screenshot from 2020-05-13 15-13-00

adubovikov commented 4 years ago

> > you can "fix" it using a correlation array in the UI settings. In this case you can write a simple javascript function that will cut-off the right part of callid after @ and add this as a new callid and the request will be for two callids: 662d7428-51a08047bcf-381e3f81@sipgt-696e2e71 and 662d7428-51a08047bcf-381e3f81
>
> Thanks - this could be a workaround for the call I mentioned. But this last call I make, I get nothing back when I use the correlation-ID from the RTCP DB entries.

check the query to PG and you will see why it doesn't work

systemcrash commented 4 years ago

Where can I monitor that?

adubovikov commented 4 years ago

two ways:

  1. activate debug in the webapp
  2. using ngrep on port 5432

lmangani commented 4 years ago

@systemcrash for an unoptimized test, you can stick a '@%' postfix to the A-Leg to find the B-Leg.
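
To see what the '@%' postfix would and would not match, here is a small sketch that emulates SQL LIKE semantics ('%' for any run of characters, '_' for any single character). It assumes the search value ends up in a LIKE-style comparison, which the thread suggests but does not confirm; `likeToRegExp` is a made-up helper.

```javascript
// Hypothetical helper: turn a SQL LIKE pattern into an anchored RegExp.
function likeToRegExp(pattern) {
  // Escape regex metacharacters first; '%' and '_' are not among them.
  var escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var body = escaped.replace(/%/g, '.*').replace(/_/g, '.');
  return new RegExp('^' + body + '$');
}

var aLeg = '662d7428-51a08047bcf-381e3f81';
var matcher = likeToRegExp(aLeg + '@%');

console.log(matcher.test('662d7428-51a08047bcf-381e3f81@sipgt-696e2e71')); // true
console.log(matcher.test('662d7428-51a08047bcf-381e3f81@sipgt-08f97125')); // true
console.log(matcher.test(aLeg)); // false: the bare A-Leg ID has no '@'
```

Under that assumption the pattern finds the suffixed IDs but not the bare A-Leg ID itself, so a lookup that needs both has to include both forms.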

systemcrash commented 4 years ago

> @systemcrash for an unoptimized test, you can stick a '@%' postfix to the A-Leg to find the B-Leg.

When searching? No results when adding @% via the correlation-id search field

systemcrash commented 4 years ago

> two ways:
>
> 1. activate debug in the webapp

Where do I set this?

adubovikov commented 4 years ago

> > two ways:
> >
> > 1. activate debug in the webapp
>
> Where do I set this?

let me guess.... in the config file ?

systemcrash commented 4 years ago

> > @systemcrash for an unoptimized test, you can stick a '@%' postfix to the A-Leg to find the B-Leg.
>
> When searching? No results when adding @% via the correlation-id search field

OK - this works, but only via the Session ID and Call ID columns.

adubovikov commented 4 years ago

@systemcrash you already have the solution - please write your own javascript function and it will solve YOUR issue

systemcrash commented 4 years ago

> > > two ways:
> > >
> > > 1. activate debug in the webapp
> >
> > Where do I set this?
>
> let me guess.... in the config file ?

Oh, great. Well, as we mentioned in the other issue here yesterday this flag doesn't exist yet in the docker files. So when I try the debug flag, no go. :(

systemcrash commented 4 years ago

> @systemcrash you have already the solution - please make an own javascript function and it will solve YOUR issue

Nope - overall functionality doesn't change. It's the same as listing search results. And the QoS tab still disappears. :(

adubovikov commented 4 years ago

this flag has nothing to do with DOCKER, this is a system param of homer-app, if you set it, you can see debug output

adubovikov commented 4 years ago

> > @systemcrash you have already the solution - please make an own javascript function and it will solve YOUR issue
>
> Nope - overall functionality doesn't change. It's the same as listing search results. And the QoS tab still disappears. :(

Let me make it short. You have two calls: call-ID: 123456 and 123456@aaa . You have only reports for "123456" and you wanna see reports once you click on "123456@aaa" ? Is it correct ? If yes, you have to do manipulation in the correlation mapping to get real CALLID and send request for QOS for both.

systemcrash commented 4 years ago

> this flag has nothing to do with DOCKER, this is a system param of homer-app, if you set it, you can see debug output

But where do I set it? I have to set that flag in the docker-compose.yml file so that homer picks it up. Doing this, there is no change in logging.

systemcrash commented 4 years ago

> > this flag has nothing to do with DOCKER, this is a system param of homer-app, if you set it, you can see debug output
>
> But where do I set it? I have to set that flag in the docker-compose.yml file so that homer picks it up. Doing this, there is no change in logging.

docker-compose exec homer-webapp bash
cat /usr/local/homer/etc/webapp_config.json

shows only

  "system_settings": {
    "hostname": "homer",
    "logname": "homer-app.log",
    "logpath": "/usr/local/homer",
    "uuid": "6c0033f0-9f89-40f3-a009-9369c7b271db"

Those new lines which you recently added are not there...

lmangani commented 4 years ago

I think the following correlation mapping block for call -> SIP should do the trick:

        "source_field": "data_header.callid",
        "lookup_id": 1,
        "lookup_profile": "call",
        "lookup_field": "data_header->>'callid'",
        "lookup_range": [
        "input_function_js": "var returnData=[]; for (var i = 0; i < data.length; i++) { returnData.push(data[i]+'@%'); }; returnData;"

systemcrash commented 4 years ago

> > > @systemcrash you have already the solution - please make an own javascript function and it will solve YOUR issue
> >
> > Nope - overall functionality doesn't change. It's the same as listing search results. And the QoS tab still disappears. :(
>
> Let me make it short. You have two calls: call-ID: 123456 and 123456@aaa . You have only reports for "123456" and you wanna see reports once you click on "123456@aaa" ? Is it correct ? If yes, you have to do manipulation in the correlation mapping to get real CALLID and send request for QOS for both.

Well, no - I have no QoS tab.

I have reports for both legs. Stored under the truncated Correlation-ID in the RTCP table.

I get good graphs for both IDs. I get both legs if I search using the truncated ID. I don't really care about the graphs. They're good.

But no graph shows QoS tab.

systemcrash commented 4 years ago

> I think the following correlation mapping block for call -> SIP should do the trick:
>
>         "source_field": "data_header.callid",
>         "lookup_id": 1,
>         "lookup_profile": "call",
>         "lookup_field": "data_header->>'callid'",
>         "lookup_range": [
>         "input_function_js": "var returnData=[]; for (var i = 0; i < data.length; i++) { returnData.push(data[i]+'@%'); }; returnData;"

Thank you - this is what I figured out from @adubovikov screenshot. Tried this, but still no QoS tab. I see only that it helps to correlate calls a bit more.

lmangani commented 4 years ago

Ok - next, you need to look at the QoS query and check that it contains the modified callid too.

systemcrash commented 4 years ago

Ok - so searching via correlation-ID for e.g. 662d7428-51a08047bcf-381e3f81 shows two calls

662d7428-51a08047bcf-381e3f81@sipgt-696e2e71 and 662d7428-51a08047bcf-381e3f81@sipgt-08f97125

I can only graph one by clicking on it.

I dig into the query stuff, and I see the QoS query contains the @sipgt-696e2e71 portion. How do I truncate this? Mappings? HepSub?

adubovikov commented 4 years ago

no, the question is: why do you have "@sipgt-696e2e71" in the QOS query? Is it the callid?

On Wed, 13 May 2020 at 16:15, Paul Dee wrote:

> Ok - so searching for e.g. 662d7428-51a08047bcf-381e3f81 shows two calls
>
> 662d7428-51a08047bcf-381e3f81@sipgt-696e2e71 and 662d7428-51a08047bcf-381e3f81@sipgt-08f97125
>
> I can only graph one by clicking on it.
>
> I dig into the query stuff, and I see the QoS query contains the @sipgt-696e2e71 portion. How do I truncate this? Mappings? HepSub?


systemcrash commented 4 years ago

It seems to. I click on the search result to graph that call-ID (result column Session-ID).

systemcrash commented 4 years ago

OK - after your script suggestion, I finally get a QoS tab for a call, but ONLY IF THE CALL ID, SESSION ID, and CORRELATION ID I clicked on CONTAINS NO @portion

systemcrash commented 4 years ago

So ANY call ID which contains an @portion, does not get a QoS graph.

adubovikov commented 4 years ago

fix your function: pseudo code:

var position = callid.indexOf("@");
if (position != -1) {
    callid = callid.substring(0, position);
}

systemcrash commented 4 years ago

We ingest calls from many endpoints, and they will often already contain an @portion, so being able to handle this in general seems important. RFC 3261 on Call-ID:

   The Call-ID header field acts as a unique identifier to group
   together a series of messages.  It MUST be the same for all requests
   and responses sent by either UA in a dialog.  It SHOULD be the same
   in each registration from a UA.

   Use of cryptographically random identifiers (RFC 1750 [12]) in the
   generation of Call-IDs is RECOMMENDED.  Implementations MAY use the
   form "localid@host".  Call-IDs are case-sensitive and are simply
   compared byte-by-byte.

      Using cryptographically random identifiers provides some
      protection against session hijacking and reduces the likelihood of
      unintentional Call-ID collisions.

   No provisioning or human interface is required for the selection of
   the Call-ID header field value for a request.

   For further information on the Call-ID header field, see Section


systemcrash commented 4 years ago

Let me ask this another way: should correlation-IDs be truncated when they go into the RTCP table? Or should they contain the full call-ID?

systemcrash commented 4 years ago

> fix your function: pseudo code:
>
> var position = callid.indexOf("@");
> if (position != -1) {
>     callid = callid.substring(0, position);
> }

This should go in the same mapping->SIP->call?

lmangani commented 4 years ago

Perhaps you should just reverse the logic and truncate the suffix?

        "source_field": "data_header.callid",
        "lookup_id": 1,
        "lookup_profile": "call",
        "lookup_field": "data_header->>'callid'",
        "lookup_range": [
        "input_function_js": "var returnData=[]; for (var i = 0; i < data.length; i++) { returnData.push(data[i].split('@')[0] ); }; returnData;"
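
The snippet string above can be tried outside Homer before pasting it into the mapping. The sketch below evaluates it with `data` in scope and takes the value of the last expression. How homer-app actually runs `input_function_js` is not shown in this thread, so `runInputFunction` is only a hypothetical stand-in for that step.

```javascript
// The exact string from the mapping above.
var inputFunctionJs = "var returnData=[]; for (var i = 0; i < data.length; i++) { returnData.push(data[i].split('@')[0] ); }; returnData;";

// Hypothetical stand-in for the backend: run the snippet with `data`
// in scope and return the value of its last expression.
function runInputFunction(source, data) {
  return new Function('data', 'return eval(' + JSON.stringify(source) + ');')(data);
}

var result = runInputFunction(inputFunctionJs, [
  '662d7428-51a08047bcf-381e3f81@sipgt-696e2e71',
  '662d7428-51a08047bcf-381e3f81@sipgt-08f97125',
  '123456'
]);

console.log(result);
// [ '662d7428-51a08047bcf-381e3f81',
//   '662d7428-51a08047bcf-381e3f81',
//   '123456' ]
```

Both suffixed IDs collapse to the same short ID, and IDs without an '@' pass through unchanged. The snippet does not remove duplicates, so the same short ID can appear more than once in the result.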