CyCoreSystems / ari-proxy

NATS or RabbitMQ message bus Asterisk REST Interface proxy system implemented in Go
Other
79 stars 34 forks source link

Runtime error using NATS client without go ARI package #28

Closed vitorespindola closed 4 years ago

vitorespindola commented 4 years ago

Hi!

Thank for this great project.

I'm connecting into the proxy using a python client and can manage to exchange a few messages successfully. I'm getting this error after sending a ChannelRing to a channel:

"panic: runtime error: invalid memory address or nil pointer dereference" error

The ChannelRing is executed successfully, but this error appear right after and the proxy process crashes.

ari-proxy_1 | t=2020-03-30T18:55:14+0000 lvl=info msg="starting ari-proxy server" version=5.0.1 ari-proxy_1 | t=2020-03-30T18:55:18+0000 lvl=warn msg="failed to publish NATS message" subject= data="&{Error: Data:<nil> Key:<nil> Keys:[]}" error="nats: invalid subject" ari-proxy_1 | panic: runtime error: invalid memory address or nil pointer dereference ari-proxy_1 | [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x867b53] ari-proxy_1 | ari-proxy_1 | goroutine 45 [running]: ari-proxy_1 | github.com/CyCoreSystems/ari-proxy/v5/server.(*Server).dialogsForEvent(0xc00009c500, 0xb77140, 0xc0001be100, 0xa3cfc0, 0xc0001be100, 0xc0000266f0) ari-proxy_1 | /home/travis/gopath/src/github.com/CyCoreSystems/ari-proxy/server/events.go:7 +0x113 ari-proxy_1 | github.com/CyCoreSystems/ari-proxy/v5/server.(*Server).runEventHandler(0xc00009c500, 0xb72a40, 0xc00005fb80) ari-proxy_1 | /home/travis/gopath/src/github.com/CyCoreSystems/ari-proxy/server/server.go:276 +0x51d ari-proxy_1 | created by github.com/CyCoreSystems/ari-proxy/v5/server.(*Server).listen ari-proxy_1 | /home/travis/gopath/src/github.com/CyCoreSystems/ari-proxy/server/server.go:221 +0x138e vagrant_ari-proxy_1 exited with code 2

Ulexus commented 4 years ago

There is no ChannelRing event. Valid events are listed here canonically or here in the Go ARI client code.

Can you describe your setup a little more? What is connected to what?

vitorespindola commented 4 years ago

Hi!

I meant ChannelRing command.

My env:

Asterisk 16.6.2 nats-server: v2.1.4 ari-proxy - docker latest image

This is a simple python script that generates the issue. It connects to nats and send a ChannelRing command on StasisStart event.

import asyncio
import json
from nats.aio.client import Client as NATS
from nats.aio.errors import ErrConnectionClosed, ErrTimeout, ErrNoServers

async def run(loop):
    nc = NATS()

    await nc.connect("nats://192.168.0.110:4222", loop=loop)

    async def message_handler(msg):
        subject = msg.subject
        reply = msg.reply
        data = json.loads(msg.data.decode())
        print(f"Received a message on '{subject} {reply}': {data}")
        if data.get('type') == 'StasisStart':
            print(data['channel']['id'])
            await ring(data['channel']['id'])

    async def ring(channel_id):
        cmd = {"Kind": "ChannelRing",
               "Key": {"id": channel_id}}
        try:
            await nc.publish("ari.command.endpoint-route",
                             json.dumps(cmd).encode())
        except ErrConnectionClosed as e:
            print("Connection closed prematurely.")
        except ErrTimeout as e:
            print(f"Timeout occured when publishing msg i={i}: {e}")
        except Exception as e:
            print(f"Error {e}")

    await nc.subscribe("ari.event.>", cb=message_handler)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(run(loop))
    try:
        loop.run_forever()
    finally:
        loop.close()
Ulexus commented 4 years ago

It's pretty important that you store and use the entire key you receive from that StasisStart event, since it contains additional routing information which is necessary when you scale beyond a single Asterisk instance.

I don't know off-hand that that is the problem, but it looks like it is going to be something like that. From your earlier post, the event that ari-proxy is getting from Asterisk is empty or unable to be parsed. If you turn on debug logging in ari-proxy, you should be able to see what is being received from Asterisk (at least the type).

vitorespindola commented 4 years ago

ari-proxy log with verbose enabled:

INFO[03-30|19:56:53] verbose logging enabled
INFO[03-30|19:56:53] starting ari-proxy server                version=master
DBUG[03-30|19:56:53] starting listener
DBUG[03-30|19:56:53] listening for events                     application=endpoint-route
DBUG[03-30|19:57:03] event received                           kind=ChannelVarset
DBUG[03-30|19:57:03] listening for events                     application=endpoint-route
DBUG[03-30|19:57:03] event received                           kind=ChannelCreated
DBUG[03-30|19:57:03] listening for events                     application=endpoint-route
DBUG[03-30|19:57:03] event received                           kind=ChannelDialplan
DBUG[03-30|19:57:03] listening for events                     application=endpoint-route
DBUG[03-30|19:57:03] event received                           kind=ChannelVarset
DBUG[03-30|19:57:03] listening for events                     application=endpoint-route
DBUG[03-30|19:57:03] event received                           kind=StasisStart
DBUG[03-30|19:57:03] listening for events                     application=endpoint-route
DBUG[03-30|19:57:03] received request                         kind=ChannelRing
WARN[03-30|19:57:03] failed to publish NATS message           subject= data="&{Error: Data:<nil> Key:<nil> Keys:[]}" error="nats: invalid subject"
DBUG[03-30|19:57:03] event received                           kind=DeviceStateChanged
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x876d03]

goroutine 42 [running]:
github.com/CyCoreSystems/ari-proxy/v5/server.(*Server).dialogsForEvent(0xc000020600, 0xb855e0, 0xc000021680, 0xa4fbc0, 0xc000021680, 0xc00002e7e0)
    /usr/src/ari-proxy/server/events.go:7 +0x113
github.com/CyCoreSystems/ari-proxy/v5/server.(*Server).runEventHandler(0xc000020600, 0xb807e0, 0xc000063c80)
    /usr/src/ari-proxy/server/server.go:276 +0x518
created by github.com/CyCoreSystems/ari-proxy/v5/server.(*Server).listen
    /usr/src/ari-proxy/server/server.go:221 +0x1275
Ulexus commented 4 years ago

Just before you get that error, you should see a message go out on NATS on the event subject. Can you trace that or capture it from your Python side? I'm guessing it is that DeviceStateChanged event.

Ulexus commented 4 years ago

Oh, never mind; that's the failed to publish NATS message before that.

vitorespindola commented 4 years ago

Yes, looks like the problem is with DeviceStateChanged event.

Here is a log after hangup a channel which is already active before starting the proxy:

INFO[03-30|20:03:13] verbose logging enabled
INFO[03-30|20:03:13] starting ari-proxy server                version=master
DBUG[03-30|20:03:13] starting listener
DBUG[03-30|20:03:13] listening for events                     application=endpoint-route
DBUG[03-30|20:03:19] event received                           kind=ChannelHangupRequest
DBUG[03-30|20:03:19] listening for events                     application=endpoint-route
DBUG[03-30|20:03:19] event received                           kind=StasisEnd
DBUG[03-30|20:03:19] listening for events                     application=endpoint-route
DBUG[03-30|20:03:19] event received                           kind=ChannelVarset
DBUG[03-30|20:03:19] listening for events                     application=endpoint-route
DBUG[03-30|20:03:19] event received                           kind=ChannelHangupRequest
DBUG[03-30|20:03:19] listening for events                     application=endpoint-route
DBUG[03-30|20:03:19] event received                           kind=ChannelDestroyed
DBUG[03-30|20:03:19] listening for events                     application=endpoint-route
DBUG[03-30|20:03:19] event received                           kind=DeviceStateChanged
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x876d03]

goroutine 42 [running]:
github.com/CyCoreSystems/ari-proxy/v5/server.(*Server).dialogsForEvent(0xc000020600, 0xb855e0, 0xc000021280, 0xa4fbc0, 0xc000021280, 0xc00002e750)
    /usr/src/ari-proxy/server/events.go:7 +0x113
github.com/CyCoreSystems/ari-proxy/v5/server.(*Server).runEventHandler(0xc000020600, 0xb807e0, 0xc000063c80)
    /usr/src/ari-proxy/server/server.go:276 +0x518
vitorespindola commented 4 years ago

DeviceStateChanged full event.

{
  "type": "DeviceStateChanged",
  "application": "endpoint-route",
  "timestamp": "2020-03-30T20:06:41.769+0000",
  "device_state": {
    "name": "PJSIP/6000",
    "state": "INUSE"
  },
  "asterisk_id": "52:54:00:e3:27:77"
}
Ulexus commented 4 years ago

It seems unlikely that that is actually our culprit. Are you seeing and receiving that event in NATS?

vitorespindola commented 4 years ago

It seems unlikely that that is actually our culprit. Are you seeing and receiving that event in NATS?

No.

Ulexus commented 4 years ago

Okay, in that case, would you mind adding a log entry when a nil key is received? Inside your nil check, just add an s.Log.Debug("received nil key for event", "event", e) so we can see what is in that post-parsed event which is causing that nil key.

vitorespindola commented 4 years ago
DBUG[03-30|20:17:54] received nil key for event               event="&{EventData:{Application:endpoint-route Dialog: Node:52:54:00:e3:27:77 Timestamp:2020-03-30 20:17:54.746 +0000 UTC Type:DeviceStateChanged} Header:map[] DeviceState:{Key:<nil> State:INUSE}}"
Ulexus commented 4 years ago

Well that's definitely it, then.. if you wouldn't mind adding that log entry (though it should probably be a log.Warn instead of log.Debug to your PR, I'll go ahead and merge it, then see if I can replicate it.

Ulexus commented 4 years ago

Thanks for your patience.

Ulexus commented 4 years ago

Released as v5.0.2

vitorespindola commented 4 years ago

Thank you!

Ulexus commented 4 years ago

No problem; I just changed it to Warn. Thanks