Open JonHVU opened 7 years ago
@JonHVU That's a mismatch between CHANNEL_CREATE/CHANNEL_DESTROY events. It seems the module received more channel destroy events than create. I need to fix that logic to at least validate that and not go below zero. Did you reload the module by any chance? I think this could happen if you do a module reload while there are active calls, because it won't remember the calls that are active already.
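To illustrate the validation mentioned above, here is a minimal sketch of a gauge that refuses to go below zero. It is not the module's actual code: `ClampedGauge` and its methods are hypothetical names, and the only assumption is that one handler runs per CHANNEL_CREATE and one per CHANNEL_DESTROY.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical gauge that tracks active sessions and never drops below zero.
pub struct ClampedGauge {
    value: AtomicI64,
}

impl ClampedGauge {
    pub const fn new() -> Self {
        ClampedGauge { value: AtomicI64::new(0) }
    }

    /// Intended to be called once per CHANNEL_CREATE event.
    pub fn inc(&self) {
        self.value.fetch_add(1, Ordering::SeqCst);
    }

    /// Intended to be called once per CHANNEL_DESTROY event.
    /// Returns false when the decrement was ignored because the gauge was
    /// already at zero (for example, a destroy for a call created before
    /// the module was loaded).
    pub fn dec(&self) -> bool {
        self.value
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |v| {
                if v > 0 { Some(v - 1) } else { None }
            })
            .is_ok()
    }

    /// Current value of the gauge.
    pub fn get(&self) -> i64 {
        self.value.load(Ordering::SeqCst)
    }
}
```

Clamping only hides the symptom: after a reload with calls in progress the gauge would still undercount until those calls end, so it would not replace querying the core for the real count.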
@moises-silva I'm getting the following negative value as well with active registrations:
# HELP freeswitch_registrations_active FreeSWITCH Active Registrations
freeswitch_registrations_active -26236 1499776145514
Total actual registrations on this tested switch is 12.
Any ideas?
Thanks
Troy
Sadly, it's buggy. I need to rewrite that to query the core explicitly for the registration counts as opposed to relying on events. The problem is starting FreeSWITCH when there's previous state (e.g. registrations already in the db). I hope it's not much of a problem; I'll try to get it done over the weekend.
@moises-silva thanks for the feedback! I've updated to the most recent commit and am still experiencing the same behavior, as well as other oddities at times, specifically on FS servers in a PostgreSQL BDR multi-master schema that are essentially on standby for failover. The primary's registrations do replicate the registration data to all nodes in the cluster. I've also noted that on the primary, active registrations continue to climb exponentially. Hope this helps.
Video of primary in the cluster: https://www.screencast.com/t/9aF7vw76fKe
Video of a standby: https://www.screencast.com/t/2m54fwOKsj7
I may have the queries done improperly. They are as follows:
Active sessions: freeswitch_sessions_active{instance=~"$node:.*"}
ASR: freeswitch_sessions_asr{instance=~"$node:.*"}
Active Calls (last 12 hours): ((freeswitch_sessions_answered_total{instance=~"$node:.*"} - freeswitch_sessions_failed_total{instance=~"$node:.*"}) / (freeswitch_sessions_answered_total{instance=~"$node:.*"} )) * 100
Active Registrations: freeswitch_registrations_active{instance=~"$node:.*"}
Heartbeats: ((freeswitch_heartbeats_total{instance=~"$node:.*"}) / (freeswitch_heartbeats_total{instance=~"$node:.*"} )) * 100
FreeSWITCH Registrations Total: freeswitch_registrations_total{instance=~"$node:.*"}
Please excuse my ignorance, I'm green with regard to Prometheus, Grafana and Rust for that matter. Loving every minute of this though.
I'm interested in seeing if I can correct this via ESL. As you mentioned, querying the core via a sofia request would be more accurate than log parsing, but I'm vague on exactly how to go about doing it. Your README makes note of its ability; any chance of a nudge in the right direction? I very much appreciate your quality work and other efforts regarding your project!
Once I get a handle on this, what I'd like to focus on next is being able to see registrations on a per domain basis e.g. hard sets for expected registrations for each domain as FS is a great multi-tenant platform. Then alarming on e.g. 10% or more registrations loss on a per domain basis.
Cheers, Troy
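As a rough sketch of the per-domain alarm idea above: given the expected registration count per domain (the "hard set") and the observed count, flag every domain whose loss meets a threshold. Everything here is hypothetical, including the function name and the example domains; the module does not currently expose per-domain counts, so the observed map would have to come from some other source.

```rust
use std::collections::HashMap;

/// Returns (domain, loss percentage) for every domain whose registrations
/// dropped by at least `threshold_pct` relative to the expected count.
/// Domains missing from `actual` are treated as having zero registrations.
pub fn domains_over_loss_threshold(
    expected: &HashMap<String, u32>,
    actual: &HashMap<String, u32>,
    threshold_pct: f64,
) -> Vec<(String, f64)> {
    let mut alarms: Vec<(String, f64)> = expected
        .iter()
        .filter_map(|(domain, &exp)| {
            if exp == 0 {
                // No registrations expected, so there is nothing to lose.
                return None;
            }
            let act = actual.get(domain).copied().unwrap_or(0);
            // Extra registrations beyond the expected count count as zero loss.
            let lost = exp.saturating_sub(act);
            let loss_pct = f64::from(lost) * 100.0 / f64::from(exp);
            if loss_pct >= threshold_pct {
                Some((domain.clone(), loss_pct))
            } else {
                None
            }
        })
        .collect();
    // Sort by domain name so the result is deterministic.
    alarms.sort_by(|a, b| a.0.cmp(&b.0));
    alarms
}
```

For example, with 20 expected and 19 observed for one domain (5% loss) and 10 expected and 8 observed for another (20% loss), a threshold of `10.0` would flag only the second domain. In practice the same comparison could also live in a Prometheus alerting rule once a per-domain metric exists.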
Hello,
I also get negative values for freeswitch_sessions_active. It seemed it started for a while and then it stabilized. I just started testing this.
The freeswitch_sessions_active metric is the most important value for us now.
Great work, hopefully it will be fixed!
@moises-silva just checking in on your mod_Prometheus as it's been a while. I reinstalled and recompiled and am still seeing the same possible issues: perpetual climbing of active registrations, failures, attempts, reg totals, and heartbeat. Is this behavior as intended? I haven't viewed new commits, but I'm itching to apply your mod. So much potential! Wish I had time to look into Rust. Is it now functional as presented? Am I misunderstanding the mentioned metrics? Thank you for your contribution. Cheers, Troy
Yeah, I wish I had the free time to spend on this but I don't. This module was an experiment to get a module written in Rust interfacing with FreeSWITCH. I'll put up a disclaimer in the README indicating it's broken and is only useful as an example of how to get a Rust module built for FreeSWITCH, but the bugs that were found have not been fixed and I can't really commit to when I'll be able to fix them (even more since I have no use for this module myself at the moment).
Thanks for the feedback @moises-silva
This issue may be solved by setting the gauges/counters to the values from FreeSWITCH's internal counters.
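A minimal sketch of that approach, assuming the core can be asked for its current counts at scrape time. The `CoreCounts` trait, `FixedCounts`, and `Gauges` are hypothetical stand-ins for illustration, not the module's code or a real FreeSWITCH binding; only the two metric names are taken from this thread.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical source of authoritative counts. In a real module this
/// would wrap whatever query the FreeSWITCH core exposes.
pub trait CoreCounts {
    fn active_sessions(&self) -> i64;
    fn active_registrations(&self) -> i64;
}

/// Stand-in implementation with fixed values, for demonstration only.
pub struct FixedCounts {
    pub sessions: i64,
    pub registrations: i64,
}

impl CoreCounts for FixedCounts {
    fn active_sessions(&self) -> i64 {
        self.sessions
    }
    fn active_registrations(&self) -> i64 {
        self.registrations
    }
}

pub struct Gauges {
    pub sessions_active: AtomicI64,
    pub registrations_active: AtomicI64,
}

impl Gauges {
    pub fn new() -> Self {
        Gauges {
            sessions_active: AtomicI64::new(0),
            registrations_active: AtomicI64::new(0),
        }
    }

    /// Overwrite both gauges with the core's view instead of adjusting
    /// them on events, so any drift from missed events is corrected.
    pub fn refresh(&self, core: &dyn CoreCounts) {
        self.sessions_active
            .store(core.active_sessions(), Ordering::SeqCst);
        self.registrations_active
            .store(core.active_registrations(), Ordering::SeqCst);
    }

    /// Render the gauges in the Prometheus text exposition format.
    pub fn render(&self) -> String {
        format!(
            "# TYPE freeswitch_sessions_active gauge\n\
             freeswitch_sessions_active {}\n\
             # TYPE freeswitch_registrations_active gauge\n\
             freeswitch_registrations_active {}\n",
            self.sessions_active.load(Ordering::SeqCst),
            self.registrations_active.load(Ordering::SeqCst),
        )
    }
}
```

Calling `refresh` just before `render` on every scrape would mean a module reload or pre-existing registrations in the database could not leave the gauges negative, since the values no longer depend on having seen every create and destroy event.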
Hi @moises-silva,
Sorry for the noise, but we are testing the module on a quiet box, and the active sessions gauge appears to have a negative value:
# HELP freeswitch_sessions_active FreeSWITCH Active Sessions
freeswitch_sessions_active -197 1495446539381
What could cause this?
Thanks
Jon