Closed achaloyan closed 9 years ago
Thanks, I'll try to review this hopefully later today.
Reported by achaloyan
on 2009-06-09 13:38:24
Accepted
Hi Chaitanya,
I reviewed the patch, just wanted to discuss the remaining issues.
1. Sending procedure and use of RTP header fields
One out-of-band DTMF event SHOULD not be sent in one RTP packet.
Suppose you want to send DTMF "1" and the duration of this event is 100msec (800
samples in case of 8kHz) at timestamp = 2400.
RTP stream should look like
seqnum timestamp marker duration end-of-event
1 2400 1 160 0
2 2400 0 320 0
3 2400 0 480 0
4 2400 0 640 0
5 2400 0 800 1
6 2400 0 800 1
7 2400 0 800 1
Please note, RTP marker bit MUST be set on the first packet of the event, while the
last packet MUST have end-of-event bit set and the last packet SHOULD be
retransmitted. Sequence number MUST be increased, but RTP timestamp MUST be the same
http://tools.ietf.org/html/rfc4733#section-2.2
2. Also consider that audio stream and named-events can be sent simultaneously.
That's why I defined bit masks for them. However this is very rarely used and usually
either audio stream or named-event is transmitted at a time.
3. What if answer from server contains no "telephone-event" in the codec list, but
user is trying to send named-events. What should we do in this case? Drop those
events or use some default payload type for them.
Reported by achaloyan
on 2009-06-10 06:27:23
Hi Arsen,
Thanks for the detailed code review. Shows how much I missed while coding. But
actually I did that to just get something working first and then improve one by one.
For point one, how do you suggest we should proceed?Should I maintain a buffer in
transmit?
I will look into point 2.
For point 3, I think first of all a client should not send a named event if it did
not get a "telephone-event" in the codec list during SDP negotiation, as the whole
purpose of negotiation is to identify the capabilities of the server and the client.
In case the client sends it, then I guess the server should drop it, or maybe we
could drop it as well.What do you say?
Reported by chaitanya.chokkareddy
on 2009-06-10 07:48:16
Clearly it's not possible to implement all the required stuff in a day and I see no
problem over there, at least we get started.
I'll try to look into this more closely later.
Thanks.
Reported by achaloyan
on 2009-06-10 08:18:54
Hi Chaitanya,
My thoughts regarding point 1 and also the user/application level interface.
The typical use case is: user wants to generate/send a DTMF event with the default
or
specified duration.
The other use case is: user wants to start DTMF event transmission, but doesn't know
the duration of the event by that time. Suppose, user pushes a button, and the event
transmission should be started and continued till he releases that button. I believe
you are interested mostly in the first case, thus we can implement only that case
now, but implementation and interface should assume both cases.
Anyway, the event, which generation/transmission is in progress, should be stored in
transmitter and continuously processed.
Reported by achaloyan
on 2009-06-11 07:32:11
Hi Arsen,
As you said rightly, I am now looking at the first use case only.So for that can't
we
use something like :
frame->type |= MEDIA_FRAME_TYPE_EVENT;
char string[5];
sprintf( string, "%c", fr->subclass );
if(strcmp(string,"#")==0)
frame->event_frame.event_id=11;
else if(strcmp(string,"*")==0)
frame->event_frame.event_id=10;
else
frame->event_frame.event_id=atoi(string);
frame->event_frame.event_id=atoi(string);
frame->event_frame.volume=10;
frame->event_frame.edge=1;
frame->event_frame.duration=1000;
I have been sort of using this in one of my applications and it seems to be working
well.
Should we put this in a user level function which users can call?If so where do you
think is the right place to add this function?
Thanks.
-Chaitanya
Reported by chaitanya.chokkareddy
on 2009-08-04 09:34:57
Hi,
thanks for developing DTMF support which is essential in IVR applications. I am
interested in server-side DTMF, though, let me suggest better character <=> event_id
translation:
frame->type |= MEDIA_FRAME_TYPE_EVENT;
if ((fr->subclass >= '0') && (fr->subclass <= '9'))
frame->event_frame.event_id = fr->subclass - '0';
else if (fr->subclass == '*')
frame->event_frame.event_id = 10;
else if (fr->subclass == '#')
frame->event_frame.event_id = 11;
else if ((fr->subclass >= 'A') && (fr->subclass <= 'D'))
frame->event_frame.event_id = 12 + fr->subclass - 'A';
else
frame->event_frame.event_id = 255; /* Invalid DTMF event */
/* ... */
Cheers
- Vali
Reported by tomas.valenta@speechtech.cz
on 2009-08-04 12:28:38
Hi,
Chaitanya, in this case I'll try to apply what we have for now. I mean the
modifications you made so far.
Vali, I agree with you. Just wonder if you could provide two functions, which simply
implement character <=> event_id translations. I'll be happy to apply them.
Thank you both for your input.
Reported by achaloyan
on 2009-08-04 13:26:37
Hi Arsen,
no problem. I am happy I can help a bit.
apr_uint32_t dtmf_char2event_id(const char dtmf_char)
{
if ((dtmf_char >= '0') && (dtmf_char <= '9'))
return dtmf_char - '0';
else if (dtmf_char == '*')
return 10;
else if (dtmf_char == '#')
return 11;
else if ((dtmf_char >= 'A') && (dtmf_char <= 'D'))
return 12 + dtmf_char - 'A';
else
return 255; /* Invalid DTMF event */
}
char event_id2dtmf_char(const apr_uint32_t event_id)
{
if (event_id <= 9)
return '0' + (char)event_id;
else if (event_id == 10)
return '*';
else if (event_id == 11)
return '#';
else if (event_id <= 15)
return 'A' + (char)event_id;
else
return 0; /* Not a DTMF event */
}
Let me also note something to plugin DTMF processing. I can't wait receiving DTMF
digits :-) because I have already implemented processing in my plugin. Now it looks
like DTMF digits will be received as named events in audio stream's write_frame. IMO
this is not very happy solution because named events contain needless much
information such as volume, duration etc. And worse, plugin implementer receives the
same digit each frame throughout its duration. I find this behaviour unnecessarily
low-level, but the final implementation is, of course, up to you.
Maybe I should have written the last note to another thread.
Cheers
- Vali
Reported by tomas.valenta@speechtech.cz
on 2009-08-04 15:40:32
Aw, sorry for mistake. This should be right:
char event_id2dtmf_char(const apr_uint32_t event_id)
{
if (event_id <= 9)
return '0' + (char)event_id;
else if (event_id == 10)
return '*';
else if (event_id == 11)
return '#';
else if (event_id <= 15)
return 'A' + (char)event_id - 12;
else
return 0; /* Not a DTMF event */
}
Reported by tomas.valenta@speechtech.cz
on 2009-08-04 15:48:27
Chaitanya, Vali thanks for the suggestions and patches. Both are applied to the trunk
r1089 and r1090.
Reported by achaloyan
on 2009-08-17 07:54:21
Started
Just an update to clarify current status of this issue.
Sending and receiving procedures must be further enhanced/implemented.
Reported by achaloyan
on 2009-10-18 13:33:40
Watching the discussions I can see that I am not the only one that would like UniMRCP
to support DTMF. Having read carefully and understood the current state I think I
would be able to write a client class, say mpf_dtmf_generator, generating
frames/packets according to RFC4733 (or audio tones alternatively). I am more
interested in server-side though. If the server supported delivering named events to
the engine channel's stream_write(), then I could create mpf_dtmf_detector.
Now it is possible to send either event or audio (not both at a time) from the
client, right? My idea is that the generator would have a function
apt_bool_t stream_read(generator, frame)
returning TRUE if the frame was generated by the detector (sending digits from queue)
or FALSE if there are no digits in queue.
So is this the way we should go? Are you interested?
Reported by tomas.valenta@speechtech.cz
on 2009-10-18 16:47:42
Hi Vali,
I just welcome your initiative and this would be perfect way to go.
Moreover, if you start implementing mpf_generator and mpf_detector entities, I'll try
to go along with you and implement missing functionality in the core.
Reported by achaloyan
on 2009-10-19 09:55:22
Hi everybody,
I have created the top-level classes for dealing with DTMF. Actually I should say
they are a draft. They work fine with in-band digits, but not with out-of-band ones,
since it is not supported in MPF yet. See the attached headers for docs. The idea is
as follows:
in demo_recog_application.c:
#include "mpf_dtmf_generator.h"
static apt_bool_t recog_app_stream_open(mpf_audio_stream_t *stream, mpf_codec_t *codec)
{
dtmf = mpf_dtmf_generator_create(stream, pool);
}
static apt_bool_t recog_app_stream_read(mpf_audio_stream_t *stream, mpf_frame_t *frame)
{
if (mpf_dtmf_generator_sending(dtmf)) {
if (mpf_dtmf_generator_put_frame(dtmf, frame))
return TRUE;
/* Inter-digit silence */
frame->type |= MEDIA_FRAME_TYPE_AUDIO;
memset(frame->codec_frame.buffer, 0, frame->codec_frame.size);
return TRUE;
}
/* Stream other audio */
}
handle_recognition_in_progress(...)
{
mpf_dtmf_generator_enqueue(dtmf, "0123456789*#ABCD");
}
And in demo_recog_engine.c:
#include "mpf_dtmf_detector.h"
static apt_bool_t demo_recog_channel_open(mrcp_engine_channel_t *channel)
{
dtmf = mpf_dtmf_detector_create(channel->termination->audio_stream, channel->pool);
}
static apt_bool_t demo_recog_stream_write(mpf_audio_stream_t *stream, const
mpf_frame_t *frame)
{
mpf_dtmf_detector_get_frame(dtmf, frame);
char c = mpf_dtmf_detector_digit_get(dtmf);
if (c) printf("Got digit: %c\n", c);
}
As I said, in-band DTMF works fine, but with out-of-band DTMF there are following
problems:
- On generator side:
- It is not possible to set RTP marker bit.
- RTP timestamp should not grow for single event.
- How to get telephone-event's ptime?
- mpf_audio_stream_t::rx_event_descriptor should be populated to get
telephone-event's frequency.
- On detector side:
- Receive the events at all.
- Get RTP timestamp for the event.
As I said, this is just a draft of the concept, so please read the sources in
attachment and leave comments. In the ZIP archive attached, there is also a pcap with
out-of-band packtes with the problems described above.
Cheers
- Vali
Reported by tomas.valenta@speechtech.cz
on 2009-10-25 23:06:34
Hi Vali,
I should confirm the draft looks VERY promising. The implementation and outlined
problems (missing functionality) give more confidence over the concepts, because we
seem to be on the same track.
- On generator side:
- It is not possible to set RTP marker bit.
I'll add marker flag to the mpf_frame_t. That marker should allow to pass along
significant events and these events shouldn't be limited only with DTMFs.
For instance, START_OF_NAMED_EVENT, END_OF_UTTERANCE ...
- RTP timestamp should not grow for single event.
It's clear.
- How to get telephone-event's ptime?
By default, ptime of rx_descriptor should be used. Probably it'd be reasonable to
make ptime (updates for events) configurable. I think, users may want/need to not
only configure period of updates, but also to be able to enable/disable the updates
at all. Should be considered for future ...
- mpf_audio_stream_t::rx_event_descriptor should be populated to get
telephone-event's frequency.
I'll look into this later.
- On detector side:
- Receive the events at all.
OK
- Get RTP timestamp for the event.
Not only timestamp, but marker would be required
See long-lasting events
http://tools.ietf.org/html/rfc4733#section-2.5.1.3
Anyway, we can support it later, just info to consider, but both timestamp and marker
should be passed along
So you've given it a perfect start and as I promised, I'll try to go along and
implement outlined above functionality. Most probably I'll dedicate a couple of days
to this issue during the next week. So please wait a bit.
Thanks!
Reported by achaloyan
on 2009-10-26 16:14:09
- On generator side:
- It is not possible to set RTP marker bit.
I'll add marker flag to the mpf_frame_t. That marker should allow to pass along
significant events and these events shouldn't be limited only with DTMFs.
For instance, START_OF_NAMED_EVENT, END_OF_UTTERANCE ...
Now I do not fully understand. I have not heard of e.g. END_OF_UTTERANCE
telephone-event. You plan those events to be MPF internal events indicating/changing
named-event's state? If not, I should not call the classes dtmf_generator and
dtmf_detector. Well, there should be another layer actually, because almost only
DTMFs can be sent in-band as well.
- RTP timestamp should not grow for single event.
It's clear.
- How to get telephone-event's ptime?
By default, ptime of rx_descriptor should be used. Probably it'd be reasonable to
make ptime (updates for events) configurable. I think, users may want/need to not
only configure period of updates, but also to be able to enable/disable the updates
at all. Should be considered for future ...
Clear. But honestly I cannot recall any non-DTMF (maybe MF) events used in IVR apps.
Packet with event should be used every ptime. How can I get rx_descriptor's ptime?
- mpf_audio_stream_t::rx_event_descriptor should be populated to get
telephone-event's frequency.
I'll look into this later.
- On detector side:
- Receive the events at all.
OK
- Get RTP timestamp for the event.
Not only timestamp, but marker would be required
See long-lasting events
http://tools.ietf.org/html/rfc4733#section-2.5.1.3
Anyway, we can support it later, just info to consider, but both timestamp and marker
should be passed along
Clear, I do not plan to care about marker in detector, because the marked packet may
get lost. But anyway I agree it is a must.
Thank you for ideas and support.
- Vali
Reported by tomas.valenta@speechtech.cz
on 2009-10-26 22:01:24
Below is clarification on marker, more to come later.
As you know, marker bit in RTP header is used not only in case of named events.
Marker also indicates start of a new talkspurt. Typically RTP stream consists of
voice activity and inactivity segments. Currently I implicitly detect these segments
in rtp_transmit. However, I think it makes sense to add generic marker in mpf_frame,
which will allow user apps to explicitly indicate start-of-talkspurt (or
start-of-utterance, start-of-event) and the same way end-of-talkspurt (or
end-of-utterance, end-of-event), when possible. There can be more flags/markers in
the future. If marker isn't specified everything should work as is. So this marker
should serve well for named events, and can be used for other purposes as well.
On the receiver side you definitely shouldn't rely on marker, because it indeed may
be lost. However, presence of that marker may help a lot in analyzes. Actually,
receiver part isn't as trivial as it may seen, but everything seems to be achievable.
Reported by achaloyan
on 2009-10-27 10:53:11
See http://groups.google.com/group/unimrcp/browse_thread/thread/8e773f260c56c34e
Reported by achaloyan
on 2009-11-09 19:27:08
Fixed
Reported by achaloyan
on 2010-04-27 12:01:38
Verified
Originally reported on Google Code with ID 31
Reported by
chaitanya.chokkareddy
on 2009-06-09 12:46:33