szechyjs / dsd

Digital Speech Decoder
Other
701 stars 271 forks source link

Heuristics Segmentation Fault #33

Closed robotastic closed 10 years ago

robotastic commented 10 years ago

Hi - I am getting a segmentation fault from the new heuristics error correction. I have incorporated DSD into a gnuradio block and I am running several instance in parallel, so it could easily be that. However, I think there might be an error in the DSD code.

In p25p1_tdulc.c, on line 291 read_zeros gets called with new sequence set to zero.

As a result, sequence_broken for the first element of analog_signal_array is 0 on line 194.

Next contribute_heuristics is called. That brings us to line 186 of p25p1_heuristics.c.

prev_dibit = analog_signal_array[i-1].corrected_dibit;

This is in a for loop that sets i from 0 to 10. I am getting the segmentation fault when i=0 because a crazy value is pull from analog_signal_array is pulling a crazy value out and setting prev_dbit to it. This then gets passed to update_p25_heuristics and prev_dbit is used to pull the symbol heuristics:

sh = &(heuristics->symbols[previous_dibit][dibit]);

In short, it seems like in p25p1_tdulc.c, on line 291 new sequence should be set to 1. I made the switch and it seems to prevent the seg fault.

Also in p25p1_tdu.c on line 21, read_zeros is called with new sequence set to 1.

robotastic commented 10 years ago

Here is the core dump I get:

#0  0x00007fe2349c7bab in update_p25_heuristics (heuristics=0x101cc9c, previous_dibit=857912528, original_dibit=0, dibit=0, analog_value=13628)
    at /home/luke/gnuradio3.6/gr-dsd/dsd/p25p1_heuristics.c:126
#1  0x00007fe2349c7ea4 in contribute_to_heuristics (rf_mod=0, heuristics=0x101cc9c, analog_signal_array=0x7fe228c50fc0, count=10)
    at /home/luke/gnuradio3.6/gr-dsd/dsd/p25p1_heuristics.c:196
#2  0x00007fe2349d4c36 in read_zeros (opts=0x10184f0, state=0x101a1d8, analog_signal_array=0x7fe228c50fc0, length=20, status_count=0x7fe228c5060c, new_sequence=0)
    at /home/luke/gnuradio3.6/gr-dsd/dsd/p25p1_tdulc.c:210
#3  0x00007fe2349d4ee7 in processTDULC (opts=0x10184f0, state=0x101a1d8) at /home/luke/gnuradio3.6/gr-dsd/dsd/p25p1_tdulc.c:291
#4  0x00007fe2349cb111 in processFrame (opts=0x10184f0, state=0x101a1d8) at /home/luke/gnuradio3.6/gr-dsd/dsd/dsd_frame.c:428
#5  0x00007fe2349c5e30 in liveScanner (opts=0x10184f0, state=0x101a1d8) at /home/luke/gnuradio3.6/gr-dsd/dsd/dsd_main.c:376
#6  0x00007fe2349c2d19 in run_dsd(void*) () from /usr/local/lib/libgr-dsd.so
#7  0x00007fe234c15f6e in start_thread (arg=0x7fe228c52700) at pthread_create.c:311
#8  0x00007fe2332a29cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
EdFuentetaja commented 10 years ago

Hi Robotastic. Thank you for pointing this out. I believe you are right there is a bug in the code and you are very close in the diagnostics. At first look the call to read_zeros with new_sequence set to 1 at p25_p1_tdul.c line 291 seems correct. Mind that we are passing the address of analog_signalarray but with an offset of 6(6+6)+6_(6+6) elements (element = dibit), so at contribute_heuristics access to element [i-1] with i=0 is in fact the element analog_signalarray[6(6+6)+6_(6+6) - 1] which should have been initialized... The problem is that there is an execution path that doesn't initialize it (specifically the corrected_dibit field). This is initialized by correct_golay_dibits_12 and it looks like if irrecoverable_errors is set to 1 it never gets called. I think that is the problem you are experiencing.

So if irrecoverableerrors is set to 1, then we don't know the actual value of element [6(6+6)+6_(6+6) - 1] and therefore we should consider the sequence has broken. With irrecoverable_errors != 1 is correct to call with new_sequence set to 1. At p25p1_tdu is correct to make the call with new_sequence set to 1 because there we don't know the value of the previous element.

Since it seems you are able to reproduce this issue, can you confirm that in fact you are getting this segmentation fault with irrecoverable_errors = 1 on line 254?

Also, since you are calling with several instances in parallel, be sure you are using a different instance of dsd_state on each.

Thanks a lot for your analysis. I apologize for the code being so confusing.

robotastic commented 10 years ago

Thanks so much for writing back so quick. I will test this out tonight. I did see that the pointer for analog_signal_array was moved up a bit, but I thought that having a negative index for an array might have it do something weird, like wrap around. Anyhow, I will go check it out.

I do have a separate state for each instance. Also in dsd_filters.c I moved:

static float xv[NZEROS+1];

and

static float nxv[NXZEROS+1];

into state, because it looks like the static variable would stay the same between class instance and could interfer with each. Is that right? Do I have to worry about any of the static function in p25p1_heuristics.c? It looks like all of the state is saved into the state variables.

robotastic commented 10 years ago

I just checked and it does look like irrecoverable_errors is set to 1. Here is the dump from GDB:

#0  0x00007f08e50085db in update_p25_heuristics (heuristics=0x1daf62c, previous_dibit=-477707056, original_dibit=0, dibit=0, analog_value=13517)
    at /home/luke/gnuradio3.6/gr-dsd/dsd/p25p1_heuristics.c:125
#1  0x00007f08e50088d4 in contribute_to_heuristics (rf_mod=0, heuristics=0x1daf62c, analog_signal_array=0x7f08d2e71fc0, count=10)
    at /home/luke/gnuradio3.6/gr-dsd/dsd/p25p1_heuristics.c:195
#2  0x00007f08e5015666 in read_zeros (opts=0x1daae80, state=0x1dacb68, analog_signal_array=0x7f08d2e71fc0, length=20, status_count=0x7f08d2e7160c, new_sequence=0)
    at /home/luke/gnuradio3.6/gr-dsd/dsd/p25p1_tdulc.c:210
#3  0x00007f08e5015917 in processTDULC (opts=0x1daae80, state=0x1dacb68) at /home/luke/gnuradio3.6/gr-dsd/dsd/p25p1_tdulc.c:291
#4  0x00007f08e500bb41 in processFrame (opts=0x1daae80, state=0x1dacb68) at /home/luke/gnuradio3.6/gr-dsd/dsd/dsd_frame.c:428
#5  0x00007f08e5006860 in liveScanner (opts=0x1daae80, state=0x1dacb68) at /home/luke/gnuradio3.6/gr-dsd/dsd/dsd_main.c:376
#6  0x00007f08e5003749 in run_dsd(void*) () from /usr/local/lib/libgr-dsd.so
#7  0x00007f08e5256f6e in start_thread (arg=0x7f08d2e73700) at pthread_create.c:311
#8  0x00007f08e38e39cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) frame 3
#3  0x00007f08e5015917 in processTDULC (opts=0x1daae80, state=0x1dacb68) at /home/luke/gnuradio3.6/gr-dsd/dsd/p25p1_tdulc.c:291
291   read_zeros(opts, state, analog_signal_array + 6*(6+6)+6*(6+6), 20, &status_count, 0);
(gdb) print irrecoverable_errors
$1 = 1

I will poke around more...

robotastic commented 10 years ago

Ed - Thanks again for all the great additions you have been adding to DSD over the past few months. The audio is sounding so much better. I ran it again and this time printed out irrecoverable error. Everything works fine until an error happens and then it seg faults. I will submit the change I made that seems to fix it.

irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 0 
irrecoverable_errors: 1 
./start_hack_rf_decode_856.sh: line 6:  8041 Segmentation fault      (core dumped)
EdFuentetaja commented 10 years ago

Yes, so I think we got it. The fix you submitted is good enough. Thanks!

It seems you got a recording that you are using for testing this issue. Maybe you can contribute it to the dsd-samples repository? I'll be useful to do regression testing of this issue.

On the xv and nxv static variables, yes, if you are using the code in multithread you need to move those to dsd_state and use a different instance as you are already doing. In general do so for any static or global variable that does not hold a constant value. Static methods got a different meaning in C. This is just to limit its visibility to the compilation unit where they are defined, there is no effect on multithread. Maybe you can also submit a patch with this improvement, even if the code is used in just a single thread this makes it more consistent and easier to maintain.

Kind regards,

Ed

szechyjs commented 10 years ago

Fixed by #36

Also, any sample files would be appreciated.

robotastic commented 10 years ago

Hi guys - I just created a project with lots of samples: https://github.com/robotastic/DSD-P25-Samples I can also miss tune it slightly more to get a worse decode. It is uploading right now, so it should be available in a little.

szechyjs commented 10 years ago

There is actually the official https://github.com/szechyjs/dsd-samples repo that is a submodule in dsd. It would be easier if you forked that and submitted a PR.

robotastic commented 10 years ago

My system automatically captures and records calls off a trunking system. I didn't look at how many recordings I had... turns out it is ~300 and it is about a gig. Instead of using github, I put them up on Dropbox. If there is a better way to share them, let me know. I could try adding them to dsd-samples, but I was afraid it would be way to big.

https://www.dropbox.com/sh/ktld5jihnm8976p/AAAZKREwvx2TUm10QFY2S70Ea

On Sun, Jun 1, 2014 at 3:55 PM, Jared Szechy notifications@github.com wrote:

There is actually the official https://github.com/szechyjs/dsd-samples repo that is a submodule in dsd. It would be easier if you forked that and submitted a PR.

— Reply to this email directly or view it on GitHub https://github.com/szechyjs/dsd/issues/33#issuecomment-44787374.

szechyjs commented 10 years ago

Yea that's definitely more than we need. If you want to pick out a variety of files from different scenarios and add them to the repo that would be pretty useful. I'm thinking of scenarios like, good signal quality, poor signal quality, long clip, short clip, etc.