ostafen / smart

String Matching Algorithms Research Tool
https://smart-tool.github.io/smart/
GNU General Public License v3.0
4 stars 2 forks source link

Segfault with run #59

Closed nishihatapalmer closed 1 year ago

nishihatapalmer commented 1 year ago

If you run smart run bf -text genome -plen 1 2 then there is a seg fault.

The seg fault is primarily caused because the software tries to reference a results pointer which is NULL.

But it is not NULL in the calling method. So looks like some corruption happening somewhere, but only with these short pattern lengths so far.

I chose the bf algorithm to replicate with as it has no memory allocation of its own and is extremely simple. Error must be in smart somewhere.

nishihatapalmer commented 1 year ago

In fact, you don't have to specify plen at all. Just run smart run bf -text genome and see it. It always seems to happen after the first benchmark has completed sucessfully, then seg-faults on the second pattern length it attempts.

nishihatapalmer commented 1 year ago

This is a regression - run worked fine before.

nishihatapalmer commented 1 year ago

It isn't the -text parameter. You can also use rand 5 or something and get the same seg fault.

nishihatapalmer commented 1 year ago

I identified the commit 00f906ce9a714cae68d6e0047152ccfc8a4a171d which was the recent PR for pattern length control where the problem first appears. However, nothing in the difference between that code and the previous commit stands out as causing a problem.

It is the algo_results_t *results which becomes NULL between method calls of benchmark_algos_with_text calling to benchmark_algos_with_patterns. Access to the NULL results pointer then causes a seg fault. This is some kind of memory corruption going on somewhere.

If you run the code without using the results struct at all, by commenting out a few bits of code and benchmarking without storing to the results, then no seg fault occurs and the program runs to completion. However, this is only avoiding the issue.

It is possible that the issue originated further back but only becomes manifest in that particular commit due to code organization changes. Different memory gets overwritten, but doesn't cause a problem, until it does.

nishihatapalmer commented 1 year ago

It could still be the result of how the results memory is being allocated somehow. We only dynamically allocate memory for results and patterns, and we aren't writing to memory, except to the results struct after everything is initialized.

Commenting out the code that copies memory to the patterns does not stop the seg fault. So it's not writing to pattern memory that is corrupting the pointer.

nishihatapalmer commented 1 year ago

Fixed by PR #61

nishihatapalmer commented 1 year ago

The issue was not memory corruption; it was returning an incorrect number of patterns (accidentally left the same returns when refactoring the logic). So it was just passing a null pointer in. Easy fix to return correct number of patterns.