GZHoffie / approximate-string-matching

A repository of experimental code and benchmark for greedy/dynamic programming approximate string matching algorithms.
MIT License
2 stars 0 forks source link

Improving GASMA #3

Closed GZHoffie closed 3 years ago

GZHoffie commented 3 years ago

Way of speeding up finding highways:

GZHoffie commented 3 years ago

Idea of shortsighted-GASMA:

  1. Find the hurdle matrix,
  2. Flip all single zeros to ones, flip all single ones to zeros (need benchmark result of effect)
  3. Delete the columns at which all rows is a 1, and +1 to hurdleCost, "The hurdles that you must cross"
  4. Starting from the starting position, find the highways that start within 3 columns, and among those highways choose the longest one
  5. go to the end of the highway and repeat step 4 until no more highway is left.
GZHoffie commented 3 years ago

Benchmark results with simulated data (1000 * L = 100, e = 5 %)

Algorithms Time MAE Correct Rate
Needleman-Wunsch (without backtrack) 36.87 0 1
GASMA 1.935 0.172 0.91
GASMA-Shortsighted 0.815 0.247 0.869

There may be some errors in my implementation, so it needed to be checked again.

TODO:

GZHoffie commented 3 years ago

Problems with the current implementation: