barzerman / barzer

barzer engine code
MIT License

beni bug #570

Closed nchepanov closed 11 years ago

nchepanov commented 11 years ago

On this query, beni performs worse than Sphinx search: Варочная поверхность Nardi LG 430 AV N ("Варочная поверхность" is Russian for "cooktop").

BENI RESULT:

name: "Поверхность NARDI LG 430 AV DB", cover: 0.705882,
name: "Поверхность NARDI LG 430 AV A", cover: 0.705882,
name: "Поверхность NARDI LG430AVN", cover: 0.676471,
name: "Поверхность NARDI LG430AVBP", cover: 0.676471,
name: "Поверхность NARDI LC 430 AV N", cover: 0.647059,
name: "Поверхность NARDI THS 30 AV N", cover: 0.588235,
name: "Поверхность NARDI TH 30 AV N", cover: 0.588235,
name: "Поверхность NARDI LC 640 AV N", cover: 0.588235,

Why does the only correct answer, Поверхность NARDI LG430AVN, have a lower cover than Поверхность NARDI LG 430 AV A or Поверхность NARDI LG 430 AV DB?

barzerman commented 11 years ago

@0xd34df00d please take a look!

0xd34df00d commented 11 years ago

Already on it.

0xd34df00d commented 11 years ago

BTW, this required me to adjust the MAX_BENI_LENGTH constant at barzer_barz.cpp:572. Is the version on the server the same as master?

(And why the hell is the local const declared through an enum?)
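
For context on the enum question: declaring an integral constant through an unnamed enum (the "enum hack") is an older C++ idiom for getting a compile-time constant that has no storage. A minimal sketch of that form next to the `constexpr` form available since C++11. The names and the value 64 are made up for illustration and are not the real MAX_BENI_LENGTH.

```cpp
#include <cstddef>

// Older idiom: an unnamed enum yields a compile-time constant
// with no storage and no address.
std::size_t bufferSizeOldStyle()
{
    enum { MAX_LEN_OLD = 64 };               // hypothetical value, illustration only
    char buf[MAX_LEN_OLD];
    return sizeof(buf);
}

// C++11 and later: constexpr gives a compile-time constant with a real type.
std::size_t bufferSizeNewStyle()
{
    constexpr std::size_t MAX_LEN_NEW = 64;  // hypothetical value, illustration only
    char buf[MAX_LEN_NEW];
    return sizeof(buf);
}
```

Both functions use the constant as an array bound, which is the typical reason for wanting it at compile time.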

barzerman commented 11 years ago

The server version is master as of last Saturday.

0xd34df00d commented 11 years ago

I'm not sure this can be fixed at all: the source dataset is unnormalized.
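
To illustrate what "unnormalized" means here: the result list contains the same kind of model code written both as `LG 430 AV N` and as `LG430AVN`, so a string comparison sees them as different. A minimal sketch, assuming a hypothetical `normalizeModel` helper (not part of barzer) that keeps ASCII letters and digits, lowercases them, and drops everything else. Under the default C locale this also drops the bytes of the Cyrillic part of a name, so it only applies to the model code.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: keep ASCII letters and digits, lowercase them,
// and drop everything else (spaces, dashes, dots, non-ASCII bytes).
std::string normalizeModel(const std::string& s)
{
    std::string out;
    for (unsigned char c : s)
        if (std::isalnum(c))
            out += static_cast<char>(std::tolower(c));
    return out;
}
```

With this, `normalizeModel("LG 430 AV N")` and `normalizeModel("LG430AVN")` produce the same string, while `LC 430 AV N` still differs.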

barzerman commented 11 years ago

please elaborate

barzerman commented 11 years ago

I think we need to add a boost for contiguous matches. Let's discuss this ASAP.

0xd34df00d commented 11 years ago

Implementing a substring matcher sucks: we obviously want fuzzy matches, and fuzzy-matching one substring inside another is quite a PITA.

I suggest moving entirely to the benisland stuff for this: we get the islands, their sizes, and their positions automatically and can boost our stuff accordingly. Efficient fuzzy matching is typically implemented via n-grams, so it would lead to reinventing benisland.
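
A rough sketch of the n-gram idea, under the assumption that an "island" is a maximal run of consecutive query n-grams that also occur in the candidate. This is not the benisland implementation, whose definition may differ. `Island` and `findIslands` are hypothetical names, and the code works on bytes, so multi-byte UTF-8 characters would need decoding first.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Start position and length of an island, both counted in query n-grams.
struct Island { std::size_t pos; std::size_t len; };

// Hypothetical sketch: group consecutive matching query n-grams into islands.
std::vector<Island> findIslands(const std::string& query,
                                const std::string& candidate,
                                std::size_t n = 3)
{
    std::vector<Island> islands;
    if (n == 0 || query.size() < n || candidate.size() < n)
        return islands;

    // Index every n-gram of the candidate.
    std::unordered_set<std::string> grams;
    for (std::size_t i = 0; i + n <= candidate.size(); ++i)
        grams.insert(candidate.substr(i, n));

    // Walk the query n-grams and merge consecutive hits into one island.
    std::size_t runStart = 0, runLen = 0;
    for (std::size_t i = 0; i + n <= query.size(); ++i) {
        if (grams.count(query.substr(i, n))) {
            if (runLen == 0) runStart = i;
            ++runLen;
        } else if (runLen > 0) {
            islands.push_back({runStart, runLen});
            runLen = 0;
        }
    }
    if (runLen > 0)
        islands.push_back({runStart, runLen});
    return islands;
}
```

A scorer could then weight candidates by the length of their longest island, which rewards contiguous matches without requiring an exact substring.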

barzerman commented 11 years ago

Let's start with strstr.
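
A minimal sketch of what a strstr-based boost could look like, assuming the candidate name and the query term have already been normalized the same way. `boostIfContiguous` and its default factor are hypothetical and not taken from the barzer code.

```cpp
#include <cstring>
#include <string>

// Hypothetical sketch: raise a candidate's cover when the query term occurs
// in the candidate as one exact contiguous substring. The factor is arbitrary.
double boostIfContiguous(double cover,
                         const std::string& candidate,
                         const std::string& term,
                         double factor = 1.5)
{
    if (term.empty())
        return cover;  // strstr matches an empty needle everywhere; skip the boost
    const bool found = std::strstr(candidate.c_str(), term.c_str()) != nullptr;
    return found ? cover * factor : cover;
}
```

This only rewards exact substrings, so a candidate spelled `LG 430 AV N` gets no boost for the term `LG430AVN` unless both sides are normalized first.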

0xd34df00d commented 11 years ago

Is this still an issue? Please triage.

barzerman commented 11 years ago

spring cleaning