girzel / ebdb

An EIEIO port of BBDB, Emacs' contact-management package

EBDB mangles a particular name #106

Closed wyleyr closed 1 year ago

wyleyr commented 1 year ago

I'm new to EBDB and I've encountered a weird bug when entering a contact: entering "Reddemann" as a contact's (last) name consistently causes EBDB to split it up and duplicate "Redde", producing "Redde Redde mann". This is a contact with a German address; I have the ebdb-i18n package loaded.

I've experimented a bit and see:

But:

So it seems like the strings "edde" and "ede" are the problem. Any idea what could be causing this?? I'm happy to help debug if you can tell me where to look.

girzel commented 1 year ago

Ugh, the name parsing code needs to be rewritten, I think with a proper parser instead of a pile of regular expressions. This has been on my list for a while now, but obviously we're hitting regressions so I will bump up the priority. Thanks for the report.

franburstall commented 1 year ago

More on this: ebdb-divide-name splits the surname on any match of ebdb-lastname-prefixes no matter where it occurs in the string. Thus

(ebdb-divide-name "George Holderforth")

yields

("rforth" ("George") nil "Holde")

One should probably test for prefixes only at the start of surnames and maybe separated by spaces from the rest?
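A minimal sketch of the suggestion above, assuming a prefix counts only when it sits at the very start of the surname and is followed by whitespace. This is not EBDB's actual code or the fix that was eventually shipped: the names my/lastname-prefixes and my/split-surname-prefix are hypothetical, and the prefix list is a made-up sample rather than the real value of ebdb-lastname-prefixes.

```elisp
;; Hypothetical illustration only; not EBDB's implementation.
(defvar my/lastname-prefixes '("von" "van" "de" "der" "la")
  "Assumed sample of surname prefixes, for illustration only.")

(defun my/split-surname-prefix (surname)
  "Split SURNAME into (PREFIX . REST) when it starts with a known prefix.
The prefix must be at the start of the string and be followed by
whitespace.  Return (nil . SURNAME) otherwise."
  (let ((case-fold-search t)
        ;; \\` anchors at the start of the string, so a prefix that
        ;; merely occurs inside the surname can never match.
        (re (concat "\\`"
                    (regexp-opt my/lastname-prefixes t)
                    "[ \t]+")))
    (if (string-match re surname)
        (cons (match-string 1 surname)
              (substring surname (match-end 0)))
      (cons nil surname))))

(my/split-surname-prefix "Holderforth") ; => (nil . "Holderforth")
(my/split-surname-prefix "Reddemann")   ; => (nil . "Reddemann")
(my/split-surname-prefix "de Gaulle")   ; => ("de" . "Gaulle")
```

Anchoring the match and requiring trailing whitespace means "de" inside "Holderforth" or "Reddemann" is left alone, while a genuine leading prefix is still separated from the rest of the surname.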

girzel commented 1 year ago

Sorry for the very long wait here! And thanks to @franburstall for identifying the problem. I've been trying out a fancier parsing library to solve this and other issues, in order to avoid the regexp hairball issue. So far I haven't been able to make that work, though, and it's fairly clear how to fix this problem, so I'm just adding to the hairball for now. This fix will be out with a new point release in the next day or so. Thanks again.

wyleyr commented 1 year ago

This is great, thanks so much to you both for your efforts!