GNUAspell / aspell

http://aspell.net
GNU Lesser General Public License v2.1

Words in dictionary aren't recognized #476

Open aspell-helper opened 14 years ago

aspell-helper commented 14 years ago

Kevin Scannell cos\@sf created a bug report on 2010-07-26 16:02:39 UTC (Orig. from https://sourceforge.net/p/aspell/bugs/243)

Using Aspell 0.60.6 on Ubuntu, and a fresh install of the Irish dictionary aspell5-ga-4.4-0.tar.bz2.

Immediately after "sudo make install" of the dictionary, I run this, expecting no output:

$ /usr/bin/word-list-compress d < ga.cwl | iconv -f iso-8859-1 -t utf8 | aspell --lang=ga list
d'orgán
m'orgán
n-arm
t-arm

These four aren't recognized though. The other 326042 are ok!

aspell-helper commented 13 years ago

Kevin Atkinson kevina\@sf updated the issue on 2011-06-27 23:38:59 UTC

aspell-helper commented 13 years ago

Kevin Atkinson kevina\@sf updated the issue on 2011-07-03 22:03:54 UTC

aspell-helper commented 13 years ago

Kevin Atkinson kevina\@sf commented on 2011-07-03 22:03:54 UTC

So the problem is that d'orgán and Dorgan are similar in that they both have the same "clean" value of "dorgan", but their soundslike values differ: "T*R*K*" and "T*R*K*N" respectively. That violates some of the assumptions I made. Not sure how I am going to fix this.
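To make the collision concrete, here is a minimal sketch in Python. The `clean` function below is a hypothetical stand-in (lowercase, accents and punctuation dropped) and is not Aspell's actual implementation; the soundslike strings are copied from this thread rather than computed.

```python
import unicodedata


def clean(word: str) -> str:
    """Hypothetical 'clean' form: lowercase, accents removed, letters only.

    This is an assumption for illustration, not Aspell's real algorithm.
    """
    # NFD splits "á" into "a" plus a combining accent; isalpha() then drops
    # the accent as well as apostrophes and hyphens.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch.lower() for ch in decomposed if ch.isalpha())


# Soundslike values as reported in this thread (not computed here).
reported_soundslike = {"d'orgán": "T*R*K*", "Dorgan": "T*R*K*N"}

# Group the reported soundslike values by the clean form of each word.
groups = {}
for word, soundslike in reported_soundslike.items():
    groups.setdefault(clean(word), set()).add(soundslike)

for key, values in groups.items():
    print(key, sorted(values))
```

Both words land under the same clean key while carrying two different soundslike values, which is the situation described above.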

aspell-helper commented 13 years ago

Kevin Atkinson kevina\@sf commented on 2011-07-04 01:06:51 UTC

And fixing this will almost certainly require breaking the dictionary format, further complicating things.

aspell-helper commented 13 years ago

Kevin Scannell cos\@sf commented on 2011-07-18 15:27:21 UTC

Ok, maybe we're homing in on the problem. Both of those words *should* have a soundslike of "T*R*K*N". But I can't find a problem in the gaeilge_phonet.dat file.

As a simpler example, consider "organ". It should have a soundslike of *R*K*N, but it comes out as *R*K*.

The rule that's causing the trouble appears to be:

R(BGM)- R*

I think this because, for example, the string "oragan" correctly gives *R*K*N.

Am I not allowed to use the - syntax together with characters in parens as above? That syntax seems to work correctly in other places.
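For reference, here is a toy sketch in Python of how I would expect a rule like this to behave. It assumes that the parenthesized letters are alternatives for one following character and that each trailing "-" means one matched character is checked but not consumed. The `parse_rule` and `apply_rule` helpers are hypothetical and handle only this simple rule shape; they are not the phonet code.

```python
import re


def parse_rule(pattern: str):
    """Split a rule such as 'R(BGM)-' into (literal, alternatives, dashes)."""
    m = re.fullmatch(r"([A-Z]+)(?:\(([A-Z]+)\))?(-*)", pattern)
    if m is None:
        raise ValueError(f"unsupported pattern: {pattern!r}")
    literal, alternatives, dashes = m.groups()
    return literal, alternatives or "", len(dashes)


def apply_rule(pattern: str, replacement: str, text: str, pos: int):
    """Try the rule at text[pos:].

    Returns (replacement, chars_consumed) on a match, otherwise None.
    Assumption: each trailing '-' leaves one matched character unconsumed.
    """
    literal, alternatives, dashes = parse_rule(pattern)
    if not text.startswith(literal, pos):
        return None
    matched = len(literal)
    if alternatives:
        following = text[pos + matched : pos + matched + 1]
        if following == "" or following not in alternatives:
            return None
        matched += 1
    return replacement, matched - dashes


# "ORGAN": the R is followed by G, so the rule fires but consumes only the R.
print(apply_rule("R(BGM)-", "R*", "ORGAN", 1))
# "ORAGAN": the R is followed by A, so the rule does not apply here.
print(apply_rule("R(BGM)-", "R*", "ORAGAN", 1))
```

Under this reading the G in "organ" is left for the later rules, so the K and the final N should still be produced, which matches what I expected to see.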

aspell-helper commented 13 years ago

Kevin Atkinson kevina\@sf updated the issue on 2011-07-19 18:29:08 UTC

aspell-helper commented 13 years ago

Kevin Atkinson kevina\@sf commented on 2011-07-19 18:29:08 UTC

There could also be a bug in the phonet code. I did not write the original code, and it has been a while since I looked at it. I will try to have a look sometime soon.

If you feel so inclined, you are welcome to look for yourself.