blagae / whitakers_words

Beatus generates IndexError: list index out of range #5

Closed rinkla3024 closed 10 months ago

rinkla3024 commented 2 years ago

Hi,

Running whitaker from the command line after install from a git clone.

regemque works fine:

whitaker words regemque
que                  TACKON
-que = and (enclitic, translated before attached word); completes plerus/uter;
reg.em               N      3 1 ACC S M
rex, regis  N (3rd) M   [XLXAX]
king

beatus fails:


whitaker words beatus
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/bin/whitaker", line 33, in <module>
    sys.exit(load_entry_point('whitakers-words', 'console_scripts', 'whitaker')())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/alt/whitakers-words/whitakers_words/whitakers_words/cli.py", line 46, in words
    fmt(result, WordsFormatter())
  File "/alt/whitakers-words/whitakers_words/whitakers_words/cli.py", line 50, in fmt
    click.echo(formatter.format_result(word))
  File "/alt/whitakers-words/whitakers_words/whitakers_words/formatter.py", line 76, in format_result
    result += f"{self.format_parts(analysis)}   [{props}]\n"
  File "/alt/whitakers-words/whitakers_words/whitakers_words/formatter.py", line 88, in format_parts
    return format_adj(analysis)
  File "/alt/whitakers-words/whitakers_words/whitakers_words/formatter.py", line 126, in format_adj
    comp_str = f"{root[2]}{comp[0]} -{comp[1]} -{comp[2]}"
IndexError: list index out of range

blagae commented 2 years ago

Hi @avestella

Thanks for reporting this issue. Since it gets to the formatting logic, I gather that there's no fundamental error in the parsing (which is my main concern for my own use cases). The command line tool is still very, very rudimentary so this kind of naive bug is honestly no big surprise to me. I expect that this will be a trivial fix.

I'll try to look into this over the weekend.

rinkla3024 commented 2 years ago

Thank you for taking a look at it. There are a few more words that throw errors while you're working on it. Try:

non, impiorum, die, est, secus, omnia, a, inania, principes, unum, jugum

Those should provide some good test cases.

blagae commented 2 years ago

Hi @avestella,

I just pushed a fix for the adjectives; it turns out to be a simple problem where I assumed without testing that there would always be comparative and superlative forms.
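To illustrate the kind of guard this implies, here is a minimal sketch modelled on the failing line from the traceback (`comp_str = f"{root[2]}{comp[0]} -{comp[1]} -{comp[2]}"`). The function name `format_comparative` and the list shapes are assumptions made for illustration; this is not the code from the actual commit, and the sample stems and endings are invented.

```python
def format_comparative(root, comp):
    """Build the comparative string only when the data for it exists.

    `root` and `comp` are assumed to be plain lists of stems and endings,
    mirroring the names in the traceback. The real structures used by
    whitakers_words may differ.
    """
    # An adjective without comparison forms may have fewer than three
    # stems or an empty list of endings, so check before indexing.
    if len(root) < 3 or len(comp) < 3:
        return ""
    return f"{root[2]}{comp[0]} -{comp[1]} -{comp[2]}"


# Invented sample data: one entry with comparison forms, one without.
print(format_comparative(["alt", "alt", "alt"], ["ior", "ior", "ius"]))
print(repr(format_comparative(["beat", "beat"], [])))
```

Returning an empty string keeps the caller simple: it can append the result unconditionally instead of branching on the adjective type.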

There are of course other issues, thanks for reporting some examples! I'll look into them for a bit, but I'm considering taking a peep at the original Whitaker's Words to port the entire formatting logic. As I said, I don't use the words option intensively, so I'm unlikely to find all edge cases myself. That means that people would have to report issues until I've fixed all corner cases reactively, which is obviously not a good user experience.

rinkla3024 commented 2 years ago

Thank you. The new commit fixed several of the errors. There are still a few errors on:

non, est, secus, a

I am pushing a lot of words through it via command line and capturing the output for generating a large body of flashcards which will get loaded into an Anki deck, so I can provide you with lots of edge test cases. If you like I can post them here or somewhere else if you would find it helpful.

blagae commented 2 years ago

Hi, you're welcome to post whatever info on failing words you have. Maybe I can put the words into a very basic regression test that only checks whether exceptions are thrown (and which ones and where)
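A rough sketch of what such an exception-only regression check could look like is below. It assumes nothing about the whitakers_words API: `collect_failures` is a hypothetical helper, and `fake_analyse` is a stand-in for whatever parse-and-format call the real test would make.

```python
import traceback


def collect_failures(words, analyse):
    """Run `analyse` on every word and record which ones raise.

    Returns a dict mapping each failing word to a tuple of
    (exception type name, name of the function where it was raised).
    """
    failures = {}
    for word in words:
        try:
            analyse(word)
        except Exception as exc:
            # The last traceback entry is the innermost frame,
            # i.e. where the exception was actually raised.
            frame = traceback.extract_tb(exc.__traceback__)[-1]
            failures[word] = (type(exc).__name__, frame.name)
    return failures


def fake_analyse(word):
    # Stand-in for the real parse + format step; fails on two sample words.
    if word in {"non", "est"}:
        raise IndexError("list index out of range")
    return f"ok: {word}"


failures = collect_failures(["regemque", "non", "est"], fake_analyse)
for word, (exc_name, where) in failures.items():
    print(f"{word}: {exc_name} in {where}")
```

In a real test suite the same loop could feed the reported word list through the actual formatter and assert that the resulting dict is empty.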

rinkla3024 commented 2 years ago

Here are a few more test cases:

{
    "non": 1,
    "erit": 1,
    "est": 1,
    "secus": 1,
    "Non": 1,
    "a": 1,
    "sunt": 1,
    "sum": 1,
    "es": 1,
    "Deus": 1,
    "A": 1,
    "mane": 1,
    "Mane": 1,
    "sit": 1,
    "fueritis": 1,
    "Ecce": 1
}

blagae commented 10 months ago

Hi @rinkla3024 ,

Because of another issue raised for this repo, I finally got back to this one. All examples that you raised have a fix now, in the sense that there are still some incomplete responses from the Words tool, but at least they don't crash the formatter anymore.

Thanks for your feedback, and happy Wordsing.