oracc / editing

Forum for discussing editing issues in Oracc

incorrect senses on web interface #5

Open JacobLauinger opened 6 years ago

JacobLauinger commented 6 years ago

I have been having an ongoing issue with the website attributing extended senses to words lemmed with just the GW. This issue occurs when the word is written the same way as an attestation that does have the extended sense.

For instance, our glossary entry for šâlu, "ask," shows 6 attestations of the word with the extended sense "show concern": EA 96, 97, 148, 151, 155, 287. However, the word is lemmed with that sense only in EA 96 and 97. In EA 148, 151, 155, and 287, the attestations of šâlu are lemmed only with the GW "ask." (The attestation in EA 151 is lemmed as a long form; the other three are short forms.)
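For anyone who wants to cross-check the glossary counts against the ATF sources, here is a minimal sketch in Python. It assumes that lemmata on `#lem:` lines have the shape `CF[GW]POS` or `CF[GW//SENSE]POS` and are separated by semicolons; the function name `count_senses` and the sample text are hypothetical and not part of any Oracc tool. It does not handle ambiguous lemmata or other special notations.

```python
import re
from collections import Counter

# Assumed lemma shape: CF[GW]POS or CF[GW//SENSE]POS, optionally prefixed with "+".
LEMMA_RE = re.compile(
    r"\+?(?P<cf>[^\[\];|]+)\[(?P<gw>[^\]/]+)(?://(?P<sense>[^\]]+))?\]"
)

def count_senses(atf_text):
    """Count (CF, GW, sense) triples found on #lem: lines.

    When a lemma carries no explicit sense, the GW is used as its sense.
    """
    counts = Counter()
    for line in atf_text.splitlines():
        if not line.startswith("#lem:"):
            continue
        for token in line[len("#lem:"):].split(";"):
            match = LEMMA_RE.search(token)
            if match is None:
                continue  # skip tokens that do not look like CF[GW]POS
            cf = match.group("cf").strip()
            gw = match.group("gw").strip()
            sense = (match.group("sense") or gw).strip()
            counts[(cf, gw, sense)] += 1
    return counts

# Hypothetical sample, not taken from any real EA text.
sample = """1. (text line)
#lem: šâlu[ask//show concern]V; awīlu[man]N
2. (text line)
#lem: šâlu[ask]V
"""

counts = count_senses(sample)
for key, n in sorted(counts.items()):
    print(key, n)
```

Running this over a project's ATF files and comparing the totals per sense with the numbers on the glossary page would show whether the extra attestations come from the sources or from the glossary build.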

stinney commented 6 years ago

This looks like a lemmatizer bug to me, so I'm assigning it to myself.

JacobLauinger commented 6 years ago

A quick note to let you all know that this issue continues on my end, in case you implemented a bug fix already. For instance, the glossary entry for awīlu shows 23 attestations of the sense "representation of a man," but only one text (EA 14 iii 60) is actually lemmatized with that sense; the other 22 are just "man."

JacobLauinger commented 6 years ago

Hi Steve,

Per my message on the thread, I just wanted to raise this issue again to see if there has been any development. It is still a big problem for the /amarna subproject. For instance, the glossary entry for awīlu is now showing 27 (up from 23) attestations of the sense "representation of a man," though only the one text is lemmatized with that sense. Below are screen shots showing an example from EA 71 on the website as well as the actual atf file.

Any help on this would be greatly appreciated, especially as I am moving from editing nouns, verbs, etc. into the more fiddly words like prepositions and subordinating conjunctions: I find myself spending a lot of time opening files to correct what seem to be errors on the website, only to find that the word is actually lemmatized correctly in the atf file.

Thanks so much, and, of course, please let me know if there's anything I can help do on my end.

screen shot 2018-05-05 at 5 38 55 pm

screen shot 2018-05-05 at 5 40 10 pm

JacobLauinger commented 6 years ago

Bumping this issue to see if there is any update. I am still dealing with this issue all over my Amarna project but have had no contact since Steve's Jan 10 response above.

Some further supporting documentation: EA 29: 59 is the only attestation of itti with the sense "like," but screenshot 1 shows the glossary giving 60 attestations. Screenshot 2 shows the list of attestations to which the website attributes the sense "like." Screenshot 3 shows one of those attestations, EA 35: 8, with the mouse cursor hovering over it to show the additional sense. Screenshot 4 shows the atf file for EA 35, in which that attestation of itti is lemmed only as "with." (Fwiw, there are also only about five attestations of the sense "to," despite the 57 that the glossary is showing.)

Thanks for any help resolving this frustrating ongoing issue, and of course, please let me know if I can or should be doing anything differently on my end.

screen shot 2018-06-21 at 11 51 28 am screen shot 2018-06-21 at 11 51 58 am screen shot 2018-06-21 at 11 52 25 am

screen shot 2018-06-21 at 11 53 11 am