Closed: thatbudakguy closed this issue 4 years ago
I forgot to respond to the comment about synonyms on #21 — moving here since this issue is specific to the synonyms.
Fifth, re synonyms, maybe it does help to have them. In particular, maybe "1 and one" or "፩ and አሐዱ፡", which will be the most common substitution. Or did you say you can't do it with single characters?
Here's the synonym configuration file I created based on the spreadsheet you provided. (I added the corresponding Arabic numerals.) I think the entries are working as synonyms, but the matched synonym word isn't being highlighted; it's a little hard to tell. This might be another one you could test more explicitly by manufacturing some examples while you're testing #30.
My only question is about the Arabic numerals. I'm not sure if we should do that? For instance, does that mean that the 19 in this incipit would be converted into Ge'ez: ሀለወት፡ አሐቲ፡ ብእሲት፡ ዘቦአት፡ ው(f. 19vb)ስተ፡ ቤተ፡ ሞቅሕ፤ ዘውእቱ፤ ጸማዕት፡ ይእቲ፤ ብእሲት ፨ ወሖራ፤ {er. } ኃቤሃ፡ ክልኤ፤ አሐተ፤ ውርዝዋት፤ በስኖን፤ ወልሂቃት፤ በምግባሮን፤ ከመ፤ የሐውጻሃ፤ ወሶበ፤ ርእየቶን፤ ተአምኃቶን፨…
@WendyLBelcher I think it only applies to individual tokens, and right now the synonyms only cover 1-5. It doesn't convert anything; it just searches on all variants of that word.
I don't think there's likely to be any harm in including it, but I'd be glad to remove it because I'm not sure how helpful it is either.
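To make the token-level behaviour concrete, here is a minimal Python sketch of query-time synonym expansion. It is not the Solr implementation or the actual configuration file: the `SYNONYM_GROUPS` list, the `build_lookup` and `expand_query` helpers, and the simple split on whitespace and the Ge'ez word separator are all assumptions made for illustration. Only the ፩ / አሐዱ / 1 group is taken from this thread.

```python
import re

# Hypothetical synonym groups. Only this one group is mentioned in the thread;
# the real configuration file is not reproduced here.
SYNONYM_GROUPS = [
    {"፩", "አሐዱ", "1"},
]


def build_lookup(groups):
    """Map every term to the full set of variants in its group."""
    lookup = {}
    for group in groups:
        for term in group:
            lookup[term] = set(group)
    return lookup


def expand_query(query, lookup):
    """Split a query into tokens and expand each token to its variants.

    Tokenization here is an assumption: split on whitespace and on the
    Ge'ez word separator (፡). A real analyzer chain will differ.
    """
    tokens = [t for t in re.split(r"[\s፡]+", query) if t]
    # A token with no synonym entry only matches itself.
    return [lookup.get(tok, {tok}) for tok in tokens]


LOOKUP = build_lookup(SYNONYM_GROUPS)
```

Under these assumptions, `expand_query("1", LOOKUP)` yields all three variants, while `expand_query("19", LOOKUP)` yields only `"19"`, because the whole token is looked up rather than its individual digits. That is why a folio reference such as 19vb would not be turned into Ge'ez numerals.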
Okay, awesome. Let's leave as is, and I'm closing this issue.
notes
initial request from @WendyLBelcher posted in #25:
See also: Solr documentation on synonyms https://lucene.apache.org/solr/guide/6_6/managed-resources.html#ManagedResources-Synonyms