PerseusDL / lexica

Repo for the text files of lexica
Creative Commons Attribution Share Alike 4.0 International

Fix more `orth` issues (972 entries modified) #83

Closed nkprasad12 closed 1 year ago

nkprasad12 commented 1 year ago

This contains 3 commits:

  1. The first splits orths that were incorrectly combined. For example, it splits <orth>stūpa, stī-pa</orth> into <orth lang="la" extent="full">stūpa</orth>, <orth lang="la" extent="full">stī-pa</orth>.
  2. The second capitalizes orths that should have been capitalized but were not. These were detected by checking whether the first orth matches the entryFree's key attribute but differs in capitalization. In these cases, the first letters of the orths were changed to upper case to match the keys.
  3. The third contains a few miscellaneous typo fixes that I included in this PR for convenience: it fixes an Aenobarbus alt orth from Aen to Ahen (see pg 55) and removes a few accidental spaces in utinam.
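
For reviewers who want to reproduce the first two steps, here is a minimal sketch of how they could be done with Python's standard library. It is not the script used for these commits. The function names `split_combined_orths` and `capitalize_orths_to_match_key` are hypothetical, and the sketch makes three assumptions: an orth counts as "combined" whenever its text contains a comma, keys are compared to orths after stripping diacritics, hyphens, and digits, and every orth in a matching entry is capitalized.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Attributes the PR example shows on the split-out orths.
ORTH_ATTRS = {"lang": "la", "extent": "full"}


def split_combined_orths(entry):
    """Replace each <orth> holding comma-separated forms with one <orth> per form."""
    changed = False
    for parent in list(entry.iter()):
        # Walk children in reverse so insertions do not shift pending indexes.
        for index, child in reversed(list(enumerate(parent))):
            if child.tag != "orth" or len(child) > 0:
                continue
            forms = [f.strip() for f in (child.text or "").split(",") if f.strip()]
            if len(forms) < 2:
                continue
            pieces = []
            for form in forms:
                piece = ET.Element("orth", dict(ORTH_ATTRS))
                piece.text = form
                piece.tail = ", "  # keep a visible separator between forms
                pieces.append(piece)
            pieces[-1].tail = child.tail  # preserve the text that followed
            parent.remove(child)
            for offset, piece in enumerate(pieces):
                parent.insert(index + offset, piece)
            changed = True
    return changed


def _bare(text):
    """Assumed normalization: drop diacritics, hyphens, digits; keep letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if ch.isalpha())


def capitalize_orths_to_match_key(entry):
    """Upper-case the first letter of the orths when the first orth matches
    the entry's key attribute except for capitalization."""
    key = _bare(entry.get("key", ""))
    first = entry.find(".//orth")
    if not key or first is None or not first.text:
        return False
    bare = _bare(first.text)
    if bare == key or bare.lower() != key.lower() or not key[0].isupper():
        return False
    for orth in entry.iter("orth"):
        if orth.text:
            orth.text = orth.text[0].upper() + orth.text[1:]
    return True


if __name__ == "__main__":
    # Hypothetical entry built from the example in step 1.
    entry = ET.fromstring(
        '<entryFree key="stupa"><orth>stūpa, stī-pa</orth>, ae, f.</entryFree>'
    )
    split_combined_orths(entry)
    print(ET.tostring(entry, encoding="unicode"))
```

Both functions modify the entry in place and return whether anything changed, which makes it easy to count modified entries across a file.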