Princeton-CDH / pemm-scripts

scripts & tools for the Princeton Ethiopian Miracles of Mary project
Apache License 2.0

As a researcher, I want high-confidence incipits from Google Sheets automatically indexed into the incipit search tool so that my searches are based on current data. #30

Closed rlskoeser closed 4 years ago

rlskoeser commented 4 years ago

Notes for testing

Incipit data from the test spreadsheet TEST PEMM 2020-02-21 should be indexed in the incipit search. Right now, the script is configured to run every 15 minutes. Make changes in the spreadsheet, wait for the next scheduled run (up to 15 minutes), and then check the incipit search.
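For anyone reproducing these tests, the selection rule being exercised (only incipits labelled high confidence get indexed) can be sketched roughly as below. This is a minimal illustration, not the project's actual script: the column names, the CSV export, and the function name `high_confidence_incipits` are all assumptions, and the sample rows are placeholders.

```python
import csv
import io

# Hypothetical column names; the real spreadsheet headers may differ.
STORY_COL = "Canonical Story ID"
INCIPIT_COL = "Incipit"
CONFIDENCE_COL = "Confidence Score"


def high_confidence_incipits(csv_text):
    """Return (story_id, incipit) pairs for rows labelled high confidence."""
    rows = csv.DictReader(io.StringIO(csv_text))
    selected = []
    for row in rows:
        incipit = (row.get(INCIPIT_COL) or "").strip()
        confidence = (row.get(CONFIDENCE_COL) or "").strip().lower()
        # Rows with no incipit, no score, or a medium/low score are skipped.
        if incipit and confidence == "high":
            selected.append((row[STORY_COL].strip(), incipit))
    return selected


# Placeholder data only, loosely modelled on the cases tested in this thread.
SAMPLE = """Canonical Story ID,Incipit,Confidence Score
311,incipit text A,High
48,incipit text B,Medium
103,1346fghjr;149gyuhs765b,Low
145,incipit text C,
"""

print(high_confidence_incipits(SAMPLE))
```

With the placeholder rows above, only story 311 would be passed on to the index; the medium, low, and unscored rows are dropped.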

elambrinaki commented 4 years ago

Test 1 Added the following incipit "ተኣምሪሃ፡ ለእግዝእትነ፡ ቅድስት፡ ድንግል፡ ማርያም፡ ጸሎታ፡ ወበረከታ፡ ያሀሉ፡ ምስለ፡ ገብራ፡ {er. } ክርስቶስ፡ ወምስለ፡ ኵሎሙ፡ አግብርቲሃ፡ በነግህ፡ ወሠርክ። ለዓለመ፡ ዓለም፡ አሜን፨ ወሀለወት፡ አሐቲ፡ ቤተ፡ ክርስቲያን፡ በሀገረ፡ ሳም፡ ዘተሐንፀት፡ በስመ፡ እግዝእትነ፤ ቅድስት፡ ድንግል፤ በክልኡ፡ ማርያም፡ ወባቲ፡ ንዋየ፡ ብዙኃ፨ ወበአ(f. 23rb)ሐቲ፤ እመዋዕል፡ ተማከሩ፡ ፈያት፤ ከመ፡ ይሥርቁ፡ ንዋየ፡ ቤተ፡ ክርስቲያን፨ " (taken from PEM (PUL) 8, Miracle 13: ff. 23ra-25ra) to PEM (PUL) 8, Miracle 24: ff. 43ra-43vb, canonical story 311. Added the high-confidence label.

Before the addition, when searching for this incipit, ID 311 was not among the search results.

After adding the incipit, 311 became the first result.

Test 2 Took the incipit (፮ተአምሪሃ፡ ለእግዝእትነ፡ ቅድስት፡ ድንግል፡ ማርያም፡ ወላዲተ፤ አምላክ፨ ጸሎታ፤ ወበረከታ፤ የሀሉ፤ ምስለ፤ ገብራ፡ ክርስቶስ፡ ለዓለመ፡ ዓለም፡ አሜን። ወሀሎ፡ ፩፡ መስተፅዕነ፡ ፈረስ፡ ዘስሙ፡ ኒቆዲሞስ፤ ኃጥእ፡ በኵሉ፤ ፍናዊሁ፡ ዓለማዊት፨ ወባሕቱ፤ ጸጋ፤ እግዚኣብሔር፤ ወስእለተ፡ እግዝእትነ፡ ማርያም፡ መርሐቶ፡ ኃበ፡ መድኃኒተ፡ ነፍሱ፡ ወነስሐ፡ በእንተ፡ ኃጢአቱ።) from PEM (PUL) 8, Miracle 6: ff. 10vb-12va (canonical story ID 139). Added every other word of this incipit as the incipit for AECE (HMML) 1, 17vb (canonical story ID 48). Searched for the unmodified incipit before and after the change in the spreadsheet. Before the change, 139 was among the search results but 48 was not (screenshot: Test 4).

After the change, 48 became the first result and 139 the second (screenshot: Test 5).
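The "every other word" variant used in Test 2 can be built mechanically. The sketch below splits a Ge'ez string on the Ethiopic wordspace and punctuation marks (U+1361 to U+1368) and keeps alternate tokens; the `containment` helper is only a rough way to see why the thinned incipit still overlaps heavily with the original. The function names are hypothetical and this is not a description of how the incipit search tool scores matches.

```python
import re

# Ethiopic wordspace and punctuation (U+1361 to U+1368) plus whitespace.
SEPARATORS = re.compile(r"[\u1361-\u1368\s]+")


def tokenize(incipit):
    """Split an incipit into words, dropping separators and empty strings."""
    return [tok for tok in SEPARATORS.split(incipit) if tok]


def every_other_word(incipit):
    """Keep the 1st, 3rd, 5th, ... word of the incipit."""
    return tokenize(incipit)[::2]


def containment(candidate_tokens, query_tokens):
    """Fraction of candidate tokens that also occur in the query."""
    if not candidate_tokens:
        return 0.0
    query = set(query_tokens)
    return sum(tok in query for tok in candidate_tokens) / len(candidate_tokens)


# A short phrase taken from the Test 2 incipit above.
sample = "ወሀሎ፡ ፩፡ መስተፅዕነ፡ ፈረስ፡ ዘስሙ፡ ኒቆዲሞስ፤"
thinned = every_other_word(sample)
print(len(tokenize(sample)), len(thinned))
print(containment(thinned, tokenize(sample)))
```

Every token of the thinned incipit also appears in the full one, which is consistent with story 48 showing up for a search on the unmodified text, although the actual ranking depends on the search tool's own scoring.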

Test 3 Added a garbage incipit (1346fghjr;149gyuhs765b) to C-Berlin (BS) 23, canonical story 103. Before adding the incipit, the search for 1346fghjr;149gyuhs765b didn't produce results (screenshot: Test 6).

After adding the incipit, the search produced results (screenshot: Test 7).

elambrinaki commented 4 years ago

Test 1 Removed the confidence score from the incipit added to PEM (PUL) 8, Miracle 24 (canonical story 311). Ran the search for the whole incipit. 311 was not among the search results.

Test 2 For the incipit added to AECE (HMML) 1, 17vb (canonical story 48), changed the confidence score from high to medium. Repeated the search for the whole unmodified incipit. 48 didn't appear among the results.

Test 3 For the garbage incipit added to C-Berlin (BS) 23 (canonical story 103), changed the confidence score to low. The search no longer produced results.
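These three checks all cover the removal side: an incipit that was indexed must drop out once its score is cleared or lowered. One way to think about that step is as a comparison between what is currently indexed and what the sheet now marks as high confidence. The sketch below is an assumption-laden illustration (the dictionary keys, the `(manuscript, story_id)` index key, and `plan_index_update` are all hypothetical); the real script may simply rebuild the index on every run.

```python
def plan_index_update(indexed_keys, sheet_rows):
    """Compare the current index with the sheet and return (to_add, to_remove)."""
    wanted = {
        (row["manuscript"], row["story_id"])
        for row in sheet_rows
        # Only non-empty incipits labelled "high" should stay indexed.
        if row.get("incipit") and (row.get("confidence") or "").lower() == "high"
    }
    indexed = set(indexed_keys)
    return sorted(wanted - indexed), sorted(indexed - wanted)


# State after the first round of tests: all three incipits were indexed.
indexed = [
    ("PEM (PUL) 8, Miracle 24", "311"),
    ("AECE (HMML) 1, 17vb", "48"),
    ("C-Berlin (BS) 23", "103"),
]

# Sheet state after this round: score removed, set to medium, set to low.
rows = [
    {"manuscript": "PEM (PUL) 8, Miracle 24", "story_id": "311",
     "incipit": "placeholder A", "confidence": ""},
    {"manuscript": "AECE (HMML) 1, 17vb", "story_id": "48",
     "incipit": "placeholder B", "confidence": "medium"},
    {"manuscript": "C-Berlin (BS) 23", "story_id": "103",
     "incipit": "1346fghjr;149gyuhs765b", "confidence": "low"},
]

to_add, to_remove = plan_index_update(indexed, rows)
print(to_add)
print(to_remove)
```

Under these assumptions nothing is added and all three entries are scheduled for removal, matching the behaviour observed in the tests above.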

elambrinaki commented 4 years ago

Deleted the last three words in the existing incipit (ንትመየጥኬ፡ ኀበ፡ ዜና፡ ነገር፡ ዘጽኑሕ፡ ለነ፡ በእንተ፡ መቅደስ፡ ዘሐነጸ፡ ሰሎሞን፡ ወልደ፡ ዳዊት፡ ይረድኦ፡ ኪራም፡ ወልደ፡ መበለት), EMML (HMML) 642, 74vb, canonical story 497.

Before the change, the search for the whole incipit returned the results shown in the screenshot (Test 2).

The change is visible in the search results (screenshot: Test 3).

elambrinaki commented 4 years ago

Test 1 Searched for the incipit from PEM (PUL) 8, Miracle 10: ff. 17vb-19va (ID 145): ፲፡ ተአምሪሃ፡ ለእግዝእትነ፡ ቅድስት፡ ድንግል፡ ማርያም፡ ጸሎታ፡ ወበረከታ፡ የሀሉ፡ ምስለ፤ ገብራ፤ {er. } ክርስቶስ፡ {er. } ለዓለመ፡ ዓለም፤ አሜን፨ ወሀሎ፤ ፩ቀሲስ፤ ውስተ፡ ፩፡ መካን፤ ኀበ፡ ሀለዉ፤ ብዙኀ፤ ሕዝብ፨ ወአል(f. 18ra)ቦ፤ ካልአ፡ ዘያአምር፤ ዘአንበለ፡ ቅዳሴሃ፤ ለማርያም፡ ወሠናይ፤ ኂሩቱ፡ ፈድፋደ፤ ለውእቱ፡ ቀሲስ፨ ወባሕቱ፤ ኮነ፤ ብእሲ፡ የዋሕ፨ ወኢየአምር፤ መጻሕፍተ፤ (screenshot: Test 8)

Removed the existing incipit for canonical story 145 from EMML (HMML) 7543, 90vb (ወሀሎ፡ አሐዱ፡ ካህን፡ ቀሲስ፡ በሀገረ፡ ቂሳርያ፡ ዘስሙ፡ እንድርያስ፡ ወሀሎ፡ በህየ፡ ቤተ፡ ክርስቲያን፡ ዘተሐንጸ፡ በስማ፡ ለእግዝእትነ፡…. ወብዙኅ፡ ሰብእ፡ ካህናት፡ ወዲያቆናት፡ ይትጋብኡ፡ ውስቴታ፡ ወይነግሩ፡ በእንተ፡ ዕበየ፡ ተአምሪሃ፡ ዘትገብር፡ እስመ፡ ውእቱ፡ ቀሲስ፡ ዘስሙ፡ እንድርያስ፡ ኢየአምር፡ ካልአ፡ ቅዳሴ፡ ዘእንበለ፡ ቅዳሴሃ፡ ለእግዝእትነ፡ ማርያም). Repeated the search for the incipit from PEM (PUL) 8. Canonical story 145 is no longer among the search results (screenshot: Test 9).

Test 2 Working with the same incipits, now instead of deleting the incipit, changed its confidence score to medium. The result is as expected: 145 doesn't appear among the search results.

WendyLBelcher commented 4 years ago

Evgeniia tested these, found they all worked, so this issue can be closed.