CopticScriptorium / corpora

Public repository for Coptic SCRIPTORIUM Corpora Releases

Morph Errors #30

Closed lancealanmartin closed 4 years ago

lancealanmartin commented 5 years ago

All corpora need to be checked for ⲣⲙⲙⲟⲁ. Should be ⲣⲙ-ⲙ-ⲟⲁ.

ctschroeder commented 4 years ago

Hi. I'm going through some of these issues before publication. I'm not seeing ⲣⲙⲙⲟⲁ in our published corpora. Do you remember where you saw it @lancealanmartin? Thank you! https://corpling.uis.georgetown.edu/annis/scriptorium#_q=bm9ybV9ncm91cD0vLirisqPispnispnisp_isoEuKi8&_c=YXBvcGh0aGVnbWF0YS5wYXRydW0sYmVzYS5sZXR0ZXJzLGRvYy5wYXB5cmksam9oYW5uZXMuY2Fub25zLG1hcnR5cmRvbS52aWN0b3IscHNldWRvLnRoZW9waGlsdXMsc2FoaWRpYy5vdCxzYWhpZGljYS4xY29yaW50aGlhbnMsc2FoaWRpY2EubWFyayxzYWhpZGljYS5udCxzaGVub3V0ZS5hMjIsc2hlbm91dGUuYWJyYWhhbSxzaGVub3V0ZS5kaXJ0LHNoZW5vdXRlLmVhZ2VybmVzcyxzaGVub3V0ZS5mb3g&cl=5&cr=5&s=0&l=10&_seg=bm9ybV9ncm91cA

lancealanmartin commented 4 years ago

Sorry! I misspelled the word. It should be ⲣⲙⲙⲁⲟ.

ctschroeder commented 4 years ago

oh goodness I should have seen that. No worries! https://corpling.uis.georgetown.edu/annis/scriptorium#_q=bm9ybT0v4rKj4rKZ4rKZ4rKB4rKfLw&_c=YmVzYS5sZXR0ZXJzLHNoZW5vdXRlLmEyMixqb2hhbm5lcy5jYW5vbnMsc2hlbm91dGUuYWJyYWhhbSxzaGVub3V0ZS5lYWdlcm5lc3Msc2hlbm91dGUuZGlydCxzYWhpZGljLm90LGFwb3BodGhlZ21hdGEucGF0cnVtLHNhaGlkaWNhLm50LHNhaGlkaWNhLjFjb3JpbnRoaWFucyxwc2V1ZG8udGhlb3BoaWx1cyxzaGVub3V0ZS5mb3gsc2FoaWRpY2EubWFyayxkb2MucGFweXJpLG1hcnR5cmRvbS52aWN0b3I&cl=5&cr=5&s=0&l=10&_seg=bm9ybV9ncm91cA
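
Note for readers comparing the two ANNIS links in this thread: the `_q` value in each URL fragment appears to be the search query encoded as URL-safe Base64 with the padding stripped. A minimal Python sketch for decoding it, under that assumption (the function name `decode_annis_param` is made up for illustration and is not part of ANNIS):

```python
import base64


def decode_annis_param(value: str) -> str:
    """Decode one fragment parameter (e.g. _q) from an ANNIS link.

    Assumes URL-safe Base64 with the trailing '=' padding removed,
    which is how the links in this thread look.
    """
    # Restore the padding that Base64 decoding requires.
    padded = value + "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


# Example calls, using the _q values from the two links in this thread:
#   decode_annis_param("bm9ybV9ncm91cD0vLirisqPispnispnisp_isoEuKi8")
#   decode_annis_param("bm9ybT0v4rKj4rKZ4rKZ4rKB4rKfLw")
```

Decoded this way, the first link searches `norm_group=/.*ⲣⲙⲙⲟⲁ.*/` (the misspelled form, hence no hits) and the second searches `norm=/ⲣⲙⲙⲁⲟ/`.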

ctschroeder commented 4 years ago

reopening because I'm seeing this in a doc that was run through NLP before v. 3.0 of the NLP tools. Need to check Johannes corpus texts currently being annotated.

lgessler commented 4 years ago

question--when yall correct these errors where do you correct them? ANNIS, GitDox, somewhere else? Just wanna make sure they'll eventually make it onto the site

lancealanmartin commented 4 years ago

Gitdox!

On Fri, Oct 11, 2019 at 4:02 AM Luke Gessler notifications@github.com wrote:

question--when yall correct these errors where do you correct them? ANNIS, GitDox, somewhere else?


ctschroeder commented 4 years ago

Yes, GitDox. Changes are not/should not be made directly in ANNIS because that's not the core data that gets archived or edited further in future rounds.


amir-zeldes commented 4 years ago

The publish script pulls data from GitDox and generates the public corpora repo, which includes relANNIS dumps. Those are then imported into ANNIS, while the TT folder is the basis of the CTS interface. So GitDox is where anything that is not added automatically should be corrected (parses for non-treebanked data and MWE are auto-added during publication, and gold parses live in the UD_Coptic repo)

ctschroeder commented 4 years ago

all checked