SunoikisisDC / SunoikisisDC-2021-2022


Problem with Diorisis Corpus #8

Closed (lettychardon closed this issue 2 years ago)

lettychardon commented 2 years ago

Hi all,

When I tried to enter a text from the Diorisis corpus into Voyant, the only thing it picked up was the publishing info. So instead of an analysis of the Greek text and Greek words appearing, I just got things like 'university' and 'published'... Has anyone managed to use the corpus successfully?

Thank you and all the best,

Letty

gabrielbodard commented 2 years ago

This is something I was going to demo tomorrow morning, yes. They claim that you can upload XML to Voyant and it knows to ignore tags etc., but I find it messy. What I did was use find-and-replace to clear out everything from the XML file except for the lemmata that I wanted to keep.
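
For anyone who would rather script that clean-up than do it by find-and-replace, here is a minimal sketch in Python. It swaps the find-and-replace approach for an XML parser, so it is not the exact method described above. It assumes each lemma is stored as an `entry` attribute on a `<lemma>` element; check that against your own Diorisis file and adjust the `tag` and `attr` arguments if it differs. The function name `extract_lemmata` and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET


def extract_lemmata(xml_text, tag="lemma", attr="entry"):
    """Return the lemma strings found in xml_text, in document order.

    ASSUMPTION: lemmata live in an `entry` attribute on <lemma> elements.
    Change `tag` / `attr` if your file is structured differently.
    """
    root = ET.fromstring(xml_text)
    lemmata = []
    for element in root.iter():
        # Drop a namespace prefix such as "{http://...}lemma" if present.
        local_name = element.tag.rsplit("}", 1)[-1]
        value = element.get(attr)
        if local_name == tag and value:
            lemmata.append(value)
    return lemmata


# Hypothetical, heavily simplified sample; a real corpus file is much larger.
SAMPLE = """<TEI.2>
  <teiHeader>
    <publicationStmt><publisher>University</publisher></publicationStmt>
  </teiHeader>
  <text><body><sentence id="1">
    <word form="mh=nin" id="1"><lemma entry="μῆνις"/></word>
    <word form="a)/eide" id="2"><lemma entry="ἀείδω"/></word>
  </sentence></body></text>
</TEI.2>"""

if __name__ == "__main__":
    # One space-separated line of lemmata, ready to paste into Voyant.
    print(" ".join(extract_lemmata(SAMPLE)))
```

To run it on a real file, read the file as text (for example `open(path, encoding="utf-8").read()`), pass that to `extract_lemmata`, and save the joined result as a plain `.txt` file. Because only the lemma values are kept, the header text such as the publisher name never reaches Voyant.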