niekveldhuis / Digital-Assyriology

Tools and Examples for Computational Text Analysis for Assyriologists.

etcsl scraper #21

Closed niekveldhuis closed 7 years ago

niekveldhuis commented 8 years ago

You should have received the ETCSL data by now. Create the same interactive format as you did for ORACC. Note that the second notebook (which harmonizes ETCSL lemmatization with ORACC lemmatization) depends in part on the output format of the scraper.

niekveldhuis commented 8 years ago

Restrict the user's options to those that yield useful results. The user may filter on language, citation form, guideword, or POS, but only 'language' may be included in or omitted from the output. Otherwise the transformation from ETCSL to ORACC lemmatization does not work properly.
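
A minimal sketch of that restriction, in case it helps. Everything named here is an assumption for illustration, not the scraper's actual interface: the field names `lang`, `cf`, `gw`, and `pos`, the function `select_lemmas`, and the sample rows are all hypothetical. The idea is that filtering is allowed on all four fields, while the output always carries citation form, guideword, and POS, and only the language column is optional.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical field names; the real scraper output may use different ones.
REQUIRED_FIELDS = ("cf", "gw", "pos")          # always written to the output
FILTERABLE_FIELDS = ("lang", "cf", "gw", "pos")  # the only keys a user may filter on


def select_lemmas(
    rows: Iterable[Dict[str, str]],
    filters: Optional[Dict[str, str]] = None,
    include_language: bool = True,
) -> List[Dict[str, str]]:
    """Filter rows on the allowed fields and return a fixed set of columns."""
    filters = filters or {}

    # Reject filter keys outside the allowed set instead of silently ignoring them.
    unknown = set(filters) - set(FILTERABLE_FIELDS)
    if unknown:
        raise ValueError(f"cannot filter on: {sorted(unknown)}")

    # Only 'lang' is optional; cf, gw and pos are always kept so that a later
    # harmonization step can rely on them being present.
    out_fields = (("lang",) if include_language else ()) + REQUIRED_FIELDS

    result = []
    for row in rows:
        if all(row.get(key) == value for key, value in filters.items()):
            result.append({field: row.get(field, "") for field in out_fields})
    return result


# Illustrative rows only, not taken from the ETCSL data.
SAMPLE = [
    {"lang": "sux", "cf": "lugal", "gw": "king", "pos": "N"},
    {"lang": "sux", "cf": "e", "gw": "house", "pos": "N"},
    {"lang": "sux", "cf": "dug", "gw": "speak", "pos": "V"},
]

nouns = select_lemmas(SAMPLE, filters={"pos": "N"}, include_language=False)
```

In an interactive notebook the same constraint could be enforced in the interface itself, for example by offering a single checkbox for the language column and no controls for the other output columns.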