clarin-eric / resource-families-issues

Late 19th- and Early 20th-Century Polish Novels #214

Open jakoble opened 4 years ago

jakoble commented 4 years ago

http://hdl.handle.net/11321/57

tnaskret commented 3 years ago

Can you give us more information on what exactly the issue is?

jakoble commented 3 years ago

Thanks for the comment! Ah, I see it was unclear. What we want to know is the total size in words/tokens (rather than the file size) and whether the corpus is linguistically annotated in any way. Looking at the files themselves, it doesn't appear to be annotated. Still, we believe it's good practice to always state the annotation status explicitly, even when there is no annotation, since this is what external users looking for corpora are overwhelmingly interested in.
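
To illustrate the difference between token size and file size, here is a minimal sketch of how both could be computed. It assumes the corpus is available locally as UTF-8 plain-text files, which may not match the actual distribution format, and it uses a deliberately naive tokenizer (runs of word characters). The names `count_tokens` and `corpus_size`, and the `*.txt` pattern, are hypothetical and chosen only for illustration.

```python
import re
from pathlib import Path

# Naive tokenizer (an assumption): each run of word characters is one token.
# In Python 3, \w matches Unicode letters, so Polish diacritics are covered.
TOKEN_RE = re.compile(r"\w+")


def count_tokens(text):
    """Return the number of word tokens in a string."""
    return len(TOKEN_RE.findall(text))


def corpus_size(root, pattern="*.txt"):
    """Return (total_tokens, total_bytes) for all matching files under root."""
    tokens = 0
    size = 0
    for path in sorted(Path(root).rglob(pattern)):
        size += path.stat().st_size  # file size in bytes
        tokens += count_tokens(path.read_text(encoding="utf-8"))
    return tokens, size
```

Calling `corpus_size("path/to/corpus")` on an unpacked copy would give a rough token total alongside the byte total. A proper tokenizer for Polish would likely give a somewhat different count, so the result should be reported together with the tokenization method used.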