bnosac / udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
https://bnosac.github.io/udpipe/en
Mozilla Public License 2.0

Avoid confusion when passing a data.frame with duplicate doc_id's to udpipe in parallel #94

Open jwijffels opened 3 years ago

jwijffels commented 3 years ago

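In that case, paragraph_id is computed within each chunk of doc_id's sent to a core, so rows sharing a doc_id that end up in different chunks are numbered independently of each other.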
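To make the problem concrete, here is a minimal simulation of the chunking effect. It is a sketch in Python, not udpipe's R code: `split_into_chunks` and `annotate_chunk` are hypothetical stand-ins, and the assumption is that each worker numbers paragraphs per doc_id starting from 1 within its own chunk.

```python
def split_into_chunks(rows, n_chunks):
    """Split rows into n_chunks contiguous chunks of near-equal size."""
    size, rest = divmod(len(rows), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < rest else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def annotate_chunk(chunk):
    """Hypothetical worker: number paragraphs per doc_id, seeing only this chunk."""
    counters = {}
    out = []
    for doc_id, text in chunk:
        counters[doc_id] = counters.get(doc_id, 0) + 1
        out.append((doc_id, counters[doc_id], text))
    return out


# Input with a duplicated doc_id ("a" occurs three times).
rows = [("a", "text 1"), ("a", "text 2"), ("a", "text 3"), ("b", "text 4")]

# Sequential run: a single chunk, so numbering is consistent per doc_id.
sequential = annotate_chunk(rows)

# Simulated parallel run on 2 cores: doc_id "a" is split over both chunks.
parallel = [row for chunk in split_into_chunks(rows, 2)
            for row in annotate_chunk(chunk)]

for doc_id, paragraph_id, text in parallel:
    print(doc_id, paragraph_id, text)
```

In the simulated parallel run the pair (doc_id, paragraph_id) is no longer unique: the third row of "a" gets paragraph_id 1 again because its worker never saw the first two rows.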

jwijffels commented 3 years ago

Or, if no doc_id's are provided, we get duplicate doc_id's when annotating in parallel.
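The missing doc_id case can be simulated the same way. This is again a Python sketch rather than udpipe's R code: `default_doc_ids` is a hypothetical helper, and the "doc1", "doc2", ... naming scheme generated per chunk is an assumption made for illustration.

```python
def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous chunks of near-equal size."""
    size, rest = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < rest else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def default_doc_ids(chunk):
    """Hypothetical worker default: name documents doc1..docN within the chunk."""
    return ["doc%d" % (i + 1) for i in range(len(chunk))]


texts = ["text 1", "text 2", "text 3", "text 4"]

# Each simulated worker generates ids without knowing about the other chunks.
ids = [doc_id for chunk in split_into_chunks(texts, 2)
       for doc_id in default_doc_ids(chunk)]
print(ids)

# One possible workaround: assign unique ids once, before splitting the input.
safe_ids = ["doc%d" % (i + 1) for i in range(len(texts))]
print(safe_ids)
```

Assigning the ids up front, before the input is split, keeps them unique regardless of how many cores are used.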