R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
Mozilla Public License 2.0
Avoid confusion when a data.frame with duplicate doc_ids is passed to udpipe in parallel #94
Open
jwijffels opened 3 years ago
In that case, paragraph_id is computed within each chunk of doc_ids sent to a core, not across the full data.frame.
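A minimal base R sketch of how the situation could be detected before annotating in parallel. The data.frame `x`, the round-robin `chunk` assignment, and the collapse-by-doc_id workaround are all assumptions made for illustration; they are not udpipe's actual chunking logic, nor a fix confirmed in this issue.

```r
# Hypothetical input: doc_id "a" appears on two rows.
x <- data.frame(
  doc_id = c("a", "a", "b", "c"),
  text   = c("First part.", "Second part.", "Other doc.", "Third doc."),
  stringsAsFactors = FALSE
)

# Find doc_ids that occur more than once.
dup_ids <- unique(x$doc_id[duplicated(x$doc_id)])

# Simulate distributing rows over 2 cores (illustrative round-robin split,
# NOT how udpipe itself assigns rows to chunks).
chunk <- rep_len(1:2, nrow(x))

# Count in how many chunks each doc_id ends up.
chunks_per_doc <- tapply(chunk, x$doc_id, function(i) length(unique(i)))
split_docs <- names(chunks_per_doc)[chunks_per_doc > 1]

if (length(split_docs) > 0) {
  warning("doc_id(s) spread over several chunks: ",
          paste(split_docs, collapse = ", "))
}

# One possible workaround (an assumption, not the package's behaviour):
# merge all text belonging to the same doc_id into one row beforehand.
collapsed <- aggregate(text ~ doc_id, data = x, FUN = paste, collapse = "\n\n")
```

With unique doc_ids in `collapsed`, every document sits in exactly one chunk, so any per-chunk numbering can no longer diverge for the same doc_id.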