bnosac / udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
https://bnosac.github.io/udpipe/en
Mozilla Public License 2.0

note to myself #37

Closed jwijffels closed 3 years ago

jwijffels commented 5 years ago

Add to the docs that cooccurrence.data.frame works in a group-by fashion: it does not take the sequence of terms into account and does not return self-occurrences, and because the output has no order (bag of terms), term1 is always smaller than term2. This still needs to be formulated more concisely. cooccurrence.character, by contrast, goes left to right; maybe an option for right to left is needed as well. Note that in Biterm Topic Modelling (https://github.com/bnosac/BTM) cooccurrences occur within a window, which is a bit different.
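To make the distinction concrete, here is a minimal sketch of the three counting styles described in the note. It is written in Python purely as an illustration and is not the udpipe or BTM implementation. The function names `cooc_bag_of_terms`, `cooc_left_to_right` and `cooc_window` are hypothetical, and so are the details: counting each group once per pair, pairing only directly adjacent tokens, and defining the window as a maximum distance in positions are all assumptions.

```python
from collections import Counter
from itertools import combinations


def cooc_bag_of_terms(groups):
    """Bag-of-terms style: count in how many groups two distinct terms both occur."""
    counts = Counter()
    for terms in groups:
        # set() drops repeats, so a term is never paired with itself;
        # sorted() means term1 < term2 in every emitted pair.
        for term1, term2 in combinations(sorted(set(terms)), 2):
            counts[(term1, term2)] += 1
    return counts


def cooc_left_to_right(tokens):
    """Sequence style: count how often a token is directly followed by another."""
    counts = Counter()
    # Order is kept, so ("b", "a") and ("a", "b") are different pairs,
    # and a repeated token yields a self-pair such as ("a", "a").
    for term1, term2 in zip(tokens, tokens[1:]):
        counts[(term1, term2)] += 1
    return counts


def cooc_window(tokens, window):
    """Window style (assumed definition): unordered pairs of tokens that are
    at most `window` positions apart."""
    counts = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            counts[tuple(sorted((left, right)))] += 1
    return counts


groups = [["b", "a", "a"], ["a", "b", "c"]]
tokens = ["b", "a", "a", "b"]

bag = cooc_bag_of_terms(groups)
seq = cooc_left_to_right(tokens)
win = cooc_window(tokens, window=2)
```

In this toy input the bag-of-terms count never contains a pair like ("a", "a") or ("b", "a"), while the left-to-right count contains both, which is the difference the docs need to spell out.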

jwijffels commented 3 years ago

Documented in https://github.com/bnosac/udpipe/commit/d429a7647544b7b9c1a277d6338aa5ed28987902