bnosac / crfsuite

Labelling Sequential Data in Natural Language Processing with R - using CRFsuite

Intent Classification #11

Closed Btibert3 closed 5 years ago

Btibert3 commented 5 years ago

The suite of tools that you have built is fantastic, especially with the foresight to create a flexdashboard app to help tag entities in text. With that said, you mentioned the ability to do intent classification. Is the idea that the entire text is the chunk with the category applied? I am attempting to walk through your code examples in the README but my R session is hanging.

Obviously not a bug, but more of an information/example request.

jwijffels commented 5 years ago

The idea in intent classification is to extract a certain part of the text (a chunk) that covers a certain intent, where the intent is a category. You can construct training data to build such a model with the flexdashboard app. Is that clear? Technically, intent classification is the same as entity detection. On which example are you stuck?
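To make the "chunk with a category" idea concrete, here is a minimal, library-agnostic sketch in Python. It is not the crfsuite API, and it assumes an IOB-style labelling scheme (B- for the first token of a chunk, I- for the following ones, O for everything else). The sentence, the `BOOK_TABLE` category, and the `label_chunk` helper are all hypothetical.

```python
def label_chunk(tokens, start, end, category):
    """Return one IOB label per token, marking tokens[start:end] as `category`.

    Hypothetical helper for illustration only; not part of any library.
    """
    labels = ["O"] * len(tokens)
    for i in range(start, end):
        prefix = "B-" if i == start else "I-"
        labels[i] = prefix + category
    return labels


# Made-up utterance: the chunk "book a table" carries the intent.
tokens = "please book a table for two tonight".split()
labels = label_chunk(tokens, start=1, end=4, category="BOOK_TABLE")

# One token per row with its label, the usual shape of sequence-labelling data.
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Because the labels have the same shape as entity labels, the same kind of sequence model can be trained on them, which is the sense in which the two tasks are technically the same.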

Btibert3 commented 5 years ago

Got it. In my flow, and based on what I have been exposed to, the chunk would be for NER, but this is a fantastic tool. I got hung up on working through the README when using the english2 data. That said, I could always shrink the data considerably just to follow along. I simply wanted to work through the code and mentally map how to make my data fit. It's early days for me, but the tooling is absolutely fantastic. Thanks for putting this out there.

jwijffels commented 5 years ago

Glad you like the toolsets (udpipe / crfsuite / textrank / BTM / tokenizers.bpe / ruimtehol) I've put online. I'm not sure exactly where you got hung up, but if you are investigating the modelling and the modelling flow, just use a sample of that Wikipedia extract. Good luck.
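One way to take such a sample is to draw whole documents rather than individual rows, so that each token sequence stays intact. The sketch below is a hedged, library-agnostic illustration in Python; the `doc_id`, `token`, and `label` column names, the toy data, and the `sample_documents` helper are assumptions, not the layout of the actual dataset.

```python
import random


def sample_documents(rows, n_docs, seed=42):
    """Keep every row belonging to a random subset of documents.

    Hypothetical helper: `rows` is a list of dicts with a "doc_id" key.
    Sampling by document avoids cutting a token sequence in half.
    """
    doc_ids = sorted({row["doc_id"] for row in rows})
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    keep = set(rng.sample(doc_ids, min(n_docs, len(doc_ids))))
    return [row for row in rows if row["doc_id"] in keep]


# Toy stand-in for a token-level dataset: 10 documents of 3 tokens each.
rows = [
    {"doc_id": d, "token": f"tok{d}_{i}", "label": "O"}
    for d in range(1, 11)
    for i in range(3)
]

small = sample_documents(rows, n_docs=3)
print(len({r["doc_id"] for r in small}), "documents,", len(small), "tokens")
```

Training on the reduced set will give a weaker model, but it is enough to follow the modelling flow end to end before scaling back up.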