sophieball / toxicity-detector


Convokit Prompt types: label prompt types #52

Closed sophieball closed 4 years ago

sophieball commented 4 years ago

Predicting Conversations Gone Awry With Convokit: https://github.com/CornellNLP/Cornell-Conversational-Analysis-Toolkit/blob/master/convokit/forecaster/tests/cumulativeBoW_demo.ipynb

sophieball commented 4 years ago

Update: the technique I linked above doesn't work on GH data. It uses unigram features and logistic regression; training took too long and the process was killed. A sketch of the default setup follows the log output below.

No model passed to Forecaster. Initializing default forecaster model: Cumulative Bag-of-words...
Initializing default unigram CountVectorizer...
Initializing default classification model (standard scaled logistic regression)
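
For reference, a minimal sketch of the default setup those log lines describe, assuming the ConvoKit 2.x API from the demo notebook linked above; the fit/transform calls follow ConvoKit's Transformer interface, and the corpus choice is my assumption:

    # Minimal sketch, assuming ConvoKit 2.x as in the linked demo notebook.
    from convokit import Corpus, Forecaster, download

    # Corpus choice is an assumption; this is the one the CGA examples use.
    corpus = Corpus(filename=download("conversations-gone-awry-corpus"))

    # Passing no model triggers the defaults shown in the log above:
    # cumulative bag-of-words, unigram CountVectorizer, scaled logistic regression.
    forecaster = Forecaster()

    forecaster.fit(corpus)                 # train on the corpus conversations
    corpus = forecaster.transform(corpus)  # annotate utterances with forecasts

In practice you would also tell the Forecaster how to read the derailment labels (e.g. a label function over utterance metadata); the exact parameter name is in the ConvoKit docs, so I left it out rather than guess.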

Trying this: https://github.com/CornellNLP/Cornell-Conversational-Analysis-Toolkit/tree/master/examples/conversations-gone-awry

This one also looks interesting: https://github.com/CornellNLP/Cornell-Conversational-Analysis-Toolkit/blob/master/examples/hyperconvo/demo_new.ipynb
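
If we go the hyperconvo route, the first step would look roughly like this; a minimal sketch assuming ConvoKit 2.x, where the corpus name and the "hyperconvo" metadata key are my assumptions from the demo:

    # Minimal sketch of the hyperconvo demo's first step, assuming ConvoKit 2.x.
    from convokit import Corpus, HyperConvo, download

    corpus = Corpus(filename=download("reddit-corpus-small"))  # corpus choice is an assumption
    hc = HyperConvo()                    # default hypergraph-feature parameters
    corpus = hc.fit_transform(corpus)    # stores features in conversation metadata

    # Inspect the features for one conversation ("hyperconvo" key per the demo).
    convo = corpus.random_conversation()
    print(convo.meta.get("hyperconvo"))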

sophieball commented 4 years ago

Update July 27:

sophieball commented 4 years ago

Update Aug 4: Talked to one of the authors, who confirmed that we've been on the right track. We ran out of time before I could ask how to label prompts.

sophieball commented 4 years ago

@CaptainEmerson Can you run the code from PR #72 and copy bad_conver.log here? It should contain something like the output below. Most of the file is comments, so you don't need to share the whole document. The useful (and less sensitive) information is things like

                  0         1         2         3         4         5  type_id
do>*       0.675650  0.977704  1.012123  1.033751  1.003250  0.974516      0.0
do>you     0.698812  1.096976  1.074800  1.029700  1.031715  0.970195      0.0
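
For context on reading that table: each row appears to be a phrasing ("do>*", "do>you"), columns 0-5 appear to be its distances to the six prompt-type clusters, and type_id appears to be the index of the nearest cluster (both rows above are closest to column 0). Here is a conceptual sketch of that kind of assignment, not ConvoKit's actual implementation; the k-means framing and the stand-in data are assumptions:

    # Conceptual sketch: cluster phrase vectors, then report the distance to each
    # centroid plus the nearest cluster's index, mimicking the table's layout.
    import numpy as np
    import pandas as pd
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = rng.random((100, 25))                        # stand-in phrase vectors
    phrasings = [f"phrase_{i}" for i in range(100)]  # stand-ins for "do>you" etc.

    km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(X)
    dists = km.transform(X)              # distance from each phrase to each cluster

    table = pd.DataFrame(dists, index=phrasings)
    table["type_id"] = dists.argmin(axis=1).astype(float)  # nearest cluster = type
    print(table.head(2))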
CaptainEmerson commented 4 years ago

On your last day, we looked over this file. I've also set a copy aside on my desktop in case we need it later. I'll close this issue, but feel free to reopen it if anything is left undone.