t-test on test data

sophieball commented 3 years ago

Perform t-test on test data rather than training data. The t-stats are qualitatively the same (same direction) but some of them are not significant on the held-out test data.
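For readers unfamiliar with the comparison, here is a minimal sketch of a two-sample t statistic computed on a held-out split. It assumes Welch's unequal-variance form, which the thread does not specify, and the feature values and group names are hypothetical placeholders rather than project data.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance uses the n - 1 (sample) denominator.
    se_a = statistics.variance(sample_a) / n_a
    se_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical values of one feature on the held-out test split, grouped by label.
test_pushback = [0.9, 1.1, 1.4, 0.8, 1.2]
test_no_pushback = [0.5, 0.7, 0.6, 0.9, 0.4]
t_stat, dof = welch_t(test_pushback, test_no_pushback)
```

The sign of `t_stat` gives the direction of the difference. Whether it is significant depends on the sample size of the split, which is one reason a smaller test set can lose significance while keeping the same direction.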

CaptainEmerson commented 3 years ago

Sophie, do you want me to run this as well? If so, what target do you want me to run, and what output do you need?

sophieball commented 3 years ago

Right! Should've mentioned. Please run it on both pushback and linguistic datasets. The target is train_classifier_g and the outputs should be the 2 .log files, the 3 roc_curve_*.png files, and features_xxxx.csv. Thanks~

sophieball commented 3 years ago

Also, @CaptainEmerson, when I was plotting feature importance, I noticed that the Google results are missing some politeness strategies, such as Apologizing. I used to drop them in convo_politeness.py, but not anymore. When you run the current code, can you check whether Apologizing is among the features? A search in the .log file is sufficient.
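A minimal sketch of that log search, assuming the feature name appears as plain text somewhere in the log. The helper name `lines_mentioning` is hypothetical and not part of the repo.

```python
def lines_mentioning(log_path, term):
    """Return (line number, line) pairs whose text contains term, ignoring case."""
    needle = term.lower()
    matches = []
    # errors="replace" keeps the scan going if the log has stray bytes.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line.lower():
                matches.append((number, line.rstrip("\n")))
    return matches
```

Calling `lines_mentioning("train_classifier.log", "Apologizing")` returns an empty list when the feature never shows up in the log.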

CaptainEmerson commented 3 years ago

Traceback (most recent call last):
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/main/train_classifier_g.runfiles/__main__/main/train_classifier_g.py", line 16, in <module>
    from src import suite
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/main/train_classifier_g.runfiles/__main__/src/suite.py", line 21, in <module>
    from src import convo_politeness
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/main/train_classifier_g.runfiles/__main__/src/convo_politeness.py", line 37, in <module>
    f = open("src/data/speakers_bots_full.list")
FileNotFoundError: [Errno 2] No such file or directory: 'src/data/speakers_bots_full.list'
Traceback (most recent call last):
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/__main__/src/convo_word_freq_diff.py", line 9, in <module>
    from convokit import Corpus, Speaker, Utterance
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/__init__.py", line 4, in <module>
    from .politenessStrategies import *
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/politenessStrategies/__init__.py", line 1, in <module>
    from .politenessStrategies import *
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/politenessStrategies/politenessStrategies.py", line 5, in <module>
    from convokit.text_processing.textParser import process_text
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/text_processing/__init__.py", line 2, in <module>
    from .textParser import *
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/text_processing/textParser.py", line 2, in <module>
    import spacy
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__spacy/spacy/__init__.py", line 10, in <module>
    from thinc.api import prefer_gpu, require_gpu, require_cpu  # noqa: F401
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/api.py", line 6, in <module>
    from .model import Model, serialize_attr, deserialize_attr
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/model.py", line 13, in <module>
    from .shims import Shim
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/shims/__init__.py", line 2, in <module>
    from .pytorch import PyTorchShim
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/shims/pytorch.py", line 18, in <module>
    from .pytorch_grad_scaler import PyTorchGradScaler
ModuleNotFoundError: No module named 'thinc.shims.pytorch_grad_scaler'
There were 22 warnings (use warnings() to see them)
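The first traceback fails on a path that is relative to the current working directory. One possible cause (an assumption, since the file may simply be missing from the build's data dependencies) is that the script is launched from a directory other than the repo root. A sketch of resolving the file relative to the module instead, with a hypothetical helper name:

```python
from pathlib import Path


def resolve_data_file(relative_path, module_file):
    """Find a data file relative to the module's parent directory, then the CWD."""
    module_dir = Path(module_file).resolve().parent
    candidates = [
        # If the module is <root>/src/x.py, this is <root>/<relative_path>.
        module_dir.parent / relative_path,
        Path.cwd() / relative_path,
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"{relative_path} not found; tried: {tried}")
```

Inside a module this would be called as `resolve_data_file("src/data/speakers_bots_full.list", __file__)`, and the error message lists every location that was tried.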

sophieball commented 3 years ago

I tried the current commit in a new directory, and it should work. No rush, though. The new output won't be too different from the previous one.

sophieball commented 3 years ago

> Right! Should've mentioned. Please run it on both pushback and linguistic datasets. The target is train_classifier_g and the outputs should be the 2 .log files, the 3 roc_curve_*.png files, and features_xxxx.csv. Thanks~

Added a ROC curve per Bogdan's suggestion. There will be 3 `.png` files.

CaptainEmerson commented 3 years ago

Traceback (most recent call last):
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/__main__/src/convo_word_freq_diff.py", line 9, in <module>
    from convokit import Corpus, Speaker, Utterance
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/__init__.py", line 4, in <module>
    from .politenessStrategies import *
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/politenessStrategies/__init__.py", line 1, in <module>
    from .politenessStrategies import *
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/politenessStrategies/politenessStrategies.py", line 5, in <module>
    from convokit.text_processing.textParser import process_text
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/text_processing/__init__.py", line 2, in <module>
    from .textParser import *
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/text_processing/textParser.py", line 2, in <module>
    import spacy
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__spacy/spacy/__init__.py", line 10, in <module>
    from thinc.api import prefer_gpu, require_gpu, require_cpu  # noqa: F401
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/api.py", line 6, in <module>
    from .model import Model, serialize_attr, deserialize_attr
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/model.py", line 13, in <module>
    from .shims import Shim
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/shims/__init__.py", line 2, in <module>
    from .pytorch import PyTorchShim
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/shims/pytorch.py", line 18, in <module>
    from .pytorch_grad_scaler import PyTorchGradScaler
ModuleNotFoundError: No module named 'thinc.shims.pytorch_grad_scaler'

sophieball commented 3 years ago

Probably because we previously limited thinc's version. It is in the newest version: https://github.com/explosion/thinc/blob/master/thinc/shims/pytorch_grad_scaler.py

I removed the constraint
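A quick way to confirm whether the installed thinc actually ships the missing submodule is to ask the import system directly. This is a sketch using only the standard library, and the helper name `module_available` is hypothetical.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if the import system can locate dotted_name."""
    try:
        # For a dotted name, find_spec imports the parent package first.
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package in the dotted path is itself missing.
        return False
```

Running `module_available("thinc.shims.pytorch_grad_scaler")` in the same environment as the failing target shows whether the version change took effect or an older copy is still being picked up.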

CaptainEmerson commented 3 years ago

Traceback (most recent call last):
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/__main__/src/convo_word_freq_diff.py", line 9, in <module>
    from convokit import Corpus, Speaker, Utterance
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/__init__.py", line 4, in <module>
    from .politenessStrategies import *
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/politenessStrategies/__init__.py", line 1, in <module>
    from .politenessStrategies import *
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/politenessStrategies/politenessStrategies.py", line 5, in <module>
    from convokit.text_processing.textParser import process_text
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/text_processing/__init__.py", line 2, in <module>
    from .textParser import *
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__convokit/convokit/text_processing/textParser.py", line 2, in <module>
    import spacy
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__spacy/spacy/__init__.py", line 10, in <module>
    from thinc.api import prefer_gpu, require_gpu, require_cpu  # noqa: F401
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/api.py", line 6, in <module>
    from .model import Model, serialize_attr, deserialize_attr
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/model.py", line 13, in <module>
    from .shims import Shim
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/shims/__init__.py", line 2, in <module>
    from .pytorch import PyTorchShim
  File "/usr/local/google/home/emersonm/toxicity-detector/bazel-bin/src/convo_word_freq_diff.runfiles/deps/pypi__thinc/thinc/shims/pytorch.py", line 18, in <module>
    from .pytorch_grad_scaler import PyTorchGradScaler
ModuleNotFoundError: No module named 'thinc.shims.pytorch_grad_scaler'

sophieball commented 3 years ago

:/ I forced it to use the latest version.

CaptainEmerson commented 3 years ago

I still have that error. It does look like train_classifier.log is created.

sophieball commented 3 years ago

Did you try bazel clean then rebuild?

sophieball commented 3 years ago

@CaptainEmerson When you finish the current run, can you pull the newest commit and run it again? I need all four results (previous commit with G and G-ling, new commit with G and G-ling). Thanks!

CaptainEmerson commented 3 years ago

The dependency issue is fixed, it looks like. But I get:

[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
Log saved in `/usr/local/google/home/emersonm/toxicity-detector/train_classifier.log`
sh: line 1: bazel-bin/src/convo_word_freq_diff: No such file or directory

I'm running from here: setwd("~/toxicity-detector")

sophieball commented 3 years ago

Can you see if src/convo_word_freq_diff.py is there? It's in the repo

sophieball commented 3 years ago

Also, no need to run the old code if you haven't done so. Just run the most up-to-date version.

sophieball commented 3 years ago

The difference is that the t-tests show that adjusting for SE words degrades the results, so I'm reporting results before the adjustment. Otherwise I don't know how to justify doing the adjustment.

CaptainEmerson commented 3 years ago

> Can you see if src/convo_word_freq_diff.py is there? It's in the repo

Yes, but it's not in bazel-bin. Does main/train_classifier_g depend on src/convo_word_freq_diff.py?

sophieball commented 3 years ago

Yes, it's under feed_data

r_binary(
    name = "feed_data",
    src = "feed_data.R",
    data = [
        ":train_classifier_g",
        ":train_polite_score",
        "//src:convo_word_freq_diff",
        "//src:find_SE_words",
        "//main:train_prompt_types",
        "//main:train_polite_prompt_classifier",
    ],
    deps = [
        ":politeness_logi",
        "@R_plyr",
        "@R_readr",
    ],
)

CaptainEmerson commented 3 years ago

But train_classifier_g doesn't depend on feed_data or convo_word_freq_diff, right?

sophieball commented 3 years ago

Right! feed_data is my own thing. You have something else. Nothing you need to run depends on word_freq_diff

CaptainEmerson commented 3 years ago

Ah, you are right, I think that the dependencies are fine. Do you need new results from convo_word_freq_diff?

sophieball commented 3 years ago

No. I only need the .log and the .pngs

CaptainEmerson commented 3 years ago

The pngs are getting overwritten on each run, right? I did a run yesterday/last night, and it ran both the regular version and then the linguistic version. So while I have the log files from both runs, I guess the PNGs I have are only the linguistic ones.
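One way to keep a second run from overwriting the first is to write each dataset's outputs into its own subdirectory. The sketch below is an assumption about how that could look, not the repo's current behavior, and `output_path` is a hypothetical helper.

```python
from pathlib import Path


def output_path(out_dir, dataset, stem, suffix):
    """Build an output path under a per-dataset directory, creating it if needed."""
    run_dir = Path(out_dir) / dataset
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir / f"{stem}{suffix}"
```

With this, `output_path("results", "pushback", "roc_curve", ".png")` and `output_path("results", "linguistic", "roc_curve", ".png")` point at different files, so both sets of plots survive a back-to-back run.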

sophieball commented 3 years ago

I can reconstruct graphs from logs

CaptainEmerson commented 3 years ago

I've uploaded the png files and the logs. I've started a run again with the new build.

Some notes:

- Are you always going to want 4 runs from here on out?
- The CSV files contain more information than should be written to disc, specifically individual records. Only aggregates should be written.
- The log file is getting very large, so large that I can't inspect all of it. Is there a way to slim the size, so that I can inspect what I'm sending to you? For instance, maybe you can split details that are used for debugging from data that you really need for the paper.
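The split suggested in the last note could be done with two loggers that write to separate files, so the results file stays small enough to inspect. This is a sketch using Python's standard `logging` module. The logger names and file paths are assumptions, not the repo's actual configuration.

```python
import logging


def make_loggers(debug_path, results_path):
    """Return (debug_log, results_log), each writing to its own file."""
    debug_log = logging.getLogger("train_classifier.debug")
    results_log = logging.getLogger("train_classifier.results")
    targets = (
        (debug_log, debug_path, logging.DEBUG),
        (results_log, results_path, logging.INFO),
    )
    for logger, path, level in targets:
        logger.setLevel(level)
        logger.propagate = False   # keep messages out of parent loggers
        logger.handlers.clear()    # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(path, mode="w")
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
    return debug_log, results_log
```

Per-record details would go through `debug_log.debug(...)` and stay local, while only aggregate numbers needed for the paper go through `results_log.info(...)` into the file that gets shared.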

sophieball commented 3 years ago

No, just the most recent version on both datasets. I'm changing the logging, and I removed to_csv.

CaptainEmerson commented 3 years ago

I've uploaded the newest logs and pngs. I think that's all you need from me right now?

sophieball commented 3 years ago

Yes! Thanks

sophieball commented 2 years ago

@CaptainEmerson Can I merge this?

CaptainEmerson commented 2 years ago

yep, sg

sophieball / toxicity-detector

t-test on test data #106