marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
MIT License

Invariance test cannot be run using test.run_from_file() #30

Closed (farigys closed this issue 4 years ago)

farigys commented 4 years ago

Is there any way to run Invariance tests from predictions saved in files? When I try to use test.run_from_file(), I get the following error: AttributeError: 'INV' object has no attribute 'result_indexes'

marcotcr commented 4 years ago

Yes, Invariance tests should work with predictions saved in files. Are you using one of our release suites? If not, could you share the test / suite in question?

farigys commented 4 years ago

I am not using the release suites, as my task does not resemble any of the presented tasks. I am trying to create my own tests (as you've shown in tutorial 3). The first approach works (wrapping the prediction model in a PredictionWrapper), but the second one does not. I can share my code here (which is pretty much the exact code in the tutorial):

t = Perturb.perturb(dataset, Perturb.add_typos)
test = INV(**t)
test.run_from_file('/tmp/softmax_preds.txt', file_format='softmax', overwrite=True)

marcotcr commented 4 years ago

You did not call test.to_raw_file. I'm guessing your softmax_preds.txt file contains predictions for the original dataset, but not for the test cases (which include the examples with typos).
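
A minimal sketch of why the two files have to line up, written in plain Python rather than with CheckList itself. It assumes, as the reply above suggests, that the raw file holds one example per line (originals and perturbed versions alike) and that the prediction file needs one line per raw line. `flatten_test_cases` and `check_alignment` are hypothetical helpers, not library functions, and the sentences and probabilities are made-up illustration data.

```python
from typing import List


def flatten_test_cases(test_cases: List[List[str]]) -> List[str]:
    """Return every example of every test case, in order, one per output line."""
    return [example for case in test_cases for example in case]


def check_alignment(raw_lines: List[str], pred_lines: List[str]) -> None:
    """Raise if the prediction file does not have one line per raw example."""
    if len(raw_lines) != len(pred_lines):
        raise ValueError(
            f"{len(raw_lines)} examples but {len(pred_lines)} prediction lines"
        )


# Hypothetical test cases: each one is [original, typo-perturbed version].
test_cases = [
    ["the water slows the sound", "the watre slows the sound"],
    ["more water absorbs more sound", "more water absorbs mroe sound"],
]
raw_lines = flatten_test_cases(test_cases)

# Predictions computed on the original dataset only (two lines) ...
preds_for_originals_only = ["0.1 0.9", "0.8 0.2"]
# ... versus predictions computed on every line of the raw file (four lines).
preds_for_all_examples = ["0.1 0.9", "0.2 0.8", "0.8 0.2", "0.7 0.3"]

check_alignment(raw_lines, preds_for_all_examples)  # passes silently

try:
    check_alignment(raw_lines, preds_for_originals_only)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Running the model on the file written by test.to_raw_file, rather than on the original dataset, is what keeps the line counts equal.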

farigys commented 4 years ago

My softmax_preds.txt contains predictions for the test cases (examples + typo-perturbed data). I did not save the test cases in /tmp/raw_file.txt. Maybe that's the problem?

farigys commented 4 years ago

I am saving the raw text using test.to_raw_file, and my softmax_preds.txt contains the class probabilities separated by spaces. I am getting this error:

  File "run_checklist_from_saved_prediction.py", line 116, in <module>
    main()
  File "run_checklist_from_saved_prediction.py", line 112, in main
    test.run_from_file('/tmp/softmax_preds.txt', file_format='softmax', overwrite=True)
  File "/Users/fsadeque/Desktop/checklist-master/checklist/abstract_test.py", line 324, in run_from_file
    self.run_from_preds_confs(preds, confs, overwrite=overwrite)
  File "/Users/fsadeque/Desktop/checklist-master/checklist/abstract_test.py", line 293, in run_from_preds_confs
    self.update_expect()
  File "/Users/fsadeque/Desktop/checklist-master/checklist/abstract_test.py", line 129, in update_expect
    self.results.expect_results = self.expect(self)
  File "/Users/fsadeque/Desktop/checklist-master/checklist/expect.py", line 78, in expect
    return [fn(x, pred, confs, labels, meta) for x, pred, confs, labels, meta in zipped]
  File "/Users/fsadeque/Desktop/checklist-master/checklist/expect.py", line 78, in <listcomp>
    return [fn(x, pred, confs, labels, meta) for x, pred, confs, labels, meta in zipped]
  File "/Users/fsadeque/Desktop/checklist-master/checklist/expect.py", line 120, in expect_fn
    orig_pred = preds[0]
TypeError: 'int' object is not subscriptable

Edit: this is what my softmax_preds.txt looks like:

0.026281951 0.3031675 0.64833623 0.01582296 0.006391341
0.022482133 0.20569728 0.74865955 0.017401822 0.0057589416
0.011106909 0.058008775 0.46179605 0.4471355 0.021952866
0.01102794 0.037073947 0.41210234 0.5221132 0.017682495
0.5810922 0.1305551 0.12312923 0.07375503 0.09146844
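
For reference, a sketch of how one line in this format can be turned into a predicted label plus its confidences. This is plain Python, not CheckList's own reader, and it assumes the predicted label is the index of the largest probability. `parse_softmax_line` is a hypothetical helper.

```python
from typing import List, Tuple


def parse_softmax_line(line: str) -> Tuple[int, List[float]]:
    """Split a whitespace-separated probability row and return (argmax, probs)."""
    probs = [float(token) for token in line.split()]
    pred = max(range(len(probs)), key=probs.__getitem__)
    return pred, probs


# The five rows shown above.
lines = [
    "0.026281951 0.3031675 0.64833623 0.01582296 0.006391341",
    "0.022482133 0.20569728 0.74865955 0.017401822 0.0057589416",
    "0.011106909 0.058008775 0.46179605 0.4471355 0.021952866",
    "0.01102794 0.037073947 0.41210234 0.5221132 0.017682495",
    "0.5810922 0.1305551 0.12312923 0.07375503 0.09146844",
]

parsed = [parse_softmax_line(line) for line in lines]
preds = [pred for pred, _ in parsed]
print(preds)  # [2, 2, 2, 3, 0]
```

Each row yields a single integer label, which is why the file itself looks well formed even though the test still fails.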

marcotcr commented 4 years ago

Ugh, would it be possible at all for you to share the first few examples from your dataset? i.e. print(dataset[:2])

farigys commented 4 years ago

Sure. This is the output of t.data[:3]:

['they will be different because the water will slow the sound waves down of the pitch making it queiter', 'they will be different ebcause the water will slow the sound waves down of the pitch making it queiter', "The more water there is the more sound will be absorbed by the water's density."]

farigys commented 4 years ago

I am really sorry! Solved it: the typos should come in pairs in t, and I wasn't saving them like that. I am closing this issue. Thanks a lot for your help!
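
To illustrate the fix: an invariance check compares the prediction for each perturbed example with the prediction for the original in the same test case, so each test case has to be a list such as [original, perturbed]. If the data is a flat list of strings, each test case carries a single prediction (an int), and indexing it fails as in the traceback above. The sketch below is plain Python with hypothetical helpers (`group_predictions`, `is_invariant`); it is not the library's implementation, and the second test case and the labels are made-up illustration data.

```python
from typing import List, Sequence


def group_predictions(
    test_cases: Sequence[Sequence[str]], flat_preds: Sequence[int]
) -> List[List[int]]:
    """Regroup one-prediction-per-line output to mirror the test case layout."""
    grouped, start = [], 0
    for case in test_cases:
        grouped.append(list(flat_preds[start:start + len(case)]))
        start += len(case)
    if start != len(flat_preds):
        raise ValueError("number of predictions does not match number of examples")
    return grouped


def is_invariant(case_preds: Sequence[int]) -> bool:
    """True if every perturbed example keeps the original's predicted label."""
    orig_pred = case_preds[0]
    return all(pred == orig_pred for pred in case_preds[1:])


# Paired layout: each test case is [original, typo-perturbed version].
paired_cases = [
    [
        "they will be different because the water will slow the sound waves down",
        "they will be different ebcause the water will slow the sound waves down",
    ],
    ["the pitch gets lower", "the pitch gest lower"],  # hypothetical second case
]
flat_preds = [2, 2, 2, 3]  # hypothetical labels, one per example

grouped = group_predictions(paired_cases, flat_preds)
results = [is_invariant(case_preds) for case_preds in grouped]

# Flat layout: a test case is a single string, so its prediction is a bare int.
try:
    is_invariant(2)
    flat_layout_fails = False
except TypeError:  # 'int' object is not subscriptable
    flat_layout_fails = True
```

Here `grouped` is `[[2, 2], [2, 3]]`, so the first test case passes and the second fails because the typo changed the predicted label.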