marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
MIT License

Change the data in tests_n500 #47

Closed MartaMarchiori closed 4 years ago

MartaMarchiori commented 4 years ago

Hi! First of all, thanks for your work, it is very inspiring.

I would like to run the tests on another test set, semantically different from airline-related tweets (in particular, I would like to use data from this competition, https://amievalita2018.wordpress.com/, which collects misogynistic tweets, in order to explore the fairness of the models).

To do this, do I just have to replace the tests_n500 file and put a file with the predictions in the usual format (i.e., the 0/1/2 label and the three probabilities) into the "predictions/" folder?
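For reference, the per-line format described above (a 0/1/2 label followed by three probabilities) can be written out like this; the file name and values below are made up for illustration:

```python
# Illustrative only: predicted label followed by the three class
# probabilities, one example per line, whitespace-separated.
preds = [(0, [0.7, 0.2, 0.1]),
         (2, [0.1, 0.2, 0.7])]

with open('predictions_sketch.txt', 'w') as f:
    for label, probs in preds:
        f.write('%d %s\n' % (label, ' '.join('%g' % p for p in probs)))
```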

Excuse the beginner's question :) Thanks a lot!

MartaMarchiori commented 4 years ago

I'm sorry to bother you again, but I haven't been able to solve it. It seems the suite cannot properly access the "new" tests_n500 file, and I haven't found where to specify that this new evaluation dataset should be used, or what I need to change where. Moreover, I didn't find where to specify the gold labels for the new customized tests_n500.

Thanks again


IndexError                                Traceback (most recent call last)
in
      1 pred_path = '/Users/Marta/opt/anaconda3/lib/python3.6/site-packages/checklist/release_data/sentiment/predictions/TextBlobPred_AMI2019_en_training.txt'
----> 2 suite.run_from_file(pred_path, overwrite=True)
      3 suite.visual_summary_table()

~/opt/anaconda3/lib/python3.6/site-packages/checklist/test_suite.py in run_from_file(self, path, file_format, format_fn, ignore_header, overwrite)
    248             format_fn=format_fn,
    249             ignore_header=ignore_header)
--> 250         self.run_from_preds_confs(preds, confs, overwrite=overwrite)
    251
    252     def run(self, predict_and_confidence_fn, verbose=True, **kwargs):

~/opt/anaconda3/lib/python3.6/site-packages/checklist/test_suite.py in run_from_preds_confs(self, preds, confs, overwrite)
    220         p = preds[slice(self.test_ranges[n])]
    221         c = confs[slice(self.test_ranges[n])]
--> 222         t.run_from_preds_confs(p, c, overwrite=overwrite)
    223
    224     def run_from_file(self, path, file_format=None, format_fn=None, ignore_header=False, overwrite=False):

~/opt/anaconda3/lib/python3.6/site-packages/checklist/abstract_test.py in run_from_preds_confs(self, preds, confs, overwrite)
    309         self._check_create_results(overwrite)
    310         self.update_results_from_preds(preds, confs)
--> 311         self.update_expect()
    312
    313     def run_from_file(self, path, file_format=None, format_fn=None, ignore_header=False, overwrite=False):

~/opt/anaconda3/lib/python3.6/site-packages/checklist/abstract_test.py in update_expect(self)
    128     def update_expect(self):
    129         self._check_results()
--> 130         self.results.expect_results = self.expect(self)
    131         self.results.passed = Expect.aggregate(self.results.expect_results, self.agg_fn)
    132

/home/marcotcr/work/checklist/checklist/expect.py in expect(self)

/home/marcotcr/work/checklist/checklist/expect.py in (.0)

/home/marcotcr/work/checklist/checklist/expect.py in expect_fn(xs, preds, confs, labels, meta)

IndexError: index 0 is out of bounds for axis 0 with size 0
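For what it's worth, the traceback suggests the suite slices the loaded prediction array with the (start, end) ranges recorded for its own examples, so a prediction file produced for different data can leave a test with an empty slice. A small sketch of that failure mode (the numbers are made up; this is not checklist's actual internals):

```python
import numpy as np

# Hypothetical numbers: only 3 predictions loaded from file, but the
# suite recorded the range (10, 15) for one of its tests.
preds = np.array([0, 1, 2])
test_range = (10, 15)

p = preds[slice(*test_range)]   # empty: the range lies outside the array
print(p.size)                   # 0

try:
    p[0]                        # what an expectation function does per example
except IndexError as err:
    print(err)                  # index 0 is out of bounds for axis 0 with size 0
```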

marcotcr commented 4 years ago

The tests_n500 file contains the examples for the test suite presented in the paper. If you want to test a model with different tests, you have to create your own test or suite, and then you can export it as a file if you want to. Take a look at this tutorial.