marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
MIT License

JSON serialization error when loading examples in visual summary #31

Closed: gossminn closed this issue 4 years ago

gossminn commented 4 years ago

When trying to use the visual summary functionality on a TestSuite, I ran into an issue with loading examples: I get the error message ValueError: Can't clean for JSON: array([1.]). I get this with both suite.visual_summary_table() and suite.visual_summary_by_test().

However, when I try suite.summary() it works fine and I get something like this:

NER test
Test cases:      100
Fails (rate):    4 (4.0%)

Example fails:
0.0 0.0 1.0 Ian Young cooked the burgers in some broth.
----
0.0 0.0 1.0 George Rogers cooked the meats in some broth.
----
0.0 0.0 1.0 Paul Brown cooked the chicken al dente.
----

where the three numbers before every sample are the probability scores (in the case of my model, these are always 1.0 or 0.0).

Is this expected behavior (am I doing something wrong?) or is it a bug?

See traceback from the visualization widget below -- note that the error is raised not when initially loading the widget but only once example fails are being loaded.

ValueError                                Traceback (most recent call last)
~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/checklist/viewer/suite_summarizer.py in handle_events(self, _, content, buffers)
     46         elif content.get('event', '') == 'switch_test':
     47             testname = content.get("testname", "")
---> 48             self.on_select_test(testname)
     49 
     50     def on_select_test(self, testname: str) -> None:

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/checklist/viewer/suite_summarizer.py in on_select_test(self, testname)
     54             summary, testcases = self.select_test_fn(testname)
     55         self.reset_summary(summary)
---> 56         self.reset_testcases(testcases)

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/checklist/viewer/test_summarizer.py in reset_testcases(self, testcases)
     46         self.filtered_testcases = testcases if testcases else []
     47         self.tokenize_testcases()
---> 48         self.search(filter_tags=[], is_fail_case=True)
     49 
     50     def handle_events(self, _, content, buffers):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/checklist/viewer/test_summarizer.py in search(self, filter_tags, is_fail_case)
    118         self.compute_stats_result(candidate_testcases_not_fail)
    119         self.to_slice_idx = 0
--> 120         self.fetch_example()
    121 
    122     def fetch_example(self):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/checklist/viewer/test_summarizer.py in fetch_example(self)
    126             new_examples = self.candidate_testcases[self.to_slice_idx : self.to_slice_idx+self.max_return]
    127             self.to_slice_idx += len(new_examples)
--> 128             self.testcases = [e for e in new_examples]

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/traitlets/traitlets.py in __set__(self, obj, value)
    583             raise TraitError('The "%s" trait is read-only.' % self.name)
    584         else:
--> 585             self.set(obj, value)
    586 
    587     def _validate(self, obj, value):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/traitlets/traitlets.py in set(self, obj, value)
    572             # we explicitly compare silent to True just in case the equality
    573             # comparison above returns something other than True/False
--> 574             obj._notify_trait(self.name, old_value, new_value)
    575 
    576     def __set__(self, obj, value):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/traitlets/traitlets.py in _notify_trait(self, name, old_value, new_value)
   1137             new=new_value,
   1138             owner=self,
-> 1139             type='change',
   1140         ))
   1141 

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipywidgets/widgets/widget.py in notify_change(self, change)
    603             if name in self.keys and self._should_send_property(name, getattr(self, name)):
    604                 # Send new state to front-end
--> 605                 self.send_state(key=name)
    606         super(Widget, self).notify_change(change)
    607 

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipywidgets/widgets/widget.py in send_state(self, key)
    487             state, buffer_paths, buffers = _remove_buffers(state)
    488             msg = {'method': 'update', 'state': state, 'buffer_paths': buffer_paths}
--> 489             self._send(msg, buffers=buffers)
    490 
    491 

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipywidgets/widgets/widget.py in _send(self, msg, buffers)
    735         """Sends a message to the model in the front-end."""
    736         if self.comm is not None and self.comm.kernel is not None:
--> 737             self.comm.send(data=msg, buffers=buffers)
    738 
    739     def _repr_keys(self):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/comm/comm.py in send(self, data, metadata, buffers)
    121         """Send a message to the frontend-side version of this comm"""
    122         self._publish_msg('comm_msg',
--> 123             data=data, metadata=metadata, buffers=buffers,
    124         )
    125 

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/comm/comm.py in _publish_msg(self, msg_type, data, metadata, buffers, **keys)
     63         data = {} if data is None else data
     64         metadata = {} if metadata is None else metadata
---> 65         content = json_clean(dict(data=data, comm_id=self.comm_id, **keys))
     66         self.kernel.session.send(self.kernel.iopub_socket, msg_type,
     67             content,

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in json_clean(obj)
    189         out = {}
    190         for k,v in iteritems(obj):
--> 191             out[unicode_type(k)] = json_clean(v)
    192         return out
    193     if isinstance(obj, datetime):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in json_clean(obj)
    189         out = {}
    190         for k,v in iteritems(obj):
--> 191             out[unicode_type(k)] = json_clean(v)
    192         return out
    193     if isinstance(obj, datetime):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in json_clean(obj)
    189         out = {}
    190         for k,v in iteritems(obj):
--> 191             out[unicode_type(k)] = json_clean(v)
    192         return out
    193     if isinstance(obj, datetime):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in json_clean(obj)
    175 
    176     if isinstance(obj, list):
--> 177         return [json_clean(x) for x in obj]
    178 
    179     if isinstance(obj, dict):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in <listcomp>(.0)
    175 
    176     if isinstance(obj, list):
--> 177         return [json_clean(x) for x in obj]
    178 
    179     if isinstance(obj, dict):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in json_clean(obj)
    189         out = {}
    190         for k,v in iteritems(obj):
--> 191             out[unicode_type(k)] = json_clean(v)
    192         return out
    193     if isinstance(obj, datetime):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in json_clean(obj)
    175 
    176     if isinstance(obj, list):
--> 177         return [json_clean(x) for x in obj]
    178 
    179     if isinstance(obj, dict):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in <listcomp>(.0)
    175 
    176     if isinstance(obj, list):
--> 177         return [json_clean(x) for x in obj]
    178 
    179     if isinstance(obj, dict):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in json_clean(obj)
    189         out = {}
    190         for k,v in iteritems(obj):
--> 191             out[unicode_type(k)] = json_clean(v)
    192         return out
    193     if isinstance(obj, datetime):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in json_clean(obj)
    189         out = {}
    190         for k,v in iteritems(obj):
--> 191             out[unicode_type(k)] = json_clean(v)
    192         return out
    193     if isinstance(obj, datetime):

~/anaconda3/envs/frameid-checks/lib/python3.7/site-packages/ipykernel/jsonutil.py in json_clean(obj)
    195 
    196     # we don't understand it, it's probably an unserializable object
--> 197     raise ValueError("Can't clean for JSON: %r" % obj)

ValueError: Can't clean for JSON: array([1.])

marcotcr commented 4 years ago

What is the output of your model for a single example (both confidence and prediction)? Is it an np.array of floats?

gossminn commented 4 years ago

print(list(zip(preds, confs))[0]) gives array([1]), array([0., 1., 0.]). So each pred is an array of np.int64, and each conf is an array of floats.

marcotcr commented 4 years ago

It seems from your snippet that each of your predictions is an array rather than an integer, i.e., you have something like [array([1]), array([0]), array([2]), ...]. visual_summary_table fails when it tries to convert these arrays to JSON.
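The failure mode can be illustrated outside the widget. This is only a rough analogy: the traceback goes through ipykernel's json_clean, whereas the sketch below uses the standard-library json module as a stand-in, and the variable names are made up for illustration. It shows that a list of NumPy arrays is rejected by a JSON encoder, while plain Python integers are not.

```python
import json

import numpy as np

# Hypothetical predictions shaped like the ones in this thread:
# one length-1 array per example instead of one integer per example.
preds_as_arrays = [np.array([1]), np.array([0]), np.array([2])]

try:
    json.dumps(preds_as_arrays)
    serializable = True
except TypeError:
    # The standard encoder does not know how to encode numpy.ndarray.
    serializable = False

# Plain Python integers serialize without trouble.
as_ints = [int(p[0]) for p in preds_as_arrays]
encoded = json.dumps(as_ints)

print(serializable, encoded)
```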

It should be easy to change your prediction function such that preds is an np.array of integers, rather than a list of np.arrays.
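A minimal sketch of such a change, assuming the model returns one length-1 array per prediction and one 1-D array of class probabilities per example, as in the snippet above. The names raw_preds, raw_confs, and flatten_predictions are hypothetical and not part of CheckList; only NumPy calls are used.

```python
import numpy as np


def flatten_predictions(raw_preds, raw_confs):
    """Return a 1-D integer array of labels and a 2-D float array of confidences."""
    # Each prediction is assumed to be a length-1 array such as array([1]);
    # take its single element so the result is one integer per example.
    preds = np.array([int(np.asarray(p).reshape(-1)[0]) for p in raw_preds])
    # Each confidence entry is assumed to be a 1-D array of class probabilities.
    confs = np.array([np.asarray(c, dtype=float).reshape(-1) for c in raw_confs])
    return preds, confs


# Example input shaped like the output reported in this thread.
raw_preds = [np.array([1]), np.array([0]), np.array([2])]
raw_confs = [
    np.array([0.0, 1.0, 0.0]),
    np.array([1.0, 0.0, 0.0]),
    np.array([0.0, 0.0, 1.0]),
]

preds, confs = flatten_predictions(raw_preds, raw_confs)
print(preds.shape, confs.shape)
```

With this shape, preds is a single array of integers rather than a list of arrays, which is the form suggested above.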

gossminn commented 4 years ago

Thanks for the pointer, this was indeed the case!