tableau / TabPy

Execute Python code on the fly and display results in Tableau visualizations:
https://tableau.github.io/TabPy/
MIT License

ResponseError: (500) Error querying GLS {'uri': 'endpointname', 'error': "AttributeError : 'TypeError' object has no attribute 'message'", 'type': 'QueryFailed'} #457

Closed Sahuism closed 4 years ago

Sahuism commented 4 years ago

Environment information:

Describe the issue: client.query() fails, even though the deployed function returns the correct predicted value when called directly with the same array.

Following error appears on CMD prompt: TypeError: Object of type 'ndarray' is not JSON serializable

on Jupyter Notebook: ResponseError Traceback (most recent call last)

      1 #can run queries to TabPy outside Tableau as well.
----> 2 connection.query('Hackathon',1,100,3,2,14,24,0,456,0,0,0,5,'XXXX','XXXX','XXXX','XXXX','XXXX','XXXX','2015','XXXX','XXXX')

C:\Python\envs\myenv\lib\site-packages\tabpy_client\client.py in query(self, name, *args, **kwargs)
    184             uuid : a unique id for the request.
    185         """
--> 186         return self._service.query(name, *args, **kwargs)
    187
    188     #

C:\Python\envs\myenv\lib\site-packages\tabpy_client\rest_client.py in query(self, name, *args, **kwargs)
    151         return self.service_client.POST('query/'+name,
    152                                         data={'data':args or kwargs},
--> 153                                         timeout=self.query_timeout)
    154
    155

C:\Python\envs\myenv\lib\site-packages\tabpy_client\rest.py in POST(self, url, data, timeout)
    170     def POST(self, url, data=None, timeout=None):
    171         """Prepends self.endpoint to the url and issues a POST request."""
--> 172         return self.network_wrapper.POST(self.endpoint + url, data, timeout)
    173
    174     def PUT(self, url, data=None, timeout=None):

C:\Python\envs\myenv\lib\site-packages\tabpy_client\rest.py in POST(self, url, data, timeout)
    106                                         timeout=timeout)
    107         if response.status_code not in (200, 201):
--> 108             self.raise_error(response)
    109
    110         return response.json()

C:\Python\envs\myenv\lib\site-packages\tabpy_client\rest.py in raise_error(self, response)
     61                                   response.text)
     62
---> 63         raise ResponseError(response)
     64
     65     def _remove_nones(self, data):

ResponseError: (500) Error querying GLS {'uri': 'Hackathon', 'error': "AttributeError : 'TypeError' object has no attribute 'message'", 'type': 'QueryFailed'}

**Code Used**

def CandidateClassifier(_arg1, _arg2, _arg3, _arg4, _arg5, _arg6, _arg7, _arg8, _arg9, _arg10, _arg11, _arg12, _arg13, _arg14, _arg15, _arg16, _arg17, _arg18, _arg19, _arg20, _arg21):
    import numpy as np
    import pandas as pd
    from catboost import CatBoostClassifier
    data=np.column_stack([_arg1, _arg2, _arg3, _arg4, _arg5, _arg6, _arg7, _arg8, _arg9, _arg10, _arg11, _arg12, _arg13, _arg14, _arg15, _arg16, _arg17, _arg18, _arg19, _arg20, _arg21])
    PredictionModel=model_classifier.predict(data)
    return PredictionModel
    #return PredictionModel.tolist()

connection.deploy('Hackathon',
                  CandidateClassifier,
                  'Number of Candidates we can expect in the given settings',
                  #schema=schema,
                  override=True)

#run code locally to verify
Prediction=CandidateClassifier(1,100,3,2,14,24,0,456,0,0,0,5,'XXXX','XXXX','XXXX','XXXX','XXXX','XXXX','2015','XXXX','XXXX')
#[[3]]

#can run queries to TabPy outside Tableau as well.
connection.query('Hackathon',1,100,3,2,14,24,0,456,0,0,0,5,'XXXX','XXXX','XXXX','XXXX','XXXX','XXXX','2015','XXXX','XXXX')

**Expected behavior**

A response is received in JSON structure.

0golovatyi commented 4 years ago

@Sahuism the error message TypeError: Object of type 'ndarray' is not JSON serializable tells you that you are trying to return a value of a type that can't be serialized to JSON by default. You need to find out how to do the serialization. Is it pandas you are using?

Try something like https://www.bing.com/search?FORM=U510DF&PC=U510&q=python+TypeError%3A+Object+of+type+%27ndarray%27+is+not+JSON+serializable
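
The two messages in the report are probably related. Python's json module raises TypeError for any object it has no encoder for, and the AttributeError in the 500 response suggests that some error-handling code then read .message from that exception, an attribute Python 3 exceptions do not have. The sketch below reproduces both effects with the standard library only. NotEncodable is a hypothetical stand-in for the NumPy array, and describe_error is an illustrative helper, not TabPy's actual handler.

```python
import json


class NotEncodable:
    """Hypothetical stand-in for a value json cannot encode, such as a NumPy array."""


def describe_error(exc):
    """Format an exception without relying on the Python 2 only `.message` attribute."""
    return f"{type(exc).__name__} : {exc}"


try:
    json.dumps(NotEncodable())
except TypeError as exc:
    # Python 3 exceptions expose their text through str(exc) and exc.args.
    has_message_attr = hasattr(exc, "message")
    summary = describe_error(exc)

print(has_message_attr)
print(summary)
```

If this reading is right, the TypeError is the real problem and the AttributeError only hides it.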

Sahuism commented 4 years ago

Thanks so much for your reply, @0golovatyi. I tried to resolve this error with a similar search, and the blogs mostly suggested the .tolist() method. It seemed to work for some time, and I was able to get a result in Tableau using tabpy.query('endpointname',**args)['response'][0]. However, the next day, when I restarted everything, I got the error again.

Something seems terribly wrong in the code below, because with or without tolist(), client.query() returns nothing. As far as I can tell, JSON serialization is handled by the json package in the environment TabPy runs in, at C:\Python\envs\myenv\Lib\json\, which contains the encoder and decoder scripts, so calling json.dumps(list(arr)) myself didn't seem necessary. Still, I tested it separately with the code below to see what the returned data looks like:

data=np.column_stack([1,100,3,2,14,24,0,456,0,0,0,5,'XXXX','XXXX','XXXX','XXXX','XXXX','XXXX','2015','XXXX','XXXX'])
print(data)
check=model_classifier.predict(data).tolist()
json_str=json.dumps(check)
json_str

Below is what json_str returns for above array: '[[3]]'

I have used pandas and numpy in the deployed function code. Below is what the deployed function looks like:

def CandidateClassifier(_arg1, _arg2, _arg3, _arg4, _arg5, _arg6, _arg7, _arg8, _arg9, _arg10, _arg11, _arg12, _arg13, _arg14, _arg15, _arg16, _arg17, _arg18, _arg19, _arg20, _arg21):
    import numpy as np
    import pandas as pd
    from catboost import CatBoostClassifier

    data=np.column_stack([_arg1, _arg2, _arg3, _arg4, _arg5, _arg6, _arg7, _arg8, _arg9, _arg10, _arg11, _arg12, _arg13, _arg14, _arg15, _arg16, _arg17, _arg18, _arg19, _arg20, _arg21])

    PredictionModel=model_classifier.predict(data)

    return PredictionModel.tolist()
    #return PredictionModel

The function itself works with or without tolist() and returns a value, but with tolist() the if/else chain below no longer matches anything.

connection.deploy('endpointname',
                  CandidateClassifier,
                  'description',
                  #schema=schema,
                  override=True)

Prediction=CandidateClassifier(1,100,3,2,14,24,0,456,0,0,0,5,'XXXX','XXXX','XXXX','XXXX','XXXX','XXXX','2015','XXXX','XXXX')
if Prediction == 0:
    print('Twenty and below')
elif Prediction == 1:
    print('Below fifty and above twenty')
elif Prediction == 2:
    print('Below Ninty and above fifty')
elif Prediction == 3:
    print('Ninty and above')

Below is the result using "return PredictionModel" in the function definition:

Ninty and above
array([[3]], dtype=int64)

Below is the result using "return PredictionModel.tolist()" in the function definition:

[[3]]

The documentation examples use several methods to pass data, and each worked in its own setting when I tried it. In my case np.column_stack worked once but no longer does. I am having a hard time figuring out the best way to pass data to model.predict() so that I avoid the JSON serialization error.

Querying the endpoint locally still returns no result:

connection.query('endpointname',1,100,3,2,14,24,0,456,0,0,0,5,'XXXX','XXXX','XXXX','XXXX','XXXX','XXXX','2015','XXXX','XXXX')

Thanks in advance! :)

0golovatyi commented 4 years ago

@Sahuism It is not connection.query but tabpy.query as documentation says at https://github.com/tableau/TabPy/blob/master/docs/TableauConfiguration.md#using-deployed-functions

0golovatyi commented 4 years ago

I would guess if/else doesn't work anymore because .tolist() returns a list and your code checks for scalar values.

Sahuism commented 4 years ago

I am sorry.

> @Sahuism It is not connection.query but tabpy.query as documentation says at https://github.com/tableau/TabPy/blob/master/docs/TableauConfiguration.md#using-deployed-functions

Thanks, I just followed an example where I named the TabPy client 'connection'. In Tableau, though, I have only used tabpy.query.

Sahuism commented 4 years ago

> I would guess if/else doesn't work anymore because .tolist() returns a list and your code checks for scalar values.

You're absolutely correct, that's exactly what I am getting. How do I fix the querying issue? The last time it worked, it worked everywhere. Please help me identify what I am missing.

Appreciate all your quick replies. Thank you so much. :)

0golovatyi commented 4 years ago

@Sahuism Function deployment and querying are completely disconnected. There is no connection object at query time; the query goes to TabPy, where the function is deployed, as the documentation explains.

For your list versus scalar question, I would guess you can use [index] syntax to access a scalar value in a list.
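
As a rough illustration of the indexing suggestion above: a nested list never compares equal to an integer, so the if/elif chain needs the scalar pulled out first. The labels are copied from the code earlier in this thread; label_for is a hypothetical helper name.

```python
# Class labels as printed by the if/elif chain earlier in the thread.
LABELS = {
    0: "Twenty and below",
    1: "Below fifty and above twenty",
    2: "Below Ninty and above fifty",
    3: "Ninty and above",
}


def label_for(prediction):
    """Return the label for a prediction shaped like [[3]] (one row, one column)."""
    scalar = prediction[0][0]  # first row, first column
    return LABELS[scalar]


prediction = [[3]]             # what .tolist() returned in the thread
print(prediction == 3)         # a list is never equal to an int
print(label_for(prediction))
```

A dictionary lookup replaces the chain here only to keep the sketch short; the same prediction[0][0] value would work in the original if/elif comparisons.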

nmannheimer commented 4 years ago

So Tableau expects just a list back to correctly parse the return. The original JSON serialization error was coming from a numpy array being returned instead of a list, which is not parsed correctly and is what the tolist() method fixed.
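
A minimal sketch of that fix, assuming the model returns an object with a .tolist() method, as NumPy arrays do. to_jsonable and StandInArray are hypothetical names; the stand-in exists only so the example runs without NumPy installed.

```python
import json


def to_jsonable(value):
    """Convert array-like results to plain Python lists; leave other values untouched."""
    if hasattr(value, "tolist"):
        return value.tolist()
    return value


class StandInArray:
    """Hypothetical stand-in mimicking the .tolist() method of a NumPy array."""

    def __init__(self, rows):
        self._rows = rows

    def tolist(self):
        return [list(row) for row in self._rows]


raw = StandInArray([[3]])
payload = json.dumps(to_jsonable(raw))   # nested lists of ints are encodable
print(payload)
```

In the deployed function, the equivalent step is returning PredictionModel.tolist() rather than PredictionModel.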

It looks like in some of your examples, though, you're getting back [[3]], which is a list containing a list containing the integer 3. You'll need to write your code so the final result that comes back to Tableau is just [3]. Tableau will automatically match members of the list to values in the visualization, so you should never need to do tabpy.query(...)['response'][0] in Tableau.

The code in your last Jupyter screenshot looks like it should work fine.
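
One way to turn [[3]] into [3], sketched with the standard library only. It assumes the prediction holds one inner list per row; flatten_rows is a hypothetical helper, not part of TabPy. With a real NumPy array, calling .ravel().tolist() on the prediction should give the same flat list.

```python
def flatten_rows(nested):
    """Flatten [[a], [b], ...] into [a, b, ...], one value per input row."""
    flat = []
    for row in nested:
        if isinstance(row, (list, tuple)):
            flat.extend(row)
        else:
            flat.append(row)   # already a scalar
    return flat


print(flatten_rows([[3]]))
print(flatten_rows([[0], [2], [3]]))
```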

Sahuism commented 4 years ago

Thank you so much for your input, @0golovatyi and @nmannheimer.

It worked anyway with the same settings, but I am still not sure why exactly it failed before. Perhaps I'll investigate and let you know later.