Closed Sahuism closed 4 years ago
@Sahuism this error message, TypeError: Object of type 'ndarray' is not JSON serializable, tells you that you are trying to return a value of a type which can't be serialized to JSON by default. You need to find out how to do the serialization. Is it pandas you are using?
Try something like https://www.bing.com/search?FORM=U510DF&PC=U510&q=python+TypeError%3A+Object+of+type+%27ndarray%27+is+not+JSON+serializable
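As a standalone illustration of the error (not TabPy's own code), the sketch below reproduces the failure and one common workaround using only the standard library. `FakeArray` and `to_jsonable` are hypothetical names introduced here; `FakeArray` stands in for a numpy array, which also exposes a `tolist()` method. Inside a TabPy deployed function you do not control the `json.dumps` call, so the function itself would need to return the converted list.

```python
import json


class FakeArray:
    """Hypothetical stand-in for numpy.ndarray so this runs without numpy."""

    def __init__(self, rows):
        self._rows = rows

    def tolist(self):
        # numpy arrays offer the same method name, returning plain Python lists
        return [list(row) for row in self._rows]


def to_jsonable(obj):
    """Fallback for json.dumps: convert anything that exposes tolist()."""
    if hasattr(obj, "tolist"):
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


prediction = FakeArray([[3]])

# Without a fallback, json.dumps raises TypeError for the unknown type.
try:
    json.dumps(prediction)
    failed = False
except TypeError:
    failed = True

# With the fallback, the object is converted to nested lists first.
encoded = json.dumps(prediction, default=to_jsonable)
print(failed, encoded)
```

Calling `tolist()` before returning, as the blogs suggest, has the same effect as the `default` hook here.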
Thanks so much for your reply @0golovatyi. I tried to resolve this error with a similar search, and the .tolist() method was what most blogs suggested. Initially it seemed to work for some time, and I was able to get a result in Tableau using tabpy.query('endpointname',**args)['response'][0]. However, the next day when I restarted everything I ended up getting the error again.
Something seems terribly wrong in the code below, because with or without tolist(), client.query() is not returning anything. JSON serialization is handled by the tabpy Python scripts themselves at C:\Python\envs\myenv\Lib\json\, which contains the json encoder and decoder scripts, so using json.dumps(list(arr)) didn't seem necessary. But I tested it separately to see what the returned data looks like, using the code below:
data=np.column_stack([1,100,3,2,14,24,0,456,0,0,0,5,'XXXX','XXXX','XXXX','XXXX','XXXX','XXXX','2015','XXXX','XXXX'])
print(data)
check=model_classifier.predict(data).tolist()
json_str=json.dumps(check)
json_str
Below is what json_str returns for above array: '[[3]]'
I have used pandas and numpy in the deployed function code. Below is how the deployed function looks:
import numpy as np
import pandas as pd
from catboost import CatBoostClassifier
data=np.column_stack([_arg1, _arg2, _arg3, _arg4, _arg5, _arg6, _arg7, _arg8, _arg9, _arg10, _arg11, _arg12, _arg13, _arg14, _arg15, _arg16, _arg17, _arg18, _arg19, _arg20, _arg21])
PredictionModel=model_classifier.predict(data)
return PredictionModel.tolist() #return PredictionModel
However, the function works fine with or without tolist() and is able to return a value, but with tolist() the if/else doesn't seem to work.
connection.deploy('endpointname',CandidateClassifier,'description',
                  #schema=schema,
                  override=True)
Prediction=CandidateClassifier(1,100,3,2,14,24,0,456,0,0,0,5,'XXXX','XXXX','XXXX','XXXX','XXXX','XXXX','2015','XXXX','XXXX')
if Prediction == 0:
    print('Twenty and below')
elif Prediction == 1:
    print('Below fifty and above twenty')
elif Prediction == 2:
    print('Below Ninty and above fifty')
elif Prediction == 3:
    print('Ninty and above')
Below is the result using "return PredictionModel" in the function def:
Ninty and above
array([[3]], dtype=int64)
Below is the result using "return PredictionModel.tolist()" in the function def:
[[3]]
In the documentation examples, several methods were used to pass data, and all seemed to work in their own settings when I tried them. In my case np.column_stack worked once but not anymore. I am having a hard time figuring out the best way to pass data to model.predict() to avoid the JSON serialization error.
Querying the endpoint locally is not returning any result anyway:
connection.query('endpointname',1,100,3,2,14,24,0,456,0,0,0,5,'XXXX','XXXX','XXXX','XXXX','XXXX','XXXX','2015','XXXX','XXXX')
Thanks in advance! :)
@Sahuism It is not connection.query but tabpy.query, as the documentation says at https://github.com/tableau/TabPy/blob/master/docs/TableauConfiguration.md#using-deployed-functions
I would guess if/else doesn't work anymore because .tolist() returns a list and your code checks for scalar values.
I am sorry
@Sahuism It is not connection.query but tabpy.query, as the documentation says at https://github.com/tableau/TabPy/blob/master/docs/TableauConfiguration.md#using-deployed-functions
Thanks, I just followed an example where I named tabpy client as 'connection'. But in Tableau I have used tabpy.query only.
I would guess if/else doesn't work anymore because .tolist() returns a list and your code checks for scalar values.
You're absolutely correct, that's exactly what I am getting. How do I fix the querying issue? The last time it worked, it worked everywhere. Please help me identify what I can't see.
Appreciate all your quick replies. Thank you so much. :)
@Sahuism Function deployment and querying are completely disconnected. There is no connection at query time; it is tabpy where the function is deployed, as the documentation explains.
For your list versus scalar question, I would guess you can use [index] syntax to access a scalar value in a list.
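To make the list-versus-scalar point concrete, here is a minimal sketch of indexing into the nested list before the if/else comparison. `first_scalar`, `label_for`, and `LABELS` are hypothetical helpers introduced for illustration; the label strings are copied as posted in the thread, and the input is assumed to be a non-empty nested list such as [[3]].

```python
# Labels copied from the if/else chain posted earlier in the thread.
LABELS = {
    0: 'Twenty and below',
    1: 'Below fifty and above twenty',
    2: 'Below Ninty and above fifty',
    3: 'Ninty and above',
}


def first_scalar(value):
    """Drill into nested lists (e.g. [[3]]) until a non-list value is reached.

    Assumes every list on the way down is non-empty.
    """
    while isinstance(value, list):
        value = value[0]
    return value


def label_for(prediction):
    """Map a prediction such as [[3]], [3], or 3 to its label."""
    return LABELS[first_scalar(prediction)]


prediction = [[3]]                    # what .tolist() returned in the thread
matches_directly = (prediction == 3)  # a list never equals an int
print(matches_directly, label_for(prediction))
```

The direct comparison is why every branch of the original if/else was skipped once `.tolist()` was added.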
So Tableau expects just a list back to correctly parse the return. The original JSON serialization error was coming from a numpy array being returned instead of a list, which is not parsed correctly and is what the tolist() method fixed.
It looks like in some of your examples, though, you're getting back [[3]], which is a list containing a list containing the integer 3. You'll need to write your code so the final result that comes back to Tableau is just [3]. Tableau will automatically match members of the list to values in the visualization, so you should never need to do ['response'][0] in Tableau.
The code in your last Jupyter screenshot looks like it should work fine.
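One way to turn [[3]] into the flat [3] described above is to remove one level of nesting before returning. The helper below is a hypothetical sketch, not part of TabPy; it assumes the prediction has already been converted to plain Python lists with `.tolist()`.

```python
def flatten_once(nested):
    """Remove one level of list nesting: [[3]] -> [3], [[3], [1]] -> [3, 1]."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(item)   # unpack the inner list
        else:
            flat.append(item)   # already a scalar, keep as is
    return flat


single = flatten_once([[3]])
several = flatten_once([[3], [1], [0]])
print(single, several)
```

With a numpy array in hand, calling `.ravel().tolist()` on it before returning should give the same flat list in one step.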
Thank you so much for your inputs @0golovatyi and @nmannheimer .
It worked anyway with the same settings, but I am still not sure why exactly it failed to work before. Perhaps I'll investigate and inform you guys later.
Environment information:
Describe the issue: Unable to run client.query(), while the deployed function responds correctly with the predicted value using the same array.
The following error appears on the CMD prompt: TypeError: Object of type 'ndarray' is not JSON serializable
On Jupyter Notebook: ResponseError Traceback (most recent call last)