sassoftware / python-swat

The SAS Scripting Wrapper for Analytics Transfer (SWAT) package is the Python client to SAS Cloud Analytic Services (CAS). It allows users to execute CAS actions and process the results all from Python.

fillna() on CAS column not working #173

Open j-honnacker opened 7 months ago

j-honnacker commented 7 months ago

If I create a CAS table with missing values...

import pandas as pd
import numpy as np

df = pd.DataFrame(dict(amount=[35,40], tip=[3.5,np.nan]))

tbl = conn.upload_frame(df, casout=dict(name="test", caslib="casuser", replace=True))

...the .fillna() method does not replace the missing values:

tbl2 = tbl
tbl2['test'] = tbl2['tip'].fillna(0)


tbl3 = tbl
tbl3['test'] = tbl3['tip'].fillna(0, inplace=True)


Is there any workaround? In this case, I want to add amount with tip and save the result in total_amount:

tbl['total_amount'] = tbl['amount'] + tbl['tip'].fillna(0)

bkemper24 commented 7 months ago

As a workaround, would something like this work for what you are trying to do?

>>> tbl.computedVarsProgram="if tip=. then total_amount=amount; else total_amount=amount+tip;"
>>> tbl
CASTable('TEST', caslib='CASUSER', computedvarsprogram='if tip=. then total_amount=amount; else total_amount=amount+tip;')
>>> tbl.fetch()
CASResults([('Fetch', Selected Rows from Table TEST

   amount  tip  total_amount
0    35.0  3.5          38.5
1    40.0  NaN          40.0)])
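The same if/else pattern can be generated for other columns instead of being typed by hand. The sketch below is only an illustration: `fill_and_add_program` is a hypothetical helper, not part of SWAT, and it just builds the program string shown in the session above. It assumes the column names are valid SAS variable names and that the columns are numeric, so that `.` denotes a missing value.

```python
def fill_and_add_program(target, base, addend):
    """Build a computedVarsProgram string that adds `addend` to `base`,
    treating a missing `addend` as 0 (hypothetical helper, not a SWAT API)."""
    # '.' is the SAS missing numeric value, as used in the workaround above.
    return (
        f"if {addend}=. then {target}={base}; "
        f"else {target}={base}+{addend};"
    )


program = fill_and_add_program("total_amount", "amount", "tip")
print(program)

# With a live CAS connection, the string would be attached exactly as in the
# session above (not executed here, since it needs a CAS server):
# tbl.computedVarsProgram = program
# tbl.fetch()
```

This keeps the missing-value handling on the server side, as in the workaround, while the Python code only assembles the expression text.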
j-honnacker commented 7 months ago

Thank you for the suggestion! Unfortunately, I was preparing a demo intended to showcase how far Python syntax can be applied to profiling and cleaning a CAS table. As often happens, what seemed simple at first (handling missing values) turned out to be (currently) infeasible :)