selik / xport

Python reader and writer for SAS XPORT data transport files.
MIT License

Error while passing missing values in the column to be exported in .xpt file, with SAS DATE9. format #92

Open rekhaoak opened 2 years ago

rekhaoak commented 2 years ago

Thanks a ton for providing the xport package for Python users!

I am trying to export numeric dates from a pandas DataFrame to a SAS version 5 XPORT file using xport.v56. I run into a problem when the column to be exported contains missing values, and I would like to know whether there is a workaround.

Example column: ASTDT (analysis start date). I first converted it to numeric and then applied the DATE9. format (pseudocode for ASTDT is shown below).

########################################
for i in range(len(df)):
    n3_days = df.loc[i, 'ASTDT'] - sasdate
    df.loc[i, 'ASTDT'] = n3_days.days

df['ASTDT'] = df['ASTDT'].astype('int64')

for k, v in df.items():
    if v.name == 'ASTDT':
        v.format = 'DATE9.'
########################################
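One way to sidestep the integer cast is to keep the SAS date numbers as floats, so that a missing date can remain NaN. The following is a minimal sketch using only the standard library. `SAS_EPOCH` and `to_sas_date` are hypothetical names introduced for illustration, and the sketch assumes that `sasdate` above refers to the SAS epoch of 1 January 1960. Whether xport then writes the NaN as a SAS missing value is not confirmed here.

```python
import math
from datetime import date, datetime

# Assumption: SAS date values count days since 1 January 1960.
SAS_EPOCH = date(1960, 1, 1)


def to_sas_date(value):
    """Return days since the SAS epoch as a float, or NaN when missing."""
    # Treat None, empty strings, and float NaN as missing.
    if value is None or value == "":
        return math.nan
    if isinstance(value, float) and math.isnan(value):
        return math.nan
    # datetime is a subclass of date, so reduce it to its date part first.
    if isinstance(value, datetime):
        value = value.date()
    return float((value - SAS_EPOCH).days)


values = [date(1960, 1, 2), None, datetime(2021, 3, 15, 8, 30), ""]
sas_dates = [to_sas_date(v) for v in values]
print(sas_dates)
```

The design choice is to use a float column, because an `int64` column has no way to hold a missing value. In pandas, an expression such as `(df['ASTDT'] - pd.Timestamp('1960-01-01')).dt.days` should give a similar float result for a datetime column, though that has not been tested against this dataset.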

This works well when there are no missing values in the ASTDT column. However, when the column contains a missing value, it gives the error shown at the end. I tried using an empty string and different null values such as None and NaN in place of the missing date, but the error persists.

Brief background on the requirement: this is not a problem when I create SDTM datasets, because dates are always expected to be character values there. It becomes a problem with the ADaM standard, where dates are expected to be SAS numeric values with a SAS format applied (e.g. DATE9., MMDDYY.). Numeric date values exported to xpt files cannot be left as plain integers. So, when creating SAS v5 xpt files for CDISC ADaM datasets, we must pass dates as numeric SAS date values and apply a SAS date format.
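To illustrate what a numeric SAS date with the DATE9. format represents, here is a small sketch that renders a day count in the DDMONYYYY layout that DATE9. displays. `format_date9` is a hypothetical helper, not part of xport or SAS, and it assumes the usual SAS epoch of 1 January 1960.

```python
from datetime import date, timedelta

# Assumption: day 0 of SAS date values is 1 January 1960.
SAS_EPOCH = date(1960, 1, 1)
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def format_date9(sas_days):
    """Render a SAS date number as DDMONYYYY, mimicking the DATE9. display."""
    d = SAS_EPOCH + timedelta(days=int(sas_days))
    return f"{d.day:02d}{MONTHS[d.month - 1]}{d.year:04d}"


print(format_date9(0))    # 01JAN1960
print(format_date9(366))  # 01JAN1961, since 1960 is a leap year
```

The month names are spelled out in a list rather than taken from `strftime('%b')`, so the result does not depend on the machine's locale.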

#############################

Error

#############################
~\AppData\Roaming\Python\Python39\site-packages\xport\v56.py in ieee_to_ibm(ieee)
    893     if isinstance(ieee, xport.NaN):
    894         return bytes(ieee)
--> 895     if math.isnan(ieee):
    896         return b'.' + b'\x00' * 7
    897     if math.isinf(ieee):

TypeError: must be real number, not str
#############################

A. When an empty string is used to represent a missing value, the column cannot be converted to int, so the code never reaches the DATE9. formatting step. Error: invalid literal for int() with base 10: ''

B. When None is used to represent a missing value, the conversion of None to int fails, so the code again never reaches the formatting step. Error: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

C. If I pass the null value as is, without converting the column to int, the xport package raises an error once the column is given the DATE9. SAS format. Error: TypeError: must be real number, not str
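The three messages above can be reproduced with the standard library alone, which suggests they come from Python's own `int()` and `math.isnan()` rather than from anything specific to the DATE9. format. This sketch only demonstrates the failing calls. `describe_failure` is a hypothetical helper name, and the exact wording of the `int(None)` message varies between Python versions.

```python
import math


def describe_failure(func, arg):
    """Call func(arg); return the exception type name, or None on success."""
    try:
        func(arg)
    except (TypeError, ValueError) as exc:
        print(f"{func.__name__}({arg!r}) -> {type(exc).__name__}: {exc}")
        return type(exc).__name__
    return None


describe_failure(int, "")         # case A: an empty string cannot be parsed
describe_failure(int, None)       # case B: None is not a number
describe_failure(math.isnan, "")  # case C: isnan rejects a str value
describe_failure(math.isnan, float("nan"))  # a float NaN is accepted
```

Only the last call succeeds. That matches the traceback, where `math.isnan` is the call that fails because it was handed a string.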

#####################

Any help with this error will be greatly appreciated.

Thank you and best regards,

Rekha (she /her)