bergen288 closed this issue 3 years ago
Well, that's the first time I've seen that. So it's a good news / bad news thing. Good first: I can reproduce that, and I'm seeing what you're seeing. Bad: that's a problem I'll have to fix. I believe this is a case where a special missing value is being formatted as an '_' instead of a '.' (an underscore, not a dot). SAS has ~28 missing values (all special floating point doubles that aren't real numbers). Unfortunately, that means I'm going to have to address this one way or another. I need to research this a little to find the best way to handle it (in SAS or Pandas) and provide a fix for you.
Sorry about that! Tom
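To make the missing-value point concrete, here is a minimal sketch of how the printed forms of SAS missing values could be mapped to NaN on the Python side. It assumes the printed forms are '.', '_', and the single letters 'A' through 'Z' (28 tokens in total, matching the proc print output later in this thread). The names `SAS_MISSING_TOKENS` and `parse_sas_numeric` are hypothetical and are not part of saspy.

```python
import math
import string

# Assumed printed forms of SAS numeric missing values:
# "." (ordinary missing), "_" (for ._), and "A".."Z" (for .A through .Z).
SAS_MISSING_TOKENS = frozenset({".", "_"} | set(string.ascii_uppercase))


def parse_sas_numeric(token: str) -> float:
    """Convert one printed numeric field to a float, mapping missing tokens to NaN."""
    text = token.strip()
    if text == "" or text in SAS_MISSING_TOKENS:
        return math.nan
    return float(text)


# A row from the ANOVA table where FValue and ProbF printed as "_".
row = ["423.000", "714847.921", "1689.948", "_", "_"]
print([parse_sas_numeric(field) for field in row])
```

This only illustrates the idea; it is not how saspy itself handles the conversion.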
@bergen288, I pushed a fix to main for this. Can you give it a go and verify it fixes your case?
I ran the following to prove this out on my end:
import saspy
sas = saspy.SASsession(results='text')
sas
sas.submitLST("""data a; format x2 datetime26.6;
x1=.; x2=.a; x3=.z; x4=._; c='A'; c2='Z'; c3='_' ; output;
x1=1; x2=99; x3=3; x4=4 ; c='' ; c2='' ; c3='' ; output;
x1=1; x2=.a; x3=3; x4=4 ; c=' '; c2='.'; c3=' '; output;
proc print;run;
""",
results='text', method='logandlist')
sd = sas.sasdata('a')
sd.to_df()
This gives the following results on all 3 access methods:
The SAS System Friday, July 16, 2021 09:06:00 PM 1
Obs x2 x1 x3 x4 c c2 c3
1 A . Z _ A Z _
2 01JAN1960:00:01:39.000000 1 3 4
3 A 1 3 4 .
>>>
>>> sd = sas.sasdata('a')
>>> sd.to_df()
x2 x1 x3 x4 c c2 c3
0 NaT NaN NaN NaN A Z _
1 1960-01-01 00:01:39 1.0 3.0 4.0 NaN NaN NaN
2 NaT 1.0 3.0 4.0 NaN . NaN
>>>
Also, here's your use case (though I'm doing it in line mode, so I changed up the way to get the df a little):
>>> stat = sas.sasstat()
>>> res = stat.reg(model='horsepower = Cylinders EngineSize',data=cars)
>>> dir(res)
['ANOVA', 'COOKSDPLOT', 'DFBETASPANEL', 'DFFITSPLOT', 'DIAGNOSTICSPANEL', 'FITSTATISTICS', 'LOG', 'NOBS', 'OBSERVEDBYPREDICTED', 'PARAMETERESTIMATES', 'QQPLOT', 'RESIDUALBOXPLOT', 'RESIDUALBYPREDICTED', 'RESIDUALHISTOGRAM', 'RESIDUALPLOT', 'RFPLOT', 'RSTUDENTBYLEVERAGE', 'RSTUDENTBYPREDICTED']
>>> anova = sas.sasdata('anova','_reg0001')
>>> anova.head()
The SAS System 17:18 Friday, July 16, 2021 3
Obs Source DF SS MS FValue ProbF
1 Model 2.000 1487803.732 743901.866 440.192 0.000
2 Error 423.000 714847.921 1689.948 _ _
3 Corrected Total 425.000 2202651.653 _ _ _
>>>
>>> res.anova
<IPython.core.display.HTML object>
>>> anova.to_df()
Source DF SS MS FValue ProbF
0 Model 2.0 1.487804e+06 743901.866012 440.192215 4.296828e-104
1 Error 423.0 7.148479e+05 1689.947803 NaN NaN
2 Corrected Total 425.0 2.202652e+06 NaN NaN NaN
>>>
I downloaded the newest saspy-main.zip file and reinstalled it. It looks like the fix is good. Below is my new output with the same Python code:
Using SAS Config named: winiomlinux
SAS Connection established. Subprocess id is 9880
SASPY Connection established: Access Method = IOM
SAS Config name = winiomlinux
SAS Config file = C:\Users\xzhang\sascfg_Windows.py
WORK Path = /work/SAS_workBBAB029001A8_jappsasapp02.onefiserv.net/SAS_work579E029001A8_jappsasapp02.onefiserv.net/
SAS Version = 9.04.01M6P11072018
SASPy Version = 3.7.2
Teach me SAS = False
Batch = False
Results = Pandas
SAS Session Encoding = latin1
Python Encoding value = latin1
SAS process Pid value = 42992040
SAS Stat/ANOVA Analysis Against Cars Data in SAS Help Library
Source DF SS MS FValue ProbF
0 Model 2.00 1487803.73 743901.87 440.19 0.00
1 Error 423.00 714847.92 1689.95 NaN NaN
2 Corrected Total 425.00 2202651.65 NaN NaN NaN
SAS Connection terminated. Subprocess id was 9880
Thank you very much for the quick fix, really appreciated.
Describe the bug
"saspy" is installed in Anaconda base environment on Windows Server 2016. SAS9.4M6 is on AIX7.2. saspyConnection is my Python class to connect to SAS9.4M6 on AIX. As you can see in the log at the bottom, the connection is successful. I am trying to use "cars" dataset in sashelp library to do ANOVA analysis. See code below:
Unfortunately, it failed with a type error. What's wrong with it?
Additional context
Using SAS Config named: winiomlinux
SAS Connection established. Subprocess id is 3124
SASPY Connection established: Access Method = IOM
SAS Config name = winiomlinux
SAS Config file = C:\Users\xzhang\sascfg_Windows.py
WORK Path = /work/SAS_work11E201990110_jappsasapp02.onefiserv.net/SAS_work4D1101990110_jappsasapp02.onefiserv.net/
SAS Version = 9.04.01M6P11072018
SASPy Version = 3.7.2
Teach me SAS = False
Batch = False
Results = Pandas
SAS Session Encoding = latin1
Python Encoding value = latin1
SAS process Pid value = 26804496
SAS Stat/ANOVA Analysis Against Cars Data in SAS Help Library
Traceback (most recent call last):
File "pandas\_libs\parsers.pyx", line 1050, in pandas._libs.parsers.TextReader._convert_tokens
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "e:\Python_Projects\examples\saspy_stat.py", line 9, in
print(stat_results.ANOVA)
File "C:\ProgramData\Anaconda3\lib\site-packages\saspy\sasresults.py", line 74, in __getattr__
data = self._go_run_code(attr)
File "C:\ProgramData\Anaconda3\lib\site-packages\saspy\sasresults.py", line 111, in _go_run_code
df = self.sas.sasdata2dataframe(attr, libref=lref)
File "C:\ProgramData\Anaconda3\lib\site-packages\saspy\sasbase.py", line 1685, in sasdata2dataframe
df = self._io.sasdata2dataframe(table, libref, dsopts, method=method, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\saspy\sasioiom.py", line 1691, in sasdata2dataframe
return self.sasdata2dataframeDISK(table, libref, dsopts, rowsep, colsep,
File "C:\ProgramData\Anaconda3\lib\site-packages\saspy\sasioiom.py", line 2087, in sasdata2dataframeDISK
df = pd.read_csv(sockout, index_col=idx_col, engine=eng, header=None, names=varlist,
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 610, in read_csv
return _read(filepath_or_buffer, kwds)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 468, in _read
return parser.read(nrows)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1057, in read
index, columns, col_dict = self._engine.read(nrows)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 2061, in read
data = self._reader.read(nrows)
File "pandas\_libs\parsers.pyx", line 756, in pandas._libs.parsers.TextReader.read
File "pandas\_libs\parsers.pyx", line 771, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas\_libs\parsers.pyx", line 850, in pandas._libs.parsers.TextReader._read_rows
File "pandas\_libs\parsers.pyx", line 982, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas\_libs\parsers.pyx", line 1056, in pandas._libs.parsers.TextReader._convert_tokens
ValueError: could not convert string to float: ''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "e:\Python_Projects\examples\saspy_stat.py", line 9, in
print(stat_results.ANOVA)
File "E:\Python_Projects\Settings\xz_settings.py", line 621, in __exit__
raise exc_type(excvalue)
ValueError: could not convert string to float: ''
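The traceback shows `pd.read_csv` failing while converting a column to float. The sketch below reproduces that class of failure outside saspy and shows one possible workaround using the `na_values` parameter. The CSV text is made up to resemble the ANOVA table, with '_' standing in for the special missing value. The helper `read_table` is hypothetical, and this is not necessarily how the fix in saspy was implemented.

```python
import io

import pandas as pd

# Made-up CSV text shaped like the ANOVA table in this thread; "_" stands in
# for the special missing value printed in the MS and FValue columns.
csv_text = (
    "Model,2,743901.866,440.192\n"
    "Error,423,1689.948,_\n"
    "Corrected Total,425,_,_\n"
)
names = ["Source", "DF", "MS", "FValue"]
numeric_dtypes = {"DF": "float64", "MS": "float64", "FValue": "float64"}


def read_table(**extra):
    """Parse the sample text, forcing the numeric columns to float64."""
    return pd.read_csv(io.StringIO(csv_text), header=None, names=names,
                       dtype=numeric_dtypes, **extra)


# Without extra NA markers, "_" cannot be converted to float.
try:
    read_table()
    strict_error = None
except (TypeError, ValueError) as err:
    strict_error = err

# Assumed printed forms of SAS missing values: ".", "_", and "A".."Z".
sas_missing = [".", "_"] + [chr(c) for c in range(ord("A"), ord("Z") + 1)]

# Apply the extra NA markers to the numeric columns only, so that a character
# value such as "A" or "_" in a text column is left alone.
df = read_table(na_values={col: sas_missing for col in numeric_dtypes})
print(df)
```

Restricting the extra markers to numeric columns matters because, as the proc print example in this thread shows, character columns can legitimately contain values like 'A', 'Z', or '_'.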