Closed biojerm closed 4 years ago
Wow, what a great idea. How do I not have support for doing that already!? (I don't). Well, watching hockey right now, so I'll implement support for both of those tomorrow! Thanks Jeremy, that'll be a nice addition!
Great! Thanks, enjoy the game!
Ok, so I've looked at this and have the following info and implementation to throw out and see what you think of it.
First, inencoding= and outencoding= are libname options, and they can be used today to make this work without saspy knowing anything about them. You can assign a libref and specify these encodings, and when you use that libref (in your SASdata object, or in libref= on sd2df and df2sd), SAS will use that encoding to transcode to/from the session encoding when reading and writing that data set. That already works today for all of these cases.
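A minimal sketch of the libref approach described above. It only builds the LIBNAME statement text; `build_libname` is a hypothetical helper (not part of saspy), and the libref and path are placeholders.

```python
def build_libname(libref, path, inencoding=None, outencoding=None):
    """Return a SAS LIBNAME statement, optionally with encoding options.

    Hypothetical helper for illustration only; saspy does not ship this.
    """
    parts = ["libname {} '{}'".format(libref, path)]
    if inencoding:
        parts.append('inencoding="{}"'.format(inencoding))
    if outencoding:
        parts.append('outencoding="{}"'.format(outencoding))
    return " ".join(parts) + ";"


stmt = build_libname("enc9", "/some/dir", inencoding="latin9", outencoding="latin9")
print(stmt)
# With an open session, the statement could then be run with: sas.submit(stmt)
```

Tables written through such a libref (for example with libref='enc9' on df2sd) would then be transcoded by SAS itself, assuming the statement ran without errors.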
Second, the data set option encoding= is used when reading or writing a data set when specified. And, like most options that are scoped hierarchically (session, libname, data set), the encoding= DS option overrides the Libname, and those each override the Session. So, I already have a dsopts dictionary associated with a SASdata object, and a dsopts= parm for the sd2df methods, which is applied to the data set when reading it to return as a data frame. Adding encoding= to the dsopts will make reading and writing this data set use the specified encoding. For df2sd, I would need to add an option for this output encoding, which I would use to write the data and then set in the SASdata dsopts returned by the method, so it would be correct already. I think on df2sd() I would call it outencoding=, even though I'll be setting the encoding= data set option with it, so it's not confusing as to what encoding it's referring to.
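The override order described above (data set over libname over session) can be sketched in a few lines. `effective_encoding` is a hypothetical illustration of that rule, not something SAS or saspy exposes.

```python
def effective_encoding(session, libname=None, dataset=None):
    """Pick the encoding that applies: data set first, then libname, then session."""
    for candidate in (dataset, libname):
        if candidate:
            return candidate
    return session


print(effective_encoding("utf-8"))
print(effective_encoding("utf-8", libname="latin1"))
print(effective_encoding("utf-8", libname="latin1", dataset="latin9"))
```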
I think this is clean in all cases, and allows you to apply this to specific data sets. Of course, I'll have to code it all up and test it all out to be sure all cases work as I expect. That will take longer than just today :) But, this all makes sense in my head, so I don't expect to find any problems with this.
Does this make sense and is it what you're looking for? Is simply assigning a libref with the in/out encodings, which already works, an acceptable solution instead of implementing this? If you prefer that, then I don't need to add the rest of this; just checking. But I think adding the encoding to the data set is a reasonable addition, so I'm ok adding it.
Thoughts? Tom
Hey Tom, This is great to know. I think adding the 'outencoding' option would be really nice. While I am messing around it is good to know I can change the encoding using the libref. But based on the datasets I am currently working with there is a pretty good chance I will need to switch between different encodings and having the option within df2sd would be quite convenient.
The implementation you describe above sounds good to me.
Cool, then I'll work on implementing this! To add the outencoding to df2sd will require adding (in)encoding else you'd not be able to access the table you just wrote. I'll post when I have this for you to try out.
ok, actually, no, SAS will see that the encoding is different on input and just transcode without having to be told. But, I still want the encoding in the dsopts so everything works right for other cases. For instance, add_vars() will recreate the data set, so it needs to be explicit about this.
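To illustrate why the encoding has to travel with the dsopts when a method rewrites the table, here is a rough sketch that builds such a DATA step as text. Both helpers are hypothetical and are not saspy's actual code generation; for simplicity every option value is quoted as a string.

```python
def render_dsopts(dsopts):
    """Render a dict of data set options as a SAS option list, e.g. (encoding="latin9")."""
    if not dsopts:
        return ""
    body = " ".join('{}="{}"'.format(k, v) for k, v in dsopts.items())
    return "(" + body + ")"


def add_var_step(libref, table, dsopts, name, expr):
    """Build a DATA step that rewrites a table in place and adds one variable."""
    # The same options go on both the output and the input reference,
    # so the rewritten table keeps the encoding it was created with.
    target = "{}.'{}'n {}".format(libref, table, render_dsopts(dsopts)).rstrip()
    return "data {t}; set {t}; {n} = {e}; run;".format(t=target, n=name, e=expr)


code = add_var_step("WORK", "cars9", {"encoding": "latin9"}, "tom", '"hi tom"')
print(code)
```

Without the option on the output side, the recreated table would presumably fall back to the libname or session encoding.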
So, I've actually already implemented this and pushed it to a new branch 'outencoding'. Obviously, I haven't fully tested, but I did run through a number of paths and all seemed to work as I expected. Looked at the saslog to verify also. I love it when my architecture allows me to implement something this pervasive so quickly! Having said that, I expect you'll find a problem first try, LOL.
Anyway, I'm gonna go get some lunch, so feel free to grab this code and try it out. FWIW, here's some of the code I tried. Feel free to explore more and let me know what you see; I'll obviously test this more before merging back into main.
import saspy
sas = saspy.SASsession(cfgname='iomj')
sas
cars = sas.sasdata('cars','sashelp')
df = cars.to_df()
df
cars9 = sas.df2sd(df, 'cars9', outencoding='latin9')
cars9
cars9.head()
cars9.to_df()
cars9.add_vars({'tom':'"hi tom"'})
cars9
cars9.to_df()
print(sas.saslog())
Let me know what you think! Tom
Amazing, I will clone the branch and test it out.
Hey Tom,
Sorry for the off topic question, but I am having issues installing the package into a virtual env
I am running:
pip install git+https://github.com/sassoftware/saspy.git@outencoding
>>> import saspy
>>> saspy.__version__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'saspy' has no attribute '__version__'
On my old version
>>> import saspy
>>> saspy.__version__
'3.2.0'
Any thoughts on what I am doing wrong? Packaging is always something I mess up.
Hey Jeremy, I'm not really sure offhand. That seems like the right way to grab a branch. One thing that has changed in our repos, and I don't know that it would have any effect on this, is that we've had to change our default branch names from 'master' to 'main', for social equality reasons, of which I'm sure everyone is aware. I don't know how or if that could be causing any issue with pulling a branch like you're doing.
Are you sure the environment you're running in is the one you installed in? Those environments trip people up all the time. I don't use them, myself. I just uninstalled and reinstalled, using cut-n-paste of your command above, and I'm seeing it run like I expect. Also, I always uninstall first, then install. That's always the cleanest.
tom64-5> pip uninstall saspy
Found existing installation: saspy 3.3.7
Uninstalling saspy-3.3.7:
Would remove:
/net/sanyo.unx.sas.com/vol/vol810/u81/sastpw/.local/lib/python3.5/site-packages/saspy.egg-link
Proceed (y/n)? y
Successfully uninstalled saspy-3.3.7
tom64-5> pwd
/opt/tom/github/saspy/saspy
tom64-5> cd ~
tom64-5> pip install git+https://github.com/sassoftware/saspy.git@outencoding
Defaulting to user installation because normal site-packages is not writeable
Collecting git+https://github.com/sassoftware/saspy.git@outencoding
Cloning https://github.com/sassoftware/saspy.git (to revision outencoding) to /tmp/pip-req-build-n_mh2idj
Running command git clone -q https://github.com/sassoftware/saspy.git /tmp/pip-req-build-n_mh2idj
Running command git checkout -b outencoding --track origin/outencoding
Switched to a new branch 'outencoding'
Branch outencoding set up to track remote branch outencoding from origin.
Building wheels for collected packages: saspy
Building wheel for saspy (setup.py) ... done
Created wheel for saspy: filename=saspy-3.5.0-py3-none-any.whl size=6410742 sha256=5d0d45b96f2312cbdae9549d309db7bcbd4f62ed88751f0040060bbb15c17eda
Stored in directory: /tmp/pip-ephem-wheel-cache-p5x4zs4g/wheels/25/9f/97/3225770836b041942a5227f211cd8a52c15ddbf624a541fabf
Successfully built saspy
Installing collected packages: saspy
Successfully installed saspy-3.5.0
WARNING: You are using pip version 20.1.1; however, version 20.2.3 is available.
You should consider upgrading via the '/usr/bin/python3.5 -m pip install --upgrade pip' command.
tom64-5> python3.5
Python 3.5.6 (default, Nov 16 2018, 15:50:39)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-23)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import saspy
>>> sas =saspy.SASsession(cfgname='iomj')
sas
cars = sas.sasdata('cars','sashelp')
df = cars.to_df()
cars9 = sas.df2sd(df, 'cars9', outencoding='latin9')
cars9
cars9.head()
cars9.add_vars({'tom':'"hi tom"'})
cars9
cars9.to_df()
SAS Connection established. Subprocess id is 28836
No encoding value provided. Will try to determine the correct encoding.
Setting encoding to utf_8 based upon the SAS session encoding value of utf-8.
>>> sas
Access Method = IOM
SAS Config name = iomj
SAS Config file = /net/sanyo.unx.sas.com/vol/vol810/u81/sastpw/sascfg_personal.py
WORK Path = /sastmp/SAS_work322C000047EE_tom64-5/SAS_workD8CC000047EE_tom64-5/
SAS Version = 9.04.01M4D11092016
SASPy Version = 3.5.0
Teach me SAS = False
Batch = False
Results = Pandas
SAS Session Encoding = utf-8
Python Encoding value = utf_8
SAS process Pid value = 18414
>>>
>>> cars = sas.sasdata('cars','sashelp')
>>> df = cars.to_df()
>>>
>>> cars9 = sas.df2sd(df, 'cars9', outencoding='latin9')
>>> cars9
Libref = WORK
Table = cars9
Dsopts = {'encoding': 'latin9'}
Results = Pandas
>>>
>>> cars9.head()
Make Model Type Origin DriveTrain MSRP Invoice EngineSize Cylinders Horsepower MPG_City MPG_Highway Weight Wheelbase Length
0 Acura MDX SUV Asia All 36945 33337 3.5 6 265 17 23 4451 106 189
1 Acura RSX Type S 2dr Sedan Asia Front 23820 21761 2.0 4 200 24 31 2778 101 172
2 Acura TSX 4dr Sedan Asia Front 26990 24647 2.4 4 200 22 29 3230 105 183
3 Acura TL 4dr Sedan Asia Front 33195 30299 3.2 6 270 20 28 3575 108 186
4 Acura 3.5 RL 4dr Sedan Asia Front 43755 39014 3.5 6 225 18 24 3880 115 197
>>>
>>> cars9.add_vars({'tom':'"hi tom"'})
54 The SAS System 14:47 Tuesday, September 15, 2020
809
810 data WORK.'cars9'n (encoding="latin9" ); set WORK.'cars9'n (encoding="latin9" );
NOTE: Data file WORK.CARS9.DATA is in a format that is native to another host, or the file encoding does not match the session
encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
811 tom = "hi tom";
812 ; run;
NOTE: Data file WORK.CARS9.DATA is in a format that is native to another host, or the file encoding does not match the session
encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: There were 428 observations read from the data set WORK.CARS9.
NOTE: The data set WORK.CARS9 has 428 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.04 seconds
813
814
815
55 The SAS System 14:47 Tuesday, September 15, 2020
816
>>> cars9
Libref = WORK
Table = cars9
Dsopts = {'encoding': 'latin9'}
Results = Pandas
>>> cars9.to_df()
Make Model Type Origin DriveTrain MSRP Invoice EngineSize Cylinders Horsepower MPG_City MPG_Highway Weight Wheelbase Length tom
0 Acura MDX SUV Asia All 36945 33337 3.5 6.0 265 17 23 4451 106 189 hi tom
1 Acura RSX Type S 2dr Sedan Asia Front 23820 21761 2.0 4.0 200 24 31 2778 101 172 hi tom
2 Acura TSX 4dr Sedan Asia Front 26990 24647 2.4 4.0 200 22 29 3230 105 183 hi tom
3 Acura TL 4dr Sedan Asia Front 33195 30299 3.2 6.0 270 20 28 3575 108 186 hi tom
4 Acura 3.5 RL 4dr Sedan Asia Front 43755 39014 3.5 6.0 225 18 24 3880 115 197 hi tom
5 Acura 3.5 RL w/Navigation 4dr Sedan Asia Front 46100 41100 3.5 6.0 225 18 24 3893 115 197 hi tom
6 Acura NSX coupe 2dr manual S Sports Asia Rear 89765 79978 3.2 6.0 290 17 24 3153 100 174 hi tom
7 Audi A4 1.8T 4dr Sedan Europe Front 25940 23508 1.8 4.0 170 22 31 3252 104 179 hi tom
8 Audi A41.8T convertible 2dr Sedan Europe Front 35940 32506 1.8 4.0 170 23 30 3638 105 180 hi tom
9 Audi A4 3.0 4dr Sedan Europe Front 31840 28846 3.0 6.0 220 20 28 3462 104 179 hi tom
10 Audi A4 3.0 Quattro 4dr manual Sedan Europe All 33430 30366 3.0 6.0 220 17 26 3583 104 179 hi tom
11 Audi A4 3.0 Quattro 4dr auto Sedan Europe All 34480 31388 3.0 6.0 220 18 25 3627 104 179 hi tom
12 Audi A6 3.0 4dr Sedan Europe Front 36640 33129 3.0 6.0 220 20 27 3561 109 192 hi tom
13 Audi A6 3.0 Quattro 4dr Sedan Europe All 39640 35992 3.0 6.0 220 18 25 3880 109 192 hi tom
14 Audi A4 3.0 convertible 2dr Sedan Europe Front 42490 38325 3.0 6.0 220 20 27 3814 105 180 hi tom
15 Audi A4 3.0 Quattro convertible 2dr Sedan Europe All 44240 40075 3.0 6.0 220 18 25 4013 105 180 hi tom
16 Audi A6 2.7 Turbo Quattro 4dr Sedan Europe All 42840 38840 2.7 6.0 250 18 25 3836 109 192 hi tom
17 Audi A6 4.2 Quattro 4dr Sedan Europe All 49690 44936 4.2 8.0 300 17 24 4024 109 193 hi tom
18 Audi A8 L Quattro 4dr Sedan Europe All 69190 64740 4.2 8.0 330 17 24 4399 121 204 hi tom
19 Audi S4 Quattro 4dr Sedan Europe All 48040 43556 4.2 8.0 340 14 20 3825 104 179 hi tom
20 Audi RS 6 4dr Sports Europe Front 84600 76417 4.2 8.0 450 15 22 4024 109 191 hi tom
21 Audi TT 1.8 convertible 2dr (coupe) Sports Europe Front 35940 32512 1.8 4.0 180 20 28 3131 95 159 hi tom
22 Audi TT 1.8 Quattro 2dr (convertible) Sports Europe All 37390 33891 1.8 4.0 225 20 28 2921 96 159 hi tom
23 Audi TT 3.2 coupe 2dr (convertible) Sports Europe All 40590 36739 3.2 6.0 250 21 29 3351 96 159 hi tom
24 Audi A6 3.0 Avant Quattro Wagon Europe All 40840 37060 3.0 6.0 220 18 25 4035 109 192 hi tom
25 Audi S4 Avant Quattro Wagon Europe All 49090 44446 4.2 8.0 340 15 21 3936 104 179 hi tom
26 BMW X3 3.0i SUV Europe All 37000 33873 3.0 6.0 225 16 23 4023 110 180 hi tom
27 BMW X5 4.4i SUV Europe All 52195 47720 4.4 8.0 325 16 22 4824 111 184 hi tom
28 BMW 325i 4dr Sedan Europe Rear 28495 26155 2.5 6.0 184 20 29 3219 107 176 hi tom
29 BMW 325Ci 2dr Sedan Europe Rear 30795 28245 2.5 6.0 184 20 29 3197 107 177 hi tom
.. ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
398 Toyota Tundra Regular Cab V6 Truck Asia Rear 16495 14978 3.4 6.0 190 16 18 3925 128 218 hi tom
399 Toyota Tundra Access Cab V6 SR5 Truck Asia All 25935 23520 3.4 6.0 190 14 17 4435 128 218 hi tom
400 Toyota Matrix XR Wagon Asia Front 16695 15156 1.8 4.0 130 29 36 2679 102 171 hi tom
401 Volkswagen Touareg V6 SUV Europe All 35515 32243 3.2 6.0 220 15 20 5086 112 187 hi tom
402 Volkswagen Golf GLS 4dr Sedan Europe Front 18715 17478 2.0 4.0 115 24 31 2897 99 165 hi tom
403 Volkswagen GTI 1.8T 2dr hatch Sedan Europe Front 19825 18109 1.8 4.0 180 24 31 2934 99 168 hi tom
404 Volkswagen Jetta GLS TDI 4dr Sedan Europe Front 21055 19638 1.9 4.0 100 38 46 3003 99 172 hi tom
405 Volkswagen New Beetle GLS 1.8T 2dr Sedan Europe Front 21055 19638 1.8 4.0 150 24 31 2820 99 161 hi tom
406 Volkswagen Jetta GLI VR6 4dr Sedan Europe Front 23785 21686 2.8 6.0 200 21 30 3179 99 172 hi tom
407 Volkswagen New Beetle GLS convertible 2dr Sedan Europe Front 23215 21689 2.0 4.0 115 24 30 3082 99 161 hi tom
408 Volkswagen Passat GLS 4dr Sedan Europe Front 23955 21898 1.8 4.0 170 22 31 3241 106 185 hi tom
409 Volkswagen Passat GLX V6 4MOTION 4dr Sedan Europe Front 33180 30583 2.8 6.0 190 19 26 3721 106 185 hi tom
410 Volkswagen Passat W8 4MOTION 4dr Sedan Europe Front 39235 36052 4.0 8.0 270 18 25 3953 106 185 hi tom
411 Volkswagen Phaeton 4dr Sedan Europe Front 65000 59912 4.2 8.0 335 16 22 5194 118 204 hi tom
412 Volkswagen Phaeton W12 4dr Sedan Europe Front 75000 69130 6.0 12.0 420 12 19 5399 118 204 hi tom
413 Volkswagen Jetta GL Wagon Europe Front 19005 17427 2.0 4.0 115 24 30 3034 99 174 hi tom
414 Volkswagen Passat GLS 1.8T Wagon Europe Front 24955 22801 1.8 4.0 170 22 31 3338 106 184 hi tom
415 Volkswagen Passat W8 Wagon Europe Front 40235 36956 4.0 8.0 270 18 25 4067 106 184 hi tom
416 Volvo XC90 T6 SUV Europe All 41250 38851 2.9 6.0 268 15 20 4638 113 189 hi tom
417 Volvo S40 4dr Sedan Europe Front 25135 23701 1.9 4.0 170 22 29 2767 101 178 hi tom
418 Volvo S60 2.5 4dr Sedan Europe All 31745 29916 2.5 5.0 208 20 27 3903 107 180 hi tom
419 Volvo S60 T5 4dr Sedan Europe Front 34845 32902 2.3 5.0 247 20 28 3766 107 180 hi tom
420 Volvo S60 R 4dr Sedan Europe All 37560 35382 2.5 5.0 300 18 25 3571 107 181 hi tom
421 Volvo S80 2.9 4dr Sedan Europe Front 37730 35542 2.9 6.0 208 20 28 3576 110 190 hi tom
422 Volvo S80 2.5T 4dr Sedan Europe All 37885 35688 2.5 5.0 194 20 27 3691 110 190 hi tom
423 Volvo C70 LPT convertible 2dr Sedan Europe Front 40565 38203 2.4 5.0 197 21 28 3450 105 186 hi tom
424 Volvo C70 HPT convertible 2dr Sedan Europe Front 42565 40083 2.3 5.0 242 20 26 3450 105 186 hi tom
425 Volvo S80 T6 4dr Sedan Europe Front 45210 42573 2.9 6.0 268 19 26 3653 110 190 hi tom
426 Volvo V40 Wagon Europe Front 26135 24641 1.9 4.0 170 22 29 2822 101 180 hi tom
427 Volvo XC70 Wagon Europe All 35145 33112 2.5 5.0 208 20 27 3823 109 186 hi tom
[428 rows x 16 columns]
>>>
oh, and
>>> saspy.__version__
'3.5.0'
>>>
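One way to check the "is this the environment I installed into?" question raised above is to ask Python directly which interpreter is running and where a module would be imported from. This is a generic sketch; `describe_install` is a hypothetical helper, and it assumes the distribution name matches the module name (true for saspy as far as I know).

```python
import importlib.util
import sys

try:
    from importlib import metadata  # available on Python 3.8+
except ImportError:
    metadata = None


def describe_install(module_name):
    """Report the running interpreter and where a top-level module would load from."""
    spec = importlib.util.find_spec(module_name)
    version = None
    if metadata is not None:
        try:
            version = metadata.version(module_name)
        except metadata.PackageNotFoundError:
            version = None
    return {
        "python": sys.executable,
        "origin": spec.origin if spec else None,
        "version": version,
    }


print(describe_install("saspy"))
```

If "origin" is None or points outside the active virtual env, the install and the interpreter are probably not the same environment.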
Got it working now, no idea what I was doing wrong. Sometimes just starting over helps. I'll test it out now.
cool :)
Good news and bad news. Using the outencoding option seems to work:
(Pdb) cars9 = self.session.df2sd(df, 'cars9',outencoding='latin9')
(Pdb) cars9
Libref = WORK
Table = cars9
Dsopts = {'encoding': 'latin9'}
Results = Pandas
So that is great, I am getting the encoding I expect. However, if I write out the data to a sas7bdat file and check the encoding using interactive SAS
proc contents data=jlabarge.cars_latin9; run;
I am told in the summary that the encoding is UTF-8, not latin9. Any suggestions?
The CONTENTS Procedure
Data Set Name JLABARGE.CARS_LATIN9 Observations 0
Member Type DATA Variables 15
Engine V9 Indexes 0
Created 09/15/2020 14:03:48 Observation Length 152
Last Modified 09/15/2020 14:03:48 Deleted Observations 0
Protection Compressed NO
Data Set Type Sorted NO
Label
Data Representation SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64
Encoding utf-8 Unicode (UTF-8)
for good measure here are my session details
Access Method = STDIO
SAS Config name = sas_u8
SAS Config file = /scratch/jlabarge/io/lio/sas/lio_sascfg.py
WORK Path = /tmp/SAS_work0BE6000010A0_statsrv/
SAS Version = 9.04.01M2P07232014
SASPy Version = 3.5.0
Teach me SAS = False
Batch = False
Results = Pandas
SAS Session Encoding = utf-8
Python Encoding value = utf_8
SAS process Pid value = 4256
well, what is this libref JLABARGE? The dataset you wrote out in latin9 was work.cars9, but the proc contents was on jlabarge.cars_latin9.
when I did a contents() it correctly showed the encoding (sorry this contents output is garbage; it's pandas output); see the last line
>>> cont = cars9.contents()
>>> cont
{'Variables': Member Num Variable Type Len Pos
0 WORK.CARS9 9 Cylinders Num 8 24
1 WORK.CARS9 5 DriveTrain Char 5 144
2 WORK.CARS9 8 EngineSize Num 8 16
3 WORK.CARS9 10 Horsepower Num 8 32
4 WORK.CARS9 7 Invoice Num 8 8
5 WORK.CARS9 15 Length Num 8 72
6 WORK.CARS9 11 MPG_City Num 8 40
7 WORK.CARS9 12 MPG_Highway Num 8 48
8 WORK.CARS9 6 MSRP Num 8 0
9 WORK.CARS9 1 Make Char 13 80
10 WORK.CARS9 2 Model Char 39 93
11 WORK.CARS9 4 Origin Char 6 138
12 WORK.CARS9 3 Type Char 6 132
13 WORK.CARS9 13 Weight Num 8 56
14 WORK.CARS9 14 Wheelbase Num 8 64
15 WORK.CARS9 16 tom Char 6 149, 'Enginehost': Member Label1 cValue1 nValue1
0 WORK.CARS9 Data Set Page Size 65536 6.553600e+04
1 WORK.CARS9 Number of Data Set Pages 2 2.000000e+00
2 WORK.CARS9 First Data Page 1 1.000000e+00
3 WORK.CARS9 Max Obs per Page 409 4.090000e+02
4 WORK.CARS9 Obs in First Data Page 386 3.860000e+02
5 WORK.CARS9 Number of Data Set Repairs 0 0.000000e+00
6 WORK.CARS9 Filename /sastmp/SAS_work322C000047EE_tom64-5/SAS_workC... NaN
7 WORK.CARS9 Release Created 9.0401M4 NaN
8 WORK.CARS9 Host Created Linux NaN
9 WORK.CARS9 Inode Number 2924984313 2.924984e+09
10 WORK.CARS9 Access Permission rw-r--r-- NaN
11 WORK.CARS9 Owner Name sastpw NaN
12 WORK.CARS9 File Size 192KB NaN
13 WORK.CARS9 File Size (bytes) 196608 1.966080e+05, 'Attributes': Member Label1 cValue1 nValue1 Label2 cValue2 nValue2
0 WORK.CARS9 Data Set Name WORK.CARS9 NaN Observations 428 428.0
1 WORK.CARS9 Member Type DATA NaN Variables 16 16.0
2 WORK.CARS9 Engine V9 NaN Indexes 0 0.0
3 WORK.CARS9 Created 09/15/2020 14:12:36 1.915798e+09 Observation Length 160 160.0
4 WORK.CARS9 Last Modified 09/15/2020 14:12:36 1.915798e+09 Deleted Observations 0 0.0
5 WORK.CARS9 Protection NaN NaN Compressed NO NaN
6 WORK.CARS9 Data Set Type NaN NaN Sorted NO NaN
7 WORK.CARS9 Label NaN NaN NaN NaN 0.0
8 WORK.CARS9 Data Representation SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LIN... NaN NaN NaN 0.0
9 WORK.CARS9 Encoding latin9 European (ISO) NaN
look at your saslog too; I think you're just not referencing the right data set you created
this is easier to read:
>>> cars9
Libref = WORK
Table = cars9
Dsopts = {'encoding': 'latin9'}
Results = Pandas
>>>
>>> cars9.set_results('text')
>>> cars9.contents()
The SAS System 17:45 Tuesday, September 15, 2020 1
The CONTENTS Procedure
Data Set Name WORK.CARS9 Observations 428
Member Type DATA Variables 16
Engine V9 Indexes 0
Created 09/15/2020 17:45:57 Observation Length 160
Last Modified 09/15/2020 17:45:57 Deleted Observations 0
Protection Compressed NO
Data Set Type Sorted NO
Label
Data Representation SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64
Encoding latin9 European (ISO)
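For anyone who would rather check the Encoding row programmatically than by eye, here is a small sketch that pulls it out of PROC CONTENTS listing text, however that text was captured. `encoding_from_contents` is a hypothetical helper, and the sample lines are copied from the listing above.

```python
import re


def encoding_from_contents(text):
    """Return the first token of the 'Encoding' row in PROC CONTENTS text, or None."""
    for line in text.splitlines():
        match = re.match(r"\s*Encoding\s+(\S+)", line)
        if match:
            return match.group(1)
    return None


sample = "\n".join([
    "Data Representation SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64",
    "Encoding latin9 European (ISO)",
])
print(encoding_from_contents(sample))
```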
Yeah, it is likely that I screwed something up. I don't fully 'get' libref/table/saslib. Obviously I don't interact with SAS much directly, since they appear to be pretty core concepts 😆
Here is the full set of commands I ran:
cars = self.session.sasdata('cars', 'sashelp')
df = cars.to_df()
df
cars9 = self.session.df2sd(df, 'cars9',outencoding='latin9')
self.session.saslib('cars9', path='/scratch/jlabarge')
cars9 = self.session.df2sd(df, table='cars_latin9', libref='cars9',outencoding='latin9')
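Since the libref/table distinction is the sticking point here, a tiny sketch of how a SAS two-level name breaks down may help. `split_sas_name` is a hypothetical helper; it assumes one-level names land in WORK, which is the usual SAS behaviour when no USER library is assigned.

```python
def split_sas_name(name, default_libref="WORK"):
    """Split 'libref.table' into (LIBREF, table); one-level names use the default libref."""
    libref, sep, table = name.rpartition(".")
    if not sep:
        return default_libref, table
    return libref.upper(), table


written = split_sas_name("cars9")                 # what df2sd(df, 'cars9') creates
checked = split_sas_name("jlabarge.cars_latin9")  # what the proc contents looked at
print(written, checked, written == checked)
```

Two names that resolve to different (libref, table) pairs are different data sets, even if one of the librefs happens to share a name with the other table.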
ok, then run this (I changed results to text so you could read the contents output easier):
cars = self.session.sasdata('cars', 'sashelp', results='text')
cars.contents()
df = cars.to_df()
df
cars9 = self.session.df2sd(df, 'cars9',outencoding='latin9', results='text') # this creates work.cars9
cars9.contents()
self.session.saslib('cars9', path='/scratch/jlabarge') # this is creating a new libref, rather than using work, that's fine
cars9 = self.session.df2sd(df, table='cars_latin9', libref='cars9',outencoding='latin9', results='text') # this creates cars9.cars_latin9
cars9.contents()
Thanks. Here is my output:
cars9.contents()
The SAS System 14:58 Tuesday, September 15, 2020 5
The CONTENTS Procedure
Data Set Name CARS9.CARS_LATIN9 Observations 0
Member Type DATA Variables 15
Engine V9 Indexes 0
Created 09/15/2020 14:03:48 Observation Length 152
Last Modified 09/15/2020 14:03:48 Deleted Observations 0
Protection Compressed NO
Data Set Type Sorted NO
Label
Data Representation SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64
Encoding utf-8 Unicode (UTF-8)
Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 1
First Data Page 1
Max Obs per Page 430
Obs in First Data Page 0
Number of Data Set Repairs 0
Filename /scratch/jlabarge/cars_latin9.sas7bdat
Release Created 9.0401M2
Host Created Linux
Inode Number 893078406
Access Permission rw-rw-r--
Owner Name jlabarge
File Size (bytes) 131072
still is showing UTF-8.
I'm not sure you're running this code. Can you submit this after that last contents?
>>> print(sas.lastlog())
96 The SAS System 17:45 Tuesday, September 15, 2020
1538
1539 proc contents data=perm.'cars9'n (encoding="latin9" );run;
NOTE: Data file PERM.CARS9.DATA is in a format that is native to another host, or the file encoding does not match the session
encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.03 seconds
cpu time 0.02 seconds
NOTE: The PROCEDURE CONTENTS printed page 2.
1540
Run that whole set of code and then print(sas.saslog()); then we can see what's really happening.
when I said I don't think you're running this code, I mean the source code with this enhancement in it; saspy outencoding branch of code. If we can see the log, at least, it should be apparent.
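When a full SAS log is long, a quick filter for ERROR lines can narrow down where to look. This is a generic sketch with a hypothetical helper, `find_log_errors`; the sample lines are a short fragment in the shape of SAS log output, and with a live session the input could be the string returned by saslog().

```python
def find_log_errors(log_text):
    """Return (line_number, line) pairs for SAS log lines that start with 'ERROR'."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if line.lstrip().startswith("ERROR"):
            hits.append((number, line.strip()))
    return hits


sample_log = "\n".join([
    "94 data 'cars9'n;",
    '95 (encoding="latin9");',
    "ERROR 180-322: Statement is not valid or it is used out of proper order.",
    "NOTE: The SAS System stopped processing this step because of errors.",
])
for number, line in find_log_errors(sample_log):
    print(number, line)
```

The source statements just before each hit are usually the place to start reading.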
(Pdb) print (sas.saslog())
*** NameError: name 'sas' is not defined
(Pdb) print (self.session.saslog())
1 The SAS System 15:42 Tuesday, September 15, 2020
NOTE: Unable to open SASUSER.REGSTRY. WORK.REGSTRY will be opened instead.
NOTE: All registry changes will be lost at the end of the session.
WARNING: Unable to copy SASUSER registry to WORK registry. Because of this, you will not see registry customizations during this
session.
NOTE: Copyright (c) 2002-2012 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M2 MBCS3170)
Licensed to FRED HUTCHINSON CNCR RES CTR - SCHARP, Site 70230084.
NOTE: This session is executing on the Linux 4.4.104-18.44-default (LIN X64) platform.
NOTE: Updated analytical products:
SAS/STAT 13.2
SAS/IML 13.2
NOTE: Additional host information:
Linux LIN X64 4.4.104-18.44-default #1 SMP Thu Jan 4 08:07:55 UTC 2018 (05a9de6) x86_64 openSUSE 42.2 (x86_64) VERSION = 42.2
CODENAME = Malachite
NOTE: SAS initialization used:
real time 0.02 seconds
cpu time 0.03 seconds
NOTE: AUTOEXEC processing beginning; file is /usr/local/sas-9.4/SASFoundation/9.4/autoexec.sas.
NOTE: AUTOEXEC processing completed.
1 ;*';*";*/;
2 options svgtitle='svgtitle'; options validvarname=any validmemname=extend; ods graphics on;
3
4 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000001);
E3969440A681A2408885998500000001
5 ;*';*";*/;
6 data _null_; length x $ 4096; file STDERR;
7 x = resolve('%sysfunc(pathname(work))'); put 'WORKPATH=' x 'WORKPATHEND=';
8 x = resolve('&SYSENCODING'); put 'ENCODING=' x 'ENCODINGEND=';
9 x = resolve('&SYSVLONG4'); put 'SYSVLONG=' x 'SYSVLONGEND=';
10 x = resolve('&SYSJOBID'); put 'SYSJOBID=' x 'SYSJOBIDEND=';
11 x = resolve('&SYSSCP'); put 'SYSSCP=' x 'SYSSCPEND=';
12 run;
NOTE: The file STDERR is:
Pipe command="<standard error>"
WORKPATH=/tmp/SAS_work4F980000AEA2_statsrv WORKPATHEND=
ENCODING=utf-8 ENCODINGEND=
SYSVLONG=9.04.01M2P07232014 SYSVLONGEND=
SYSJOBID=44706 SYSJOBIDEND=
SYSSCP=LIN X64 SYSSCPEND=
NOTE: 5 records were written to the file STDERR.
The minimum record length was 25.
The maximum record length was 55.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
13
14
15 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000002);
E3969440A681A2408885998500000002
16 ;*';*";*/;
17 data _null_; file STDERR; put %upcase('col0REG=');
NOTE: The file STDERR is:
Pipe command="<standard error>"
COL0REG=
NOTE: 1 record was written to the file STDERR.
The minimum record length was 8.
The maximum record length was 8.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
18 data _null_;
18 ! put %upcase('col0LOG=');run;
COL0LOG=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
19
20 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000003);
E3969440A681A2408885998500000003
21 ;*';*";*/;
22 libname outlib '/scratch/jlabarge' ;
NOTE: Libref OUTLIB was successfully assigned as follows:
Engine: V9
Physical Name: /scratch/jlabarge
23
24 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000004);
E3969440A681A2408885998500000004
25 ;*';*";*/;
26 data _null_; e = exist("sashelp.'cars'n");
27 v = exist("sashelp.'cars'n", 'VIEW');
28 if e or v then e = 1;
29 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
30
31 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000005);
E3969440A681A2408885998500000005
32 ;*';*";*/;
33 data _null_; e = exist("sashelp.'cars'n");
34 v = exist("sashelp.'cars'n", 'VIEW');
35 if e or v then e = 1;
36 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
37
38 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000006);
E3969440A681A2408885998500000006
39 ;*';*";*/;
40 data _null_; e = exist("sashelp.'cars'n");
41 v = exist("sashelp.'cars'n", 'VIEW');
42 if e or v then e = 1;
43 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
44
45 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000007);
E3969440A681A2408885998500000007
46 ;*';*";*/;
47 data sasdata2dataframe / view=sasdata2dataframe; set sashelp.'cars'n ;run;
NOTE: DATA STEP view saved on file WORK.SASDATA2DATAFRAME.
NOTE: A stored DATA STEP view cannot run under a different operating system.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
48 data _null_; file STDERR;d = open('sasdata2dataframe');
49 lrecl = attrn(d, 'LRECL'); nvars = attrn(d, 'NVARS');
50 lr='LRECL='; vn='VARNUMS='; vl='VARLIST='; vt='VARTYPE=';
51 put lr lrecl; put vn nvars; put vl;
52 do i = 1 to nvars; var = varname(d, i); put var; end;
53 put vt;
54 do i = 1 to nvars; var = vartype(d, i); put var; end;
55 run;
NOTE: The file STDERR is:
Pipe command="<standard error>"
LRECL= 152
VARNUMS= 15
VARLIST=
Make
Model
Type
Origin
DriveTrain
MSRP
Invoice
EngineSize
Cylinders
Horsepower
MPG_City
MPG_Highway
Weight
Wheelbase
Length
VARTYPE=
C
C
C
C
C
N
N
N
N
N
N
N
N
N
N
NOTE: 34 records were written to the file STDERR.
The minimum record length was 1.
The maximum record length was 11.
NOTE: View WORK.SASDATA2DATAFRAME.VIEW used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
56
57 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000008);
E3969440A681A2408885998500000008
58 ;*';*";*/;
59 data work._n_u_l_l_;output;run;
NOTE: The data set WORK._N_U_L_L_ has 1 observations and 0 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
60 data _null_; file STDERR; set work._n_u_l_l_ sashelp.'cars'n (obs=0 );put 'FMT_CATS=';
61 _tom = vformatn('Make'n);put _tom;
62 _tom = vformatn('Model'n);put _tom;
63 _tom = vformatn('Type'n);put _tom;
64 _tom = vformatn('Origin'n);put _tom;
65 _tom = vformatn('DriveTrain'n);put _tom;
66 _tom = vformatn('MSRP'n);put _tom;
67 _tom = vformatn('Invoice'n);put _tom;
68 _tom = vformatn('EngineSize'n);put _tom;
69 _tom = vformatn('Cylinders'n);put _tom;
70 _tom = vformatn('Horsepower'n);put _tom;
71 _tom = vformatn('MPG_City'n);put _tom;
72 _tom = vformatn('MPG_Highway'n);put _tom;
73 _tom = vformatn('Weight'n);put _tom;
74 _tom = vformatn('Wheelbase'n);put _tom;
75 _tom = vformatn('Length'n);put _tom;
76 run;
NOTE: The file STDERR is:
Pipe command="<standard error>"
FMT_CATS=
$
$
$
$
$
DOLLAR
DOLLAR
BEST
BEST
BEST
BEST
BEST
BEST
BEST
BEST
NOTE: 16 records were written to the file STDERR.
The minimum record length was 1.
The maximum record length was 9.
NOTE: There were 1 observations read from the data set WORK._N_U_L_L_.
NOTE: There were 0 observations read from the data set SASHELP.CARS.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
77 proc delete data=work._n_u_l_l_;run;
NOTE: Deleting WORK._N_U_L_L_ (memtype=DATA).
NOTE: PROCEDURE DELETE used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
78
79 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000009);
E3969440A681A2408885998500000009
80 filename sock socket ':37831' recfm=S lrecl=4096;
81 data _null_; set sashelp.'cars'n ;
82 format 'MSRP'n best32.; format 'Invoice'n best32.; format 'EngineSize'n best32.; format 'Cylinders'n best32.; format
82 ! 'Horsepower'n best32.; format 'MPG_City'n best32.;
83 format 'MPG_Highway'n best32.; format 'Weight'n best32.; format 'Wheelbase'n best32.; format 'Length'n best32.;
84 file sock;
85 'Make'n = translate('Make'n, '2020'x, '0102'x);
86 'Model'n = translate('Model'n, '2020'x, '0102'x); 'Type'n = translate('Type'n, '2020'x, '0102'x); 'Origin'n =
86 ! translate('Origin'n, '2020'x, '0102'x); 'DriveTrain'n = translate('DriveTrain'n, '2020'x, '0102'x);
87 put 'Make'n '02'x;
88 put 'Model'n '02'x; put 'Type'n '02'x; put 'Origin'n '02'x; put 'DriveTrain'n '02'x; put 'MSRP'n '02'x; put 'Invoice'n '02'x;
88 ! put 'EngineSize'n '02'x; put 'Cylinders'n '02'x; put 'Horsepower'n '02'x; put 'MPG_City'n '02'x;
89 put 'MPG_Highway'n '02'x; put 'Weight'n '02'x; put 'Wheelbase'n '02'x; put 'Length'n '01'x; run;
NOTE: The file SOCK is:
Local Host Name=statsrv,
Local Host IP addr=127.0.1.1,
Peer Hostname Name=statsrv.pc.scharp.org,
Peer IP addr=127.0.1.1,Peer Name=N/A,
Peer Portno=37831,Lrecl=4096,Recfm=Stream
NOTE: 6420 records were written to the file SOCK.
The minimum record length was 3.
The maximum record length was 41.
NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.05 seconds
90 ;*';*";*/;
91
92
93 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000010);
E3969440A681A2408885998500000010
94 data 'cars9'n;
95 (encoding="latin9");
-
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
96 length 'Make'n $13 'Model'n $39 'Type'n $6 'Origin'n $6 'DriveTrain'n $5 'MSRP'n 8 'Invoice'n 8 'EngineSize'n 8 'Cylinders'n 8
96 ! 'Horsepower'n 8 'MPG_City'n 8 'MPG_Highway'n 8 'Weight'n 8 'Wheelbase'n 8 'Length'n 8;
97 infile datalines delimiter='03'x STOPOVER;
98 input @;
99 if _infile_ = '' then delete;
100 input 'Make'n 'Model'n 'Type'n 'Origin'n 'DriveTrain'n 'MSRP'n 'Invoice'n 'EngineSize'n 'Cylinders'n 'Horsepower'n 'MPG_City'n
100! 'MPG_Highway'n 'Weight'n 'Wheelbase'n 'Length'n ;
101 'Make'n = translate('Make'n, '0A'x, '01'x);
102 'Make'n = translate('Make'n, '0D'x, '02'x );
103 'Model'n = translate('Model'n, '0A'x, '01'x);
104 'Model'n = translate('Model'n, '0D'x, '02'x );
105 'Type'n = translate('Type'n, '0A'x, '01'x);
106 'Type'n = translate('Type'n, '0D'x, '02'x );
107 'Origin'n = translate('Origin'n, '0A'x, '01'x);
108 'Origin'n = translate('Origin'n, '0D'x, '02'x );
109 'DriveTrain'n = translate('DriveTrain'n, '0A'x, '01'x);
110 'DriveTrain'n = translate('DriveTrain'n, '0D'x, '02'x );
111 ;
112 datalines4;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.CARS9 may be incomplete. When this step was stopped there were 0 observations and 15 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
541 ;;;;
542 ;*';*";*/;
543 run;
544
545 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000011);
E3969440A681A2408885998500000011
546 ;*';*";*/;
547 data _null_; e = exist("'cars9'n");
548 v = exist("'cars9'n", 'VIEW');
549 if e or v then e = 1;
550 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
551
552 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000012);
E3969440A681A2408885998500000012
553 ;*';*";*/;
554 data _null_; e = exist("user.'cars9'n");
555 v = exist("user.'cars9'n", 'VIEW');
556 if e or v then e = 1;
557 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=0 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
558
559 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000013);
E3969440A681A2408885998500000013
560 ;*';*";*/;
561 libname cars9 '/scratch/jlabarge' ;
NOTE: Libref CARS9 refers to the same physical library as OUTLIB.
NOTE: Libref CARS9 was successfully assigned as follows:
Engine: V9
Physical Name: /scratch/jlabarge
562
563 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000014);
E3969440A681A2408885998500000014
564 ;*';*";*/;
565
566 data _null_; retain libref; retain cobs 1;
567 set sashelp.vlibnam end=last;
568 if cobs EQ 1 then
569 put "LIBREFSSTART=";
570 cobs = 2;
571 if libref NE libname then
572 put %upcase("lib=") libname %upcase('libEND=');
573 libref = libname;
574 if last then
575 put "LIBREFSEND=";
576 run;
LIBREFSSTART=
LIB=OUTLIB LIBEND=
LIB=CARS9 LIBEND=
LIB=SASHELP LIBEND=
LIB=MAPS LIBEND=
LIB=MAPSSAS LIBEND=
LIB=MAPSGFK LIBEND=
LIB=SASUSER LIBEND=
LIB=WORK LIBEND=
LIBREFSEND=
NOTE: There were 55 observations read from the data set SASHELP.VLIBNAM.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
577
578
579 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000015);
E3969440A681A2408885998500000015
580 data cars9.'cars_latin9'n;
581 (encoding="latin9");
-
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
582 length 'Make'n $13 'Model'n $39 'Type'n $6 'Origin'n $6 'DriveTrain'n $5 'MSRP'n 8 'Invoice'n 8 'EngineSize'n 8 'Cylinders'n 8
582! 'Horsepower'n 8 'MPG_City'n 8 'MPG_Highway'n 8 'Weight'n 8 'Wheelbase'n 8 'Length'n 8;
583 infile datalines delimiter='03'x STOPOVER;
584 input @;
585 if _infile_ = '' then delete;
586 input 'Make'n 'Model'n 'Type'n 'Origin'n 'DriveTrain'n 'MSRP'n 'Invoice'n 'EngineSize'n 'Cylinders'n 'Horsepower'n 'MPG_City'n
586! 'MPG_Highway'n 'Weight'n 'Wheelbase'n 'Length'n ;
587 'Make'n = translate('Make'n, '0A'x, '01'x);
588 'Make'n = translate('Make'n, '0D'x, '02'x );
589 'Model'n = translate('Model'n, '0A'x, '01'x);
590 'Model'n = translate('Model'n, '0D'x, '02'x );
591 'Type'n = translate('Type'n, '0A'x, '01'x);
592 'Type'n = translate('Type'n, '0D'x, '02'x );
593 'Origin'n = translate('Origin'n, '0A'x, '01'x);
594 'Origin'n = translate('Origin'n, '0D'x, '02'x );
595 'DriveTrain'n = translate('DriveTrain'n, '0A'x, '01'x);
596 'DriveTrain'n = translate('DriveTrain'n, '0D'x, '02'x );
597 ;
598 datalines4;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set CARS9.CARS_LATIN9 may be incomplete. When this step was stopped there were 0 observations and 15 variables.
WARNING: Data set CARS9.CARS_LATIN9 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
1027 ;;;;
1028 ;*';*";*/;
1029 run;
1030
1031 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000016);
E3969440A681A2408885998500000016
1032 ;*';*";*/;
1033 data _null_; e = exist("cars9.'cars_latin9'n");
1034 v = exist("cars9.'cars_latin9'n", 'VIEW');
1035 if e or v then e = 1;
1036 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
1037
1038 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000017);
E3969440A681A2408885998500000017
1039 ;*';*";*/;
1040 data _null_; e = exist("cars9.'cars_latin9'n");
1041 v = exist("cars9.'cars_latin9'n", 'VIEW');
1042 if e or v then e = 1;
1043 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
1044
1045 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000018);
E3969440A681A2408885998500000018
1046 ;*';*";*/;
1047 proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: The PROCEDURE CONTENTS printed page 1.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
1048
1049 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000019);
E3969440A681A2408885998500000019
1050 ;*';*";*/;
1051 data _null_; e = exist("cars9.'cars_latin9'n");
1052 v = exist("cars9.'cars_latin9'n", 'VIEW');
1053 if e or v then e = 1;
1054 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
1055
1056 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000020);
E3969440A681A2408885998500000020
1057 ;*';*";*/;
1058 proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: The PROCEDURE CONTENTS printed page 2.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1059
1060 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000021);
E3969440A681A2408885998500000021
1061 ;*';*";*/;
1062 data _null_; e = exist("cars9.'cars_latin9'n");
1063 v = exist("cars9.'cars_latin9'n", 'VIEW');
1064 if e or v then e = 1;
1065 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
1066
1067 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000022);
E3969440A681A2408885998500000022
1068 ;*';*";*/;
1069 proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: The PROCEDURE CONTENTS printed page 3.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
1070
1071 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000023);
E3969440A681A2408885998500000023
1072 ;*';*";*/;
1073 data _null_; e = exist("cars9.'cars_latin9'n");
1074 v = exist("cars9.'cars_latin9'n", 'VIEW');
1075 if e or v then e = 1;
1076 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1077
1078 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000024);
E3969440A681A2408885998500000024
1079 ;*';*";*/;
1080 proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: The PROCEDURE CONTENTS printed page 4.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
1081
1082 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000025);
E3969440A681A2408885998500000025
1083 ;*';*";*/;
1084 data _null_; e = exist("cars9.'cars_latin9'n");
1085 v = exist("cars9.'cars_latin9'n", 'VIEW');
1086 if e or v then e = 1;
1087 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1088
1089 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000026);
E3969440A681A2408885998500000026
1090 ;*';*";*/;
1091 proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: The PROCEDURE CONTENTS printed page 5.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1092
1093 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000027);
E3969440A681A2408885998500000027
1094 ;*';*";*/;
1095 data _null_; e = exist("cars9.'cars_latin9'n");
1096 v = exist("cars9.'cars_latin9'n", 'VIEW');
1097 if e or v then e = 1;
1098 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1099
1100 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000028);
E3969440A681A2408885998500000028
1101 ;*';*";*/;
1102 proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: The PROCEDURE CONTENTS printed page 6.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1103
1104 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000029);
E3969440A681A2408885998500000029
1105 ;*';*";*/;
1106 data _null_; e = exist("cars9.'cars_latin9'n");
1107 v = exist("cars9.'cars_latin9'n", 'VIEW');
1108 if e or v then e = 1;
1109 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1110
1111 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000030);
E3969440A681A2408885998500000030
1112 ;*';*";*/;
1113 proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: The PROCEDURE CONTENTS printed page 7.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1114
1115 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000031);
E3969440A681A2408885998500000031
1116 ;*';*";*/;
1117 data _null_; e = exist("cars9.'cars_latin9'n");
1118 v = exist("cars9.'cars_latin9'n", 'VIEW');
1119 if e or v then e = 1;
1120 put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1121
1122 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000032);
E3969440A681A2408885998500000032
1123 ;*';*";*/;
1124 proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
performance.
NOTE: The PROCEDURE CONTENTS printed page 8.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1125
1126 ;*';*";*/;%put %upcase(e3969440a681a2408885998500000033);
E3969440A681A2408885998500000033
Alright! You found a bug. First, you are running the new code, so that's good. I missed one little thing in the stdio access method, because I was testing with IOM: I forgot to delete the ';\n' at the end of the line before the new code that adds the option. That's why you have the error above (the fix is pushed; pull this change and you should be good to go!):
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
diff --git a/saspy/sasiostdio.py b/saspy/sasiostdio.py
index fdb98e2..4205a30 100644
--- a/saspy/sasiostdio.py
+++ b/saspy/sasiostdio.py
@@ -1506,7 +1506,7 @@ Will use HTML5 for this SASsession.""")
code = "data "
if len(libref):
code += libref+"."
- code += "'"+table.strip()+"'n;\n"
+ code += "'"+table.strip()+"'n"
if len(outencoding):
code += '(encoding="'+outencoding+'");\n'
else:
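To make the one-line change above easier to follow, here is a minimal standalone sketch of the string building that the diff touches. The function names are hypothetical and are not part of saspy; the body of the `else:` branch is not shown in the diff, so the plain `;\n` terminator used here is an assumption.

```python
def build_data_statement_buggy(table, libref="", outencoding=""):
    """Mimics the pre-fix logic: the DATA statement is terminated too early."""
    code = "data "
    if len(libref):
        code += libref + "."
    code += "'" + table.strip() + "'n;\n"  # stray ';' ends the statement here
    if len(outencoding):
        # The data set option now lands on its own line, after the ';',
        # which SAS rejects with ERROR 180-322.
        code += '(encoding="' + outencoding + '");\n'
    return code


def build_data_statement_fixed(table, libref="", outencoding=""):
    """Mimics the post-fix logic: the option is attached to the table name."""
    code = "data "
    if len(libref):
        code += libref + "."
    code += "'" + table.strip() + "'n"
    if len(outencoding):
        code += '(encoding="' + outencoding + '");\n'
    else:
        code += ";\n"  # assumption: the diff does not show this branch
    return code


print(build_data_statement_buggy("cars_latin9", "cars9", "latin9"))
print(build_data_statement_fixed("cars_latin9", "cars9", "latin9"))
```

The first call reproduces the two-line form seen in the log (`data cars9.'cars_latin9'n;` followed by `(encoding="latin9");`), while the second keeps the data set option inside the DATA statement.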
Excellent, I will give it a try.
Works like a charm! Thanks Tom, as always your help, patience, and responsiveness are greatly appreciated.
The SAS System 16:18 Tuesday, September 15, 2020 1
The CONTENTS Procedure
Data Set Name JLABARGE.CARS3_LATIN9 Observations 428
Member Type DATA Variables 15
Engine V9 Indexes 0
Created 09/15/2020 16:18:23 Observation Length 152
Last Modified 09/15/2020 16:18:23 Deleted Observations 0
Protection Compressed NO
Data Set Type Sorted NO
Label
Data Representation SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64
Encoding latin9 European (ISO)
I am happy and am ok with closing the issue, do you know when you might merge this into main?
Sweet! Yes, I've run regressions and, now, also the new code with all 3 access methods. I'll merge it in tomorrow. I can build a new release too, as I expect you really want it in a pypi release, not just in the repo. I have a few other things at main that can go into a new release. After I do that tomorrow, I'll post back here and we can close this then. Have a great evening! More hockey to watch, haha😎
Ok, Jeremy, this is merged, pushed, built and out on Pypi as V3.5.1, the current production version. I'll close this now, and let me know when you need something else!
Thanks, Tom
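For anyone landing here later, a minimal sketch of how the new option might be used, assuming the `outencoding=` keyword on `df2sd()` as discussed in this thread for V3.5.1 (later releases may expose this differently, so check the documentation for your version). The helper name `write_with_encoding` is hypothetical; it takes an existing session object rather than creating one, so nothing here connects to SAS by itself.

```python
def write_with_encoding(sas, df, table, libref, encoding="latin9"):
    """Write a data frame to libref.table, asking SAS to store it in `encoding`.

    `sas` is expected to be a saspy.SASsession; `outencoding=` is the keyword
    named in this thread (an assumption for any version other than 3.5.1).
    """
    return sas.df2sd(df, table=table, libref=libref, outencoding=encoding)


# Typical use against a live session (not executed here):
#   import saspy
#   sas = saspy.SASsession()
#   sas.saslib('cars9', path='/scratch/jlabarge')
#   sd = write_with_encoding(sas, cars_df, 'cars_latin9', 'cars9')
#   sd.contents()   # the Encoding row should report latin9
```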
Hi Tom, I am working with the saspy package and encountered the error below. Could you help me resolve it, please? My underlying SAS data set is in SPDS format.
Code:
import saspy
import pandas as pd
from datetime import datetime
sas = saspy.SASsession()
sas.saslib('SS','SPDE','hdfs path which stores spds data sets','hdfshost=default')
db=sas.sasdata('table','SS').to_df()
Error:
'utf-8' codec can't decode byte 0xc3 in position 254179: invalid continuation byte
sasdata2dataframe was interupted. Trying to return the saslog instead of a data frame.
sas
Access Method = SSH
SAS Config name = ssh
SAS Config file = /python3.6/site-packages/saspy/sascfg.py
WORK Path = <>
SAS Version = 9.04.01M7P08052020
SASPy Version = 3.7.2
Teach me SAS = False
Batch = False
Results = Pandas
SAS Session Encoding = utf-8
Python Encoding value = utf-8
SAS process Pid value = 1101
Hey @AnandReddy23, this is an old closed issue. Can you open a new issue for the problem you're having? Thanks! Tom
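As general background on the Python error quoted above (not a diagnosis of that specific case), here is a small sketch of what "invalid continuation byte" means. In UTF-8, 0xc3 starts a two-byte sequence, so it must be followed by a byte in the range 0x80 to 0xbf; data that is really in a single-byte encoding such as Latin-1 can contain 0xc3 followed by anything. The sample bytes are made up for illustration.

```python
# Made-up sample: 0xc3 followed by a space, which is not a continuation byte.
raw = b"caf\xc3 au lait"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    # err.start is the byte offset, matching the "position N" in the message.
    print(f"utf-8 failed at position {err.start}: {err.reason}")

# The same bytes decode without error as a single-byte encoding.
as_latin1 = raw.decode("latin-1")
print(as_latin1)

# Or the bad byte can be replaced instead of raising.
lossy = raw.decode("utf-8", errors="replace")
print(lossy)
```

Which encoding the data is actually in has to be determined from the data set itself; the sketch only shows why the decoder stops at that byte.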
Is your feature request related to a problem? Please describe.
I have a customer that wants a sas7bdat data set written with 'latin9' encoding. Is there a way to specify the output encoding within saspy?
Describe the solution you'd like
I would like to specify the OUTENCODING when calling the df2sd method.
If you are also feeling generous, it would also be nice to be able to specify the INENCODING when reading a SAS data set.
Describe alternatives you've considered
I have tried modifying the encoding within the pandas df, but that did not help when writing out the sas dataset.