sassoftware / saspy

A Python interface module to the SAS System. It works with Linux, Windows, and Mainframe SAS as well as with SAS in Viya.
https://sassoftware.github.io/saspy

Set the encoding of datasets written by saspy. #317

Closed biojerm closed 4 years ago

biojerm commented 4 years ago

Is your feature request related to a problem? Please describe. I have a customer that wants a sas7bdat dataset written with 'latin9' encoding. Is there a way to specify the output encoding within saspy?

Describe the solution you'd like I would like to specify the OUTENCODING when calling the df2sd method.

If you are also feeling generous, it might also be nice to be able to specify the INENCODING when reading a sas dataset

Describe alternatives you've considered

I have tried modifying the encoding within the pandas df, but that did not help when writing out the sas dataset.

tomweber-sas commented 4 years ago

Wow, what a great idea. How do I not have support for doing that already!? (I don't). Well, watching hockey right now, so I'll implement support for both of those tomorrow! Thanks Jeremy, that'll be a nice addition!

biojerm commented 4 years ago

Great! Thanks, enjoy the game!

tomweber-sas commented 4 years ago

Ok, so I've looked at this and have the following info and implementation to throw out and see what you think of it.

First, inencoding= and outencoding= are libname options, and can currently be used to make this work today, with saspy having no knowledge of this. You can assign a libref and specify these encodings, and when you use that libref (in your SASdata object, or in libref= on sd2df and df2sd), SAS will use that encoding to transcode to/from session encoding when reading and writing that data set. That already works today for all of these cases.
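
As a rough illustration of the libref route described above, the sketch below builds the LIBNAME statement text carrying the two encoding options. The helper name `libname_statement`, the libref `lat9`, and the path `/some/dir` are made up for this example; only `inencoding=` and `outencoding=` come from the comment above. With a live session, the same options string could presumably be handed to `sas.saslib(..., options=...)`; that part is left as a comment because it needs a SAS connection.

```python
def libname_statement(libref, path, inencoding=None, outencoding=None):
    """Build a SAS LIBNAME statement with optional encoding options.

    Hypothetical helper for illustration; not part of saspy.
    """
    opts = []
    if inencoding:
        opts.append(f"inencoding='{inencoding}'")
    if outencoding:
        opts.append(f"outencoding='{outencoding}'")
    stmt = f"libname {libref} '{path}'"
    if opts:
        stmt += " " + " ".join(opts)
    return stmt + ";"


stmt = libname_statement("lat9", "/some/dir", inencoding="latin9", outencoding="latin9")
print(stmt)

# With a live session (assumed usage, requires a SAS connection):
#   sas.saslib('lat9', path='/some/dir',
#              options="inencoding='latin9' outencoding='latin9'")
#   sas.df2sd(df, 'cars9', libref='lat9')
```

Any table written through that libref would then pick up the libname-level encoding without saspy needing to know about it.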

Second, the data set option encoding= is used when reading or writing a data set when specified. And, like most options that are scoped hierarchically (session, libname, data set), the encoding= DS option overrides the libname, and those each override the session. So, I already have a dsopts dictionary associated with a SASdata object, and a dsopts= parm for the sd2df methods, which is applied to the data set when reading it to return as a data frame. Adding encoding= to the dsopts will make reading and writing this data set use the specified encoding. For df2sd, I would need to add an option for this output encoding, which I would use to write the data and then set in the SASdata dsopts returned by the method, so it would be correct already. I think on df2sd() I would call it outencoding=, even though I'll be setting the encoding= data set option with it, so it's not confusing as to what encoding it's referring to.
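
To make the scoping rule concrete, here is a small sketch of the precedence described above (data set option over libname option over session). The function `effective_encoding` is purely illustrative and does not talk to SAS; it only mirrors the order in which the three scopes are consulted.

```python
def effective_encoding(session, libname=None, dataset=None):
    """Return the encoding that applies, most specific scope first.

    Illustrative only: mirrors the data set > libname > session
    precedence described above; it does not call SAS.
    """
    for value in (dataset, libname, session):
        if value:
            return value
    raise ValueError("a session encoding is required")


print(effective_encoding("utf-8"))                      # session only
print(effective_encoding("utf-8", libname="latin1"))    # libname overrides session
print(effective_encoding("utf-8", "latin1", "latin9"))  # data set option wins
```

This is why a per-call outencoding= on df2sd() is handy: it sits at the most specific level, so it applies to that one table regardless of what the libref or session says.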

I think this is clean in all cases, and allows you to apply this to specific data sets. Of course, I'll have to code it all up and test it all out to be sure all cases work as I expect. That will take longer than just today :) But, this all makes sense in my head, so I don't expect to find any problems with this.

Does this make sense, and is it what you're looking for? Is simply assigning a libref with the in/out encodings, which already works, an acceptable solution instead of implementing this? Just in case you like that instead, then I don't need to add the rest of this; just checking. But I think adding the encoding to the data set is a reasonable addition, so I'm ok adding it.

Thoughts? Tom

biojerm commented 4 years ago

Hey Tom, This is great to know. I think adding the 'outencoding' option would be really nice. While I am messing around it is good to know I can change the encoding using the libref. But based on the datasets I am currently working with there is a pretty good chance I will need to switch between different encodings and having the option within df2sd would be quite convenient.

The implementation you describe above sounds good to me.

tomweber-sas commented 4 years ago

Cool, then I'll work on implementing this! To add the outencoding to df2sd will require adding (in)encoding else you'd not be able to access the table you just wrote. I'll post when I have this for you to try out.

tomweber-sas commented 4 years ago

ok, actually, no, SAS will see that the encoding is different on input and just transcode without having to be told. But, I still want the encoding in the dsopts so everything works right for other cases. For instance, add_vars() will recreate the data set, so it needs to be explicit about this.
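
A sketch of why keeping the encoding in dsopts matters for something like add_vars(): the data step that rewrites the table needs the option on the output side, or the new copy would be created in the session encoding. The helper `dataset_ref` below is hypothetical and only handles simple string-valued options such as encoding; saspy's real code generation is more involved. The shape of the statement follows the SAS log posted later in this thread.

```python
def dataset_ref(libref, table, dsopts=None):
    """Render a SAS data set reference with optional data set options.

    Hypothetical helper; only covers simple name="value" options.
    """
    ref = f"{libref}.'{table}'n"
    if dsopts:
        rendered = " ".join(f'{key}="{value}"' for key, value in dsopts.items())
        ref += f" ({rendered})"
    return ref


dsopts = {"encoding": "latin9"}
target = dataset_ref("WORK", "cars9", dsopts)
# The table is rewritten in place, so the same options appear on both sides.
step = f'data {target}; set {target}; tom = "hi tom"; run;'
print(step)
```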

So, I've actually already implemented this and pushed it to a new branch 'outencoding'. Obviously, I haven't fully tested, but I did run through a number of paths and all seemed to work as I expected. Looked at the saslog to verify also. I love it when my architecture allows me to implement something this pervasive so quickly! Having said that, I expect you'll find a problem first try, LOL.

Anyway, I'm gonna go get some lunch, so feel free to grab this code and try it out. FWIW, here's some of the code I tried out. Feel free to explore more and let me know what you see; I'll obviously test this more before merging back into main.

import saspy
sas = saspy.SASsession(cfgname='iomj')
sas

cars = sas.sasdata('cars','sashelp')
df = cars.to_df()
df

cars9 = sas.df2sd(df, 'cars9', outencoding='latin9')
cars9

cars9.head()
cars9.to_df()

cars9.add_vars({'tom':'"hi tom"'})
cars9
cars9.to_df()

print(sas.saslog())

Let me know what you think! Tom

biojerm commented 4 years ago

Amazing, I will clone the branch and test it out.

biojerm commented 4 years ago

Hey Tom,

Sorry for the off topic question, but I am having issues installing the package into a virtual env

I am running:

pip install git+https://github.com/sassoftware/saspy.git@outencoding

>>> import saspy
>>> saspy.__version__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'saspy' has no attribute '__version__'

On my old version

>>> import saspy
>>> saspy.__version__
'3.2.0'

Any thoughts on what I am doing wrong? Packaging is always something I mess up.

tomweber-sas commented 4 years ago

Hey Jeremy, I'm not really sure offhand. That seems like the right way to grab a branch. One thing that has changed in our repos, and I don't know that it would have any effect on this, is that we've had to change our default branch names from 'master' to 'main', for social equality reasons, of which I'm sure everyone is aware. I don't know how or if that could be causing any issue with pulling a branch like you're doing.

Are you sure the environment you're running in is the one you installed in? Those environments get people all the time. I don't use them, myself. I did just uninstall and reinstalled, using cut-n-paste of your command above, and I'm seeing it run like I expect. Also, I always uninstall first, then install. That's always the cleanest.
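
One way to check the "which environment am I in" question from inside Python is sketched below. It uses only the standard library (`importlib.metadata` needs Python 3.8 or later, which is an assumption about your setup); the function name `describe_install` is made up for this example. Run it in the same interpreter where the import misbehaves and compare the paths with where pip says it installed.

```python
import importlib.util
import sys
from importlib import metadata


def describe_install(package):
    """Report where a package would be imported from in this interpreter."""
    spec = importlib.util.find_spec(package)
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    return {
        "python": sys.executable,
        "location": spec.origin if spec else None,
        "version": version,
    }


# 'saspy' is the package of interest; any importable name works here.
print(describe_install("saspy"))
```

If `location` is None, or points somewhere other than the virtual env's site-packages, the interpreter and the pip that did the install are probably not the same one.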

tom64-5> pip uninstall saspy
Found existing installation: saspy 3.3.7
Uninstalling saspy-3.3.7:
  Would remove:
    /net/sanyo.unx.sas.com/vol/vol810/u81/sastpw/.local/lib/python3.5/site-packages/saspy.egg-link
Proceed (y/n)? y
  Successfully uninstalled saspy-3.3.7
tom64-5> pwd
/opt/tom/github/saspy/saspy
tom64-5> cd ~
tom64-5> pip install git+https://github.com/sassoftware/saspy.git@outencoding
Defaulting to user installation because normal site-packages is not writeable
Collecting git+https://github.com/sassoftware/saspy.git@outencoding
  Cloning https://github.com/sassoftware/saspy.git (to revision outencoding) to /tmp/pip-req-build-n_mh2idj
  Running command git clone -q https://github.com/sassoftware/saspy.git /tmp/pip-req-build-n_mh2idj
  Running command git checkout -b outencoding --track origin/outencoding
  Switched to a new branch 'outencoding'
  Branch outencoding set up to track remote branch outencoding from origin.
Building wheels for collected packages: saspy
  Building wheel for saspy (setup.py) ... done
  Created wheel for saspy: filename=saspy-3.5.0-py3-none-any.whl size=6410742 sha256=5d0d45b96f2312cbdae9549d309db7bcbd4f62ed88751f0040060bbb15c17eda
  Stored in directory: /tmp/pip-ephem-wheel-cache-p5x4zs4g/wheels/25/9f/97/3225770836b041942a5227f211cd8a52c15ddbf624a541fabf
Successfully built saspy
Installing collected packages: saspy
Successfully installed saspy-3.5.0
WARNING: You are using pip version 20.1.1; however, version 20.2.3 is available.
You should consider upgrading via the '/usr/bin/python3.5 -m pip install --upgrade pip' command.
tom64-5> python3.5
Python 3.5.6 (default, Nov 16 2018, 15:50:39)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-23)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import saspy
>>> sas =saspy.SASsession(cfgname='iomj')
sas

cars = sas.sasdata('cars','sashelp')
df = cars.to_df()

cars9 = sas.df2sd(df, 'cars9', outencoding='latin9')
cars9

cars9.head()

cars9.add_vars({'tom':'"hi tom"'})
cars9
cars9.to_df()

SAS Connection established. Subprocess id is 28836

No encoding value provided. Will try to determine the correct encoding.
Setting encoding to utf_8 based upon the SAS session encoding value of utf-8.

>>> sas
Access Method         = IOM
SAS Config name       = iomj
SAS Config file       = /net/sanyo.unx.sas.com/vol/vol810/u81/sastpw/sascfg_personal.py
WORK Path             = /sastmp/SAS_work322C000047EE_tom64-5/SAS_workD8CC000047EE_tom64-5/
SAS Version           = 9.04.01M4D11092016
SASPy Version         = 3.5.0
Teach me SAS          = False
Batch                 = False
Results               = Pandas
SAS Session Encoding  = utf-8
Python Encoding value = utf_8
SAS process Pid value = 18414

>>>
>>> cars = sas.sasdata('cars','sashelp')
>>> df = cars.to_df()
>>>
>>> cars9 = sas.df2sd(df, 'cars9', outencoding='latin9')
>>> cars9
Libref  = WORK
Table   = cars9
Dsopts  = {'encoding': 'latin9'}
Results = Pandas

>>>
>>> cars9.head()
    Make           Model   Type Origin DriveTrain   MSRP  Invoice  EngineSize  Cylinders  Horsepower  MPG_City  MPG_Highway  Weight  Wheelbase  Length
0  Acura             MDX    SUV   Asia        All  36945    33337         3.5          6         265        17           23    4451        106     189
1  Acura  RSX Type S 2dr  Sedan   Asia      Front  23820    21761         2.0          4         200        24           31    2778        101     172
2  Acura         TSX 4dr  Sedan   Asia      Front  26990    24647         2.4          4         200        22           29    3230        105     183
3  Acura          TL 4dr  Sedan   Asia      Front  33195    30299         3.2          6         270        20           28    3575        108     186
4  Acura      3.5 RL 4dr  Sedan   Asia      Front  43755    39014         3.5          6         225        18           24    3880        115     197
>>> 
>>> cars9.add_vars({'tom':'"hi tom"'})
54                                                         The SAS System                          14:47 Tuesday, September 15, 2020

809
810        data WORK.'cars9'n (encoding="latin9" ); set WORK.'cars9'n (encoding="latin9" );
NOTE: Data file WORK.CARS9.DATA is in a format that is native to another host, or the file encoding does not match the session
      encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
811        tom = "hi tom";
812        ; run;

NOTE: Data file WORK.CARS9.DATA is in a format that is native to another host, or the file encoding does not match the session
      encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
NOTE: There were 428 observations read from the data set WORK.CARS9.
NOTE: The data set WORK.CARS9 has 428 observations and 16 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.04 seconds

813
814
815
55                                                         The SAS System                          14:47 Tuesday, September 15, 2020

816
>>> cars9
Libref  = WORK
Table   = cars9
Dsopts  = {'encoding': 'latin9'}
Results = Pandas

>>> cars9.to_df()
           Make                             Model    Type  Origin DriveTrain   MSRP  Invoice  EngineSize  Cylinders  Horsepower  MPG_City  MPG_Highway  Weight  Wheelbase  Length     tom
0         Acura                               MDX     SUV    Asia        All  36945    33337         3.5        6.0         265        17           23    4451        106     189  hi tom
1         Acura                    RSX Type S 2dr   Sedan    Asia      Front  23820    21761         2.0        4.0         200        24           31    2778        101     172  hi tom
2         Acura                           TSX 4dr   Sedan    Asia      Front  26990    24647         2.4        4.0         200        22           29    3230        105     183  hi tom
3         Acura                            TL 4dr   Sedan    Asia      Front  33195    30299         3.2        6.0         270        20           28    3575        108     186  hi tom
4         Acura                        3.5 RL 4dr   Sedan    Asia      Front  43755    39014         3.5        6.0         225        18           24    3880        115     197  hi tom
5         Acura           3.5 RL w/Navigation 4dr   Sedan    Asia      Front  46100    41100         3.5        6.0         225        18           24    3893        115     197  hi tom
6         Acura            NSX coupe 2dr manual S  Sports    Asia       Rear  89765    79978         3.2        6.0         290        17           24    3153        100     174  hi tom
7          Audi                       A4 1.8T 4dr   Sedan  Europe      Front  25940    23508         1.8        4.0         170        22           31    3252        104     179  hi tom
8          Audi            A41.8T convertible 2dr   Sedan  Europe      Front  35940    32506         1.8        4.0         170        23           30    3638        105     180  hi tom
9          Audi                        A4 3.0 4dr   Sedan  Europe      Front  31840    28846         3.0        6.0         220        20           28    3462        104     179  hi tom
10         Audi         A4 3.0 Quattro 4dr manual   Sedan  Europe        All  33430    30366         3.0        6.0         220        17           26    3583        104     179  hi tom
11         Audi           A4 3.0 Quattro 4dr auto   Sedan  Europe        All  34480    31388         3.0        6.0         220        18           25    3627        104     179  hi tom
12         Audi                        A6 3.0 4dr   Sedan  Europe      Front  36640    33129         3.0        6.0         220        20           27    3561        109     192  hi tom
13         Audi                A6 3.0 Quattro 4dr   Sedan  Europe        All  39640    35992         3.0        6.0         220        18           25    3880        109     192  hi tom
14         Audi            A4 3.0 convertible 2dr   Sedan  Europe      Front  42490    38325         3.0        6.0         220        20           27    3814        105     180  hi tom
15         Audi    A4 3.0 Quattro convertible 2dr   Sedan  Europe        All  44240    40075         3.0        6.0         220        18           25    4013        105     180  hi tom
16         Audi          A6 2.7 Turbo Quattro 4dr   Sedan  Europe        All  42840    38840         2.7        6.0         250        18           25    3836        109     192  hi tom
17         Audi                A6 4.2 Quattro 4dr   Sedan  Europe        All  49690    44936         4.2        8.0         300        17           24    4024        109     193  hi tom
18         Audi                  A8 L Quattro 4dr   Sedan  Europe        All  69190    64740         4.2        8.0         330        17           24    4399        121     204  hi tom
19         Audi                    S4 Quattro 4dr   Sedan  Europe        All  48040    43556         4.2        8.0         340        14           20    3825        104     179  hi tom
20         Audi                          RS 6 4dr  Sports  Europe      Front  84600    76417         4.2        8.0         450        15           22    4024        109     191  hi tom
21         Audi    TT 1.8 convertible 2dr (coupe)  Sports  Europe      Front  35940    32512         1.8        4.0         180        20           28    3131         95     159  hi tom
22         Audi  TT 1.8 Quattro 2dr (convertible)  Sports  Europe        All  37390    33891         1.8        4.0         225        20           28    2921         96     159  hi tom
23         Audi    TT 3.2 coupe 2dr (convertible)  Sports  Europe        All  40590    36739         3.2        6.0         250        21           29    3351         96     159  hi tom
24         Audi              A6 3.0 Avant Quattro   Wagon  Europe        All  40840    37060         3.0        6.0         220        18           25    4035        109     192  hi tom
25         Audi                  S4 Avant Quattro   Wagon  Europe        All  49090    44446         4.2        8.0         340        15           21    3936        104     179  hi tom
26          BMW                           X3 3.0i     SUV  Europe        All  37000    33873         3.0        6.0         225        16           23    4023        110     180  hi tom
27          BMW                           X5 4.4i     SUV  Europe        All  52195    47720         4.4        8.0         325        16           22    4824        111     184  hi tom
28          BMW                          325i 4dr   Sedan  Europe       Rear  28495    26155         2.5        6.0         184        20           29    3219        107     176  hi tom
29          BMW                         325Ci 2dr   Sedan  Europe       Rear  30795    28245         2.5        6.0         184        20           29    3197        107     177  hi tom
..          ...                               ...     ...     ...        ...    ...      ...         ...        ...         ...       ...          ...     ...        ...     ...     ...
398      Toyota             Tundra Regular Cab V6   Truck    Asia       Rear  16495    14978         3.4        6.0         190        16           18    3925        128     218  hi tom
399      Toyota          Tundra Access Cab V6 SR5   Truck    Asia        All  25935    23520         3.4        6.0         190        14           17    4435        128     218  hi tom
400      Toyota                         Matrix XR   Wagon    Asia      Front  16695    15156         1.8        4.0         130        29           36    2679        102     171  hi tom
401  Volkswagen                        Touareg V6     SUV  Europe        All  35515    32243         3.2        6.0         220        15           20    5086        112     187  hi tom
402  Volkswagen                      Golf GLS 4dr   Sedan  Europe      Front  18715    17478         2.0        4.0         115        24           31    2897         99     165  hi tom
403  Volkswagen                GTI 1.8T 2dr hatch   Sedan  Europe      Front  19825    18109         1.8        4.0         180        24           31    2934         99     168  hi tom
404  Volkswagen                 Jetta GLS TDI 4dr   Sedan  Europe      Front  21055    19638         1.9        4.0         100        38           46    3003         99     172  hi tom
405  Volkswagen           New Beetle GLS 1.8T 2dr   Sedan  Europe      Front  21055    19638         1.8        4.0         150        24           31    2820         99     161  hi tom
406  Volkswagen                 Jetta GLI VR6 4dr   Sedan  Europe      Front  23785    21686         2.8        6.0         200        21           30    3179         99     172  hi tom
407  Volkswagen    New Beetle GLS convertible 2dr   Sedan  Europe      Front  23215    21689         2.0        4.0         115        24           30    3082         99     161  hi tom
408  Volkswagen                    Passat GLS 4dr   Sedan  Europe      Front  23955    21898         1.8        4.0         170        22           31    3241        106     185  hi tom
409  Volkswagen         Passat GLX V6 4MOTION 4dr   Sedan  Europe      Front  33180    30583         2.8        6.0         190        19           26    3721        106     185  hi tom
410  Volkswagen             Passat W8 4MOTION 4dr   Sedan  Europe      Front  39235    36052         4.0        8.0         270        18           25    3953        106     185  hi tom
411  Volkswagen                       Phaeton 4dr   Sedan  Europe      Front  65000    59912         4.2        8.0         335        16           22    5194        118     204  hi tom
412  Volkswagen                   Phaeton W12 4dr   Sedan  Europe      Front  75000    69130         6.0       12.0         420        12           19    5399        118     204  hi tom
413  Volkswagen                          Jetta GL   Wagon  Europe      Front  19005    17427         2.0        4.0         115        24           30    3034         99     174  hi tom
414  Volkswagen                   Passat GLS 1.8T   Wagon  Europe      Front  24955    22801         1.8        4.0         170        22           31    3338        106     184  hi tom
415  Volkswagen                         Passat W8   Wagon  Europe      Front  40235    36956         4.0        8.0         270        18           25    4067        106     184  hi tom
416       Volvo                           XC90 T6     SUV  Europe        All  41250    38851         2.9        6.0         268        15           20    4638        113     189  hi tom
417       Volvo                           S40 4dr   Sedan  Europe      Front  25135    23701         1.9        4.0         170        22           29    2767        101     178  hi tom
418       Volvo                       S60 2.5 4dr   Sedan  Europe        All  31745    29916         2.5        5.0         208        20           27    3903        107     180  hi tom
419       Volvo                        S60 T5 4dr   Sedan  Europe      Front  34845    32902         2.3        5.0         247        20           28    3766        107     180  hi tom
420       Volvo                         S60 R 4dr   Sedan  Europe        All  37560    35382         2.5        5.0         300        18           25    3571        107     181  hi tom
421       Volvo                       S80 2.9 4dr   Sedan  Europe      Front  37730    35542         2.9        6.0         208        20           28    3576        110     190  hi tom
422       Volvo                      S80 2.5T 4dr   Sedan  Europe        All  37885    35688         2.5        5.0         194        20           27    3691        110     190  hi tom
423       Volvo           C70 LPT convertible 2dr   Sedan  Europe      Front  40565    38203         2.4        5.0         197        21           28    3450        105     186  hi tom
424       Volvo           C70 HPT convertible 2dr   Sedan  Europe      Front  42565    40083         2.3        5.0         242        20           26    3450        105     186  hi tom
425       Volvo                        S80 T6 4dr   Sedan  Europe      Front  45210    42573         2.9        6.0         268        19           26    3653        110     190  hi tom
426       Volvo                               V40   Wagon  Europe      Front  26135    24641         1.9        4.0         170        22           29    2822        101     180  hi tom
427       Volvo                              XC70   Wagon  Europe        All  35145    33112         2.5        5.0         208        20           27    3823        109     186  hi tom

[428 rows x 16 columns]
>>>
tomweber-sas commented 4 years ago

oh, and

>>> saspy.__version__
'3.5.0'
>>>
biojerm commented 4 years ago

Got it working now, no idea what I was doing wrong. Sometimes just starting over helps. I'll test it out now.

tomweber-sas commented 4 years ago

cool :)

biojerm commented 4 years ago

Good news and bad news. Using the outencoding parameter seems to work:

(Pdb) cars9 = self.session.df2sd(df, 'cars9',outencoding='latin9')
(Pdb) cars9
Libref  = WORK
Table   = cars9
Dsopts  = {'encoding': 'latin9'}
Results = Pandas

So that is great, I am getting the encoding I expect. However, if I write out the data to a sas7bdat file and check the encoding using interactive SAS

proc contents data=jlabarge.cars_latin9; run;

the summary tells me that the encoding is UTF-8, not latin9. Any suggestions?

                                           The CONTENTS Procedure

                    Data Set Name        JLABARGE.CARS_LATIN9                                     Observations          0
                    Member Type          DATA                                                     Variables             15
                    Engine               V9                                                       Indexes               0
                    Created              09/15/2020 14:03:48                                      Observation Length    152
                    Last Modified        09/15/2020 14:03:48                                      Deleted Observations  0
                    Protection                                                                    Compressed            NO
                    Data Set Type                                                                 Sorted                NO
                    Label
                    Data Representation  SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64
                    Encoding             utf-8  Unicode (UTF-8)

for good measure here are my session details

Access Method         = STDIO
SAS Config name       = sas_u8
SAS Config file       = /scratch/jlabarge/io/lio/sas/lio_sascfg.py
WORK Path             = /tmp/SAS_work0BE6000010A0_statsrv/
SAS Version           = 9.04.01M2P07232014
SASPy Version         = 3.5.0
Teach me SAS          = False
Batch                 = False
Results               = Pandas
SAS Session Encoding  = utf-8
Python Encoding value = utf_8
SAS process Pid value = 4256
tomweber-sas commented 4 years ago

well, what is this libref JLABARGE? The dataset you wrote out in latin9 was work.cars9, but the proc contents was on jlabarge.cars_latin9.

when I did a contents() it correctly showed the encoding (sorry this contents output is garbage; it's pandas output); see the last line

>>> cont = cars9.contents()
>>> cont
{'Variables':         Member  Num     Variable  Type  Len  Pos
0   WORK.CARS9    9    Cylinders   Num    8   24
1   WORK.CARS9    5   DriveTrain  Char    5  144
2   WORK.CARS9    8   EngineSize   Num    8   16
3   WORK.CARS9   10   Horsepower   Num    8   32
4   WORK.CARS9    7      Invoice   Num    8    8
5   WORK.CARS9   15       Length   Num    8   72
6   WORK.CARS9   11     MPG_City   Num    8   40
7   WORK.CARS9   12  MPG_Highway   Num    8   48
8   WORK.CARS9    6         MSRP   Num    8    0
9   WORK.CARS9    1         Make  Char   13   80
10  WORK.CARS9    2        Model  Char   39   93
11  WORK.CARS9    4       Origin  Char    6  138
12  WORK.CARS9    3         Type  Char    6  132
13  WORK.CARS9   13       Weight   Num    8   56
14  WORK.CARS9   14    Wheelbase   Num    8   64
15  WORK.CARS9   16          tom  Char    6  149, 'Enginehost':         Member                      Label1                                            cValue1       nValue1
0   WORK.CARS9          Data Set Page Size                                              65536  6.553600e+04
1   WORK.CARS9    Number of Data Set Pages                                                  2  2.000000e+00
2   WORK.CARS9             First Data Page                                                  1  1.000000e+00
3   WORK.CARS9            Max Obs per Page                                                409  4.090000e+02
4   WORK.CARS9      Obs in First Data Page                                                386  3.860000e+02
5   WORK.CARS9  Number of Data Set Repairs                                                  0  0.000000e+00
6   WORK.CARS9                    Filename  /sastmp/SAS_work322C000047EE_tom64-5/SAS_workC...           NaN
7   WORK.CARS9             Release Created                                           9.0401M4           NaN
8   WORK.CARS9                Host Created                                              Linux           NaN
9   WORK.CARS9                Inode Number                                         2924984313  2.924984e+09
10  WORK.CARS9           Access Permission                                          rw-r--r--           NaN
11  WORK.CARS9                  Owner Name                                             sastpw           NaN
12  WORK.CARS9                   File Size                                              192KB           NaN
13  WORK.CARS9           File Size (bytes)                                             196608  1.966080e+05, 'Attributes':        Member               Label1                                            cValue1       nValue1                Label2 cValue2  nValue2
0  WORK.CARS9        Data Set Name                                         WORK.CARS9           NaN          Observations     428    428.0
1  WORK.CARS9          Member Type                                               DATA           NaN             Variables      16     16.0
2  WORK.CARS9               Engine                                                 V9           NaN               Indexes       0      0.0
3  WORK.CARS9              Created                                09/15/2020 14:12:36  1.915798e+09    Observation Length     160    160.0
4  WORK.CARS9        Last Modified                                09/15/2020 14:12:36  1.915798e+09  Deleted Observations       0      0.0
5  WORK.CARS9           Protection                                                NaN           NaN            Compressed      NO      NaN
6  WORK.CARS9        Data Set Type                                                NaN           NaN                Sorted      NO      NaN
7  WORK.CARS9                Label                                                NaN           NaN                   NaN     NaN      0.0
8  WORK.CARS9  Data Representation  SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LIN...           NaN                   NaN     NaN      0.0
9  WORK.CARS9             Encoding                             latin9  European (ISO)           NaN         
tomweber-sas commented 4 years ago

look at your saslog too; I think you're just not referencing the right data set you created

tomweber-sas commented 4 years ago

this is easier to read:

>>> cars9
Libref  = WORK
Table   = cars9
Dsopts  = {'encoding': 'latin9'}
Results = Pandas
>>>
>>> cars9.set_results('text')
>>> cars9.contents()

                                                           The SAS System                      17:45 Tuesday, September 15, 2020   1

                                                       The CONTENTS Procedure

              Data Set Name        WORK.CARS9                                               Observations          428
              Member Type          DATA                                                     Variables             16
              Engine               V9                                                       Indexes               0
              Created              09/15/2020 17:45:57                                      Observation Length    160
              Last Modified        09/15/2020 17:45:57                                      Deleted Observations  0
              Protection                                                                    Compressed            NO
              Data Set Type                                                                 Sorted                NO
              Label
              Data Representation  SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64
              Encoding             latin9  European (ISO)
biojerm commented 4 years ago

Yeah, it is likely that I screwed something up. I don't fully 'get' libref/table/saslib. Obviously I don't interact with SAS much directly, since these appear to be pretty core concepts 😆

Here is the full set of commands I ran:

cars = self.session.sasdata('cars', 'sashelp')
df = cars.to_df()
df
cars9 = self.session.df2sd(df, 'cars9', outencoding='latin9')
self.session.saslib('cars9', path='/scratch/jlabarge')
cars9 = self.session.df2sd(df, table='cars_latin9', libref='cars9', outencoding='latin9')
tomweber-sas commented 4 years ago

OK, then run this (I changed results to text so you can read the contents output more easily):

cars = self.session.sasdata('cars', 'sashelp', results='text')
cars.contents()

df = cars.to_df()
df

cars9 = self.session.df2sd(df, 'cars9', outencoding='latin9', results='text')  # this creates work.cars9
cars9.contents()

self.session.saslib('cars9', path='/scratch/jlabarge')  # this creates a new libref rather than using work; that's fine

cars9 = self.session.df2sd(df, table='cars_latin9', libref='cars9', outencoding='latin9', results='text')  # this creates cars9.cars_latin9
cars9.contents()
biojerm commented 4 years ago

Thanks. Here is my output:

cars9.contents()

                                                           The SAS System                      14:58 Tuesday, September 15, 2020   5

                                                       The CONTENTS Procedure

              Data Set Name        CARS9.CARS_LATIN9                                        Observations          0
              Member Type          DATA                                                     Variables             15
              Engine               V9                                                       Indexes               0
              Created              09/15/2020 14:03:48                                      Observation Length    152
              Last Modified        09/15/2020 14:03:48                                      Deleted Observations  0
              Protection                                                                    Compressed            NO
              Data Set Type                                                                 Sorted                NO
              Label                                                                                               
              Data Representation  SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64                          
              Encoding             utf-8  Unicode (UTF-8)                                                         

                                                 Engine/Host Dependent Information

                                 Data Set Page Size          65536
                                 Number of Data Set Pages    1
                                 First Data Page             1
                                 Max Obs per Page            430
                                 Obs in First Data Page      0
                                 Number of Data Set Repairs  0
                                 Filename                    /scratch/jlabarge/cars_latin9.sas7bdat
                                 Release Created             9.0401M2
                                 Host Created                Linux
                                 Inode Number                893078406
                                 Access Permission           rw-rw-r--
                                 Owner Name                  jlabarge
                                 File Size (bytes)           131072

It is still showing UTF-8.

tomweber-sas commented 4 years ago

I'm not sure you're running this code. Can you submit this after that last contents?

>>> print(sas.lastlog())
96                                                         The SAS System                          17:45 Tuesday, September 15, 2020

1538
1539       proc contents data=perm.'cars9'n (encoding="latin9" );run;
NOTE: Data file PERM.CARS9.DATA is in a format that is native to another host, or the file encoding does not match the session
      encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.

NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.03 seconds
      cpu time            0.02 seconds

NOTE: The PROCEDURE CONTENTS printed page 2.

1540
tomweber-sas commented 4 years ago

Run that whole set of code and then print(sas.saslog()); then we can see what's really happening.
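
A full session log like the one requested here can be long, so it may help to pull out only the problem lines first. The following is a minimal sketch, not part of saspy: `find_log_problems` is a hypothetical helper that takes the log as a plain string (for example, whatever `saslog()` returned) and keeps the lines starting with `ERROR` or `WARNING`. The sample text is copied from the log later in this thread.

```python
import re

# Matches "ERROR:", "ERROR 180-322:" and "WARNING:" at the start of a log line.
PROBLEM_LINE = re.compile(r"^(ERROR( \d+-\d+)?|WARNING):")


def find_log_problems(log_text):
    """Return (line_index, line) pairs for log lines that report an error or warning."""
    return [
        (index, line)
        for index, line in enumerate(log_text.splitlines())
        if PROBLEM_LINE.match(line)
    ]


# Sample text taken from the log posted in this thread.
sample_log = """94   data 'cars9'n;
95   (encoding="latin9");
     -
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.CARS9 may be incomplete.  When this step was stopped there were 0 observations and 15 variables.
"""

for index, line in find_log_problems(sample_log):
    print(index, line)
```

The line index makes it easy to jump back into the full log and read the statements just before each error, which is usually where the cause is.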

tomweber-sas commented 4 years ago

When I said I don't think you're running this code, I meant the source code with this enhancement in it: the saspy outencoding branch. If we can see the log, at least, it should be apparent.
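
One way to check from Python whether the imported code has a given enhancement is to inspect the method's signature. This is only a sketch under stated assumptions: `accepts_keyword` is a hypothetical helper, and `old_df2sd` and `new_df2sd` are stand-ins invented for illustration, not saspy's real signatures.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    named_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in named_kinds:
        return True
    # A **kwargs parameter accepts any keyword, so it cannot tell versions apart.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for a method before and after the enhancement.
def old_df2sd(df, table="_df", libref=""):
    pass


def new_df2sd(df, table="_df", libref="", outencoding=""):
    pass


print(accepts_keyword(old_df2sd, "outencoding"))
print(accepts_keyword(new_df2sd, "outencoding"))
```

Against a real session, the same helper could be called on the bound method, as in `accepts_keyword(self.session.df2sd, 'outencoding')`. Printing the module's `__file__` attribute shows which installed copy was imported. If the method takes `**kwargs`, this check always returns True and the log is the more reliable evidence.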

biojerm commented 4 years ago
(Pdb) print (sas.saslog())
*** NameError: name 'sas' is not defined
(Pdb) print (self.session.saslog())
1                                                          The SAS System                          15:42 Tuesday, September 15, 2020

NOTE: Unable to open SASUSER.REGSTRY. WORK.REGSTRY will be opened instead.
NOTE: All registry changes will be lost at the end of the session.

WARNING: Unable to copy SASUSER registry to WORK registry. Because of this, you will not see registry customizations during this
         session.
NOTE: Copyright (c) 2002-2012 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M2 MBCS3170)
      Licensed to FRED HUTCHINSON CNCR RES CTR - SCHARP, Site 70230084.
NOTE: This session is executing on the Linux 4.4.104-18.44-default (LIN X64) platform.

NOTE: Updated analytical products:

      SAS/STAT 13.2
      SAS/IML 13.2

NOTE: Additional host information:

 Linux LIN X64 4.4.104-18.44-default #1 SMP Thu Jan 4 08:07:55 UTC 2018 (05a9de6) x86_64 openSUSE 42.2 (x86_64) VERSION = 42.2
      CODENAME = Malachite

NOTE: SAS initialization used:
      real time           0.02 seconds
      cpu time            0.03 seconds

NOTE: AUTOEXEC processing beginning; file is /usr/local/sas-9.4/SASFoundation/9.4/autoexec.sas.

NOTE: AUTOEXEC processing completed.

1    ;*';*";*/;
2    options svgtitle='svgtitle'; options validvarname=any validmemname=extend; ods graphics on;
3
4    ;*';*";*/;%put %upcase(e3969440a681a2408885998500000001);
E3969440A681A2408885998500000001
5    ;*';*";*/;
6    data _null_; length x $ 4096; file STDERR;
7                   x = resolve('%sysfunc(pathname(work))');  put 'WORKPATH=' x 'WORKPATHEND=';
8                   x = resolve('&SYSENCODING');              put 'ENCODING=' x 'ENCODINGEND=';
9                   x = resolve('&SYSVLONG4');                put 'SYSVLONG=' x 'SYSVLONGEND=';
10                  x = resolve('&SYSJOBID');                 put 'SYSJOBID=' x 'SYSJOBIDEND=';
11                  x = resolve('&SYSSCP');                     put 'SYSSCP=' x 'SYSSCPEND=';
12               run;
NOTE: The file STDERR is:
      Pipe command="<standard error>"
WORKPATH=/tmp/SAS_work4F980000AEA2_statsrv WORKPATHEND=
ENCODING=utf-8 ENCODINGEND=
SYSVLONG=9.04.01M2P07232014 SYSVLONGEND=
SYSJOBID=44706 SYSJOBIDEND=
SYSSCP=LIN X64 SYSSCPEND=

NOTE: 5 records were written to the file STDERR.
      The minimum record length was 25.
      The maximum record length was 55.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

13
14
15   ;*';*";*/;%put %upcase(e3969440a681a2408885998500000002);
E3969440A681A2408885998500000002
16   ;*';*";*/;
17   data _null_; file STDERR; put %upcase('col0REG=');
NOTE: The file STDERR is:
      Pipe command="<standard error>"
COL0REG=

NOTE: 1 record was written to the file STDERR.
      The minimum record length was 8.
      The maximum record length was 8.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

18                                  data _null_;
18 !                                             put %upcase('col0LOG=');run;
COL0LOG=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

19
20   ;*';*";*/;%put %upcase(e3969440a681a2408885998500000003);
E3969440A681A2408885998500000003
21   ;*';*";*/;
22   libname outlib    '/scratch/jlabarge'  ;
NOTE: Libref OUTLIB was successfully assigned as follows:
      Engine:        V9
      Physical Name: /scratch/jlabarge
23
24   ;*';*";*/;%put %upcase(e3969440a681a2408885998500000004);
E3969440A681A2408885998500000004
25   ;*';*";*/;
26   data _null_; e = exist("sashelp.'cars'n");
27   v = exist("sashelp.'cars'n", 'VIEW');
28    if e or v then e = 1;
29   put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

30
31   ;*';*";*/;%put %upcase(e3969440a681a2408885998500000005);
E3969440A681A2408885998500000005
32   ;*';*";*/;
33   data _null_; e = exist("sashelp.'cars'n");
34   v = exist("sashelp.'cars'n", 'VIEW');
35    if e or v then e = 1;
36   put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

37
38   ;*';*";*/;%put %upcase(e3969440a681a2408885998500000006);
E3969440A681A2408885998500000006
39   ;*';*";*/;
40   data _null_; e = exist("sashelp.'cars'n");
41   v = exist("sashelp.'cars'n", 'VIEW');
42    if e or v then e = 1;
43   put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

44
45   ;*';*";*/;%put %upcase(e3969440a681a2408885998500000007);
E3969440A681A2408885998500000007
46   ;*';*";*/;
47   data sasdata2dataframe / view=sasdata2dataframe; set sashelp.'cars'n ;run;
NOTE: DATA STEP view saved on file WORK.SASDATA2DATAFRAME.
NOTE: A stored DATA STEP view cannot run under a different operating system.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

48   data _null_; file STDERR;d = open('sasdata2dataframe');
49   lrecl = attrn(d, 'LRECL'); nvars = attrn(d, 'NVARS');
50   lr='LRECL='; vn='VARNUMS='; vl='VARLIST='; vt='VARTYPE=';
51   put lr lrecl; put vn nvars; put vl;
52   do i = 1 to nvars; var = varname(d, i); put var; end;
53   put vt;
54   do i = 1 to nvars; var = vartype(d, i); put var; end;
55   run;
NOTE: The file STDERR is:
      Pipe command="<standard error>"
LRECL= 152
VARNUMS= 15
VARLIST=
Make
Model
Type
Origin
DriveTrain
MSRP
Invoice
EngineSize
Cylinders
Horsepower
MPG_City
MPG_Highway
Weight
Wheelbase
Length
VARTYPE=
C
C
C
C
C
N
N
N
N
N
N
N
N
N
N

NOTE: 34 records were written to the file STDERR.
      The minimum record length was 1.
      The maximum record length was 11.
NOTE: View WORK.SASDATA2DATAFRAME.VIEW used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

56
57   ;*';*";*/;%put %upcase(e3969440a681a2408885998500000008);
E3969440A681A2408885998500000008
58   ;*';*";*/;
59   data work._n_u_l_l_;output;run;
NOTE: The data set WORK._N_U_L_L_ has 1 observations and 0 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

60   data _null_; file STDERR; set work._n_u_l_l_ sashelp.'cars'n (obs=0 );put 'FMT_CATS=';
61   _tom = vformatn('Make'n);put _tom;
62   _tom = vformatn('Model'n);put _tom;
63   _tom = vformatn('Type'n);put _tom;
64   _tom = vformatn('Origin'n);put _tom;
65   _tom = vformatn('DriveTrain'n);put _tom;
66   _tom = vformatn('MSRP'n);put _tom;
67   _tom = vformatn('Invoice'n);put _tom;
68   _tom = vformatn('EngineSize'n);put _tom;
69   _tom = vformatn('Cylinders'n);put _tom;
70   _tom = vformatn('Horsepower'n);put _tom;
71   _tom = vformatn('MPG_City'n);put _tom;
72   _tom = vformatn('MPG_Highway'n);put _tom;
73   _tom = vformatn('Weight'n);put _tom;
74   _tom = vformatn('Wheelbase'n);put _tom;
75   _tom = vformatn('Length'n);put _tom;
76   run;
NOTE: The file STDERR is:
      Pipe command="<standard error>"
FMT_CATS=
$
$
$
$
$
DOLLAR
DOLLAR
BEST
BEST
BEST
BEST
BEST
BEST
BEST
BEST

NOTE: 16 records were written to the file STDERR.
      The minimum record length was 1.
      The maximum record length was 9.
NOTE: There were 1 observations read from the data set WORK._N_U_L_L_.
NOTE: There were 0 observations read from the data set SASHELP.CARS.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

77   proc delete data=work._n_u_l_l_;run;
NOTE: Deleting WORK._N_U_L_L_ (memtype=DATA).
NOTE: PROCEDURE DELETE used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

78
79   ;*';*";*/;%put %upcase(e3969440a681a2408885998500000009);
E3969440A681A2408885998500000009
80   filename sock socket ':37831' recfm=S  lrecl=4096;
81   data _null_; set sashelp.'cars'n ;
82   format 'MSRP'n best32.; format 'Invoice'n best32.; format 'EngineSize'n best32.; format 'Cylinders'n best32.; format
82 ! 'Horsepower'n best32.; format 'MPG_City'n best32.;
83   format 'MPG_Highway'n best32.; format 'Weight'n best32.; format 'Wheelbase'n best32.; format 'Length'n best32.;
84   file sock;
85   'Make'n = translate('Make'n, '2020'x, '0102'x);
86   'Model'n = translate('Model'n, '2020'x, '0102'x); 'Type'n = translate('Type'n, '2020'x, '0102'x); 'Origin'n =
86 ! translate('Origin'n, '2020'x, '0102'x); 'DriveTrain'n = translate('DriveTrain'n, '2020'x, '0102'x);
87   put 'Make'n '02'x;
88   put 'Model'n '02'x; put 'Type'n '02'x; put 'Origin'n '02'x; put 'DriveTrain'n '02'x; put 'MSRP'n '02'x; put 'Invoice'n '02'x;
88 ! put 'EngineSize'n '02'x; put 'Cylinders'n '02'x; put 'Horsepower'n '02'x; put 'MPG_City'n '02'x;
89   put 'MPG_Highway'n '02'x; put 'Weight'n '02'x; put 'Wheelbase'n '02'x; put 'Length'n '01'x; run;
NOTE: The file SOCK is:
      Local Host Name=statsrv,
      Local Host IP addr=127.0.1.1,
      Peer Hostname Name=statsrv.pc.scharp.org,
      Peer IP addr=127.0.1.1,Peer Name=N/A,
      Peer Portno=37831,Lrecl=4096,Recfm=Stream

NOTE: 6420 records were written to the file SOCK.
      The minimum record length was 3.
      The maximum record length was 41.
NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.05 seconds

90   ;*';*";*/;
91
92
93   ;*';*";*/;%put %upcase(e3969440a681a2408885998500000010);
E3969440A681A2408885998500000010
94   data 'cars9'n;
95   (encoding="latin9");
     -
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.

96   length 'Make'n $13 'Model'n $39 'Type'n $6 'Origin'n $6 'DriveTrain'n $5 'MSRP'n 8 'Invoice'n 8 'EngineSize'n 8 'Cylinders'n 8
96 ! 'Horsepower'n 8 'MPG_City'n 8 'MPG_Highway'n 8 'Weight'n 8 'Wheelbase'n 8 'Length'n 8;
97   infile datalines delimiter='03'x  STOPOVER;
98   input @;
99   if _infile_ = '' then delete;
100  input 'Make'n 'Model'n 'Type'n 'Origin'n 'DriveTrain'n 'MSRP'n 'Invoice'n 'EngineSize'n 'Cylinders'n 'Horsepower'n 'MPG_City'n
100! 'MPG_Highway'n 'Weight'n 'Wheelbase'n 'Length'n ;
101   'Make'n = translate('Make'n, '0A'x, '01'x);
102   'Make'n = translate('Make'n, '0D'x, '02'x );
103   'Model'n = translate('Model'n, '0A'x, '01'x);
104   'Model'n = translate('Model'n, '0D'x, '02'x );
105   'Type'n = translate('Type'n, '0A'x, '01'x);
106   'Type'n = translate('Type'n, '0D'x, '02'x );
107   'Origin'n = translate('Origin'n, '0A'x, '01'x);
108   'Origin'n = translate('Origin'n, '0D'x, '02'x );
109   'DriveTrain'n = translate('DriveTrain'n, '0A'x, '01'x);
110   'DriveTrain'n = translate('DriveTrain'n, '0D'x, '02'x );
111  ;
112  datalines4;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.CARS9 may be incomplete.  When this step was stopped there were 0 observations and 15 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

541  ;;;;
542  ;*';*";*/;
543  run;
544
545  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000011);
E3969440A681A2408885998500000011
546  ;*';*";*/;
547  data _null_; e = exist("'cars9'n");
548  v = exist("'cars9'n", 'VIEW');
549   if e or v then e = 1;
550  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

551
552  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000012);
E3969440A681A2408885998500000012
553  ;*';*";*/;
554  data _null_; e = exist("user.'cars9'n");
555  v = exist("user.'cars9'n", 'VIEW');
556   if e or v then e = 1;
557  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=0 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

558
559  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000013);
E3969440A681A2408885998500000013
560  ;*';*";*/;
561  libname cars9    '/scratch/jlabarge'  ;
NOTE: Libref CARS9 refers to the same physical library as OUTLIB.
NOTE: Libref CARS9 was successfully assigned as follows:
      Engine:        V9
      Physical Name: /scratch/jlabarge
562
563  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000014);
E3969440A681A2408885998500000014
564  ;*';*";*/;
565
566          data _null_; retain libref; retain cobs 1;
567             set sashelp.vlibnam end=last;
568             if cobs EQ 1 then
569                put "LIBREFSSTART=";
570             cobs = 2;
571             if libref NE libname then
572                put  %upcase("lib=") libname  %upcase('libEND=');
573             libref = libname;
574             if last then
575                put "LIBREFSEND=";
576          run;
LIBREFSSTART=
LIB=OUTLIB LIBEND=
LIB=CARS9 LIBEND=
LIB=SASHELP LIBEND=
LIB=MAPS LIBEND=
LIB=MAPSSAS LIBEND=
LIB=MAPSGFK LIBEND=
LIB=SASUSER LIBEND=
LIB=WORK LIBEND=
LIBREFSEND=
NOTE: There were 55 observations read from the data set SASHELP.VLIBNAM.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

577
578
579  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000015);
E3969440A681A2408885998500000015
580  data cars9.'cars_latin9'n;
581  (encoding="latin9");
     -
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.

582  length 'Make'n $13 'Model'n $39 'Type'n $6 'Origin'n $6 'DriveTrain'n $5 'MSRP'n 8 'Invoice'n 8 'EngineSize'n 8 'Cylinders'n 8
582! 'Horsepower'n 8 'MPG_City'n 8 'MPG_Highway'n 8 'Weight'n 8 'Wheelbase'n 8 'Length'n 8;
583  infile datalines delimiter='03'x  STOPOVER;
584  input @;
585  if _infile_ = '' then delete;
586  input 'Make'n 'Model'n 'Type'n 'Origin'n 'DriveTrain'n 'MSRP'n 'Invoice'n 'EngineSize'n 'Cylinders'n 'Horsepower'n 'MPG_City'n
586! 'MPG_Highway'n 'Weight'n 'Wheelbase'n 'Length'n ;
587   'Make'n = translate('Make'n, '0A'x, '01'x);
588   'Make'n = translate('Make'n, '0D'x, '02'x );
589   'Model'n = translate('Model'n, '0A'x, '01'x);
590   'Model'n = translate('Model'n, '0D'x, '02'x );
591   'Type'n = translate('Type'n, '0A'x, '01'x);
592   'Type'n = translate('Type'n, '0D'x, '02'x );
593   'Origin'n = translate('Origin'n, '0A'x, '01'x);
594   'Origin'n = translate('Origin'n, '0D'x, '02'x );
595   'DriveTrain'n = translate('DriveTrain'n, '0A'x, '01'x);
596   'DriveTrain'n = translate('DriveTrain'n, '0D'x, '02'x );
597  ;
598  datalines4;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set CARS9.CARS_LATIN9 may be incomplete.  When this step was stopped there were 0 observations and 15 variables.
WARNING: Data set CARS9.CARS_LATIN9 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds

1027  ;;;;
1028  ;*';*";*/;
1029  run;
1030
1031  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000016);
E3969440A681A2408885998500000016
1032  ;*';*";*/;
1033  data _null_; e = exist("cars9.'cars_latin9'n");
1034  v = exist("cars9.'cars_latin9'n", 'VIEW');
1035   if e or v then e = 1;
1036  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

1037
1038  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000017);
E3969440A681A2408885998500000017
1039  ;*';*";*/;
1040  data _null_; e = exist("cars9.'cars_latin9'n");
1041  v = exist("cars9.'cars_latin9'n", 'VIEW');
1042   if e or v then e = 1;
1043  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

1044
1045  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000018);
E3969440A681A2408885998500000018
1046  ;*';*";*/;
1047  proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
      session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
NOTE: The PROCEDURE CONTENTS printed page 1.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

1048
1049  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000019);
E3969440A681A2408885998500000019
1050  ;*';*";*/;
1051  data _null_; e = exist("cars9.'cars_latin9'n");
1052  v = exist("cars9.'cars_latin9'n", 'VIEW');
1053   if e or v then e = 1;
1054  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

1055
1056  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000020);
E3969440A681A2408885998500000020
1057  ;*';*";*/;
1058  proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
      session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
NOTE: The PROCEDURE CONTENTS printed page 2.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

1059
1060  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000021);
E3969440A681A2408885998500000021
1061  ;*';*";*/;
1062  data _null_; e = exist("cars9.'cars_latin9'n");
1063  v = exist("cars9.'cars_latin9'n", 'VIEW');
1064   if e or v then e = 1;
1065  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

1066
1067  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000022);
E3969440A681A2408885998500000022
1068  ;*';*";*/;
1069  proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
      session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
NOTE: The PROCEDURE CONTENTS printed page 3.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

1070
1071  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000023);
E3969440A681A2408885998500000023
1072  ;*';*";*/;
1073  data _null_; e = exist("cars9.'cars_latin9'n");
1074  v = exist("cars9.'cars_latin9'n", 'VIEW');
1075   if e or v then e = 1;
1076  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

1077
1078  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000024);
E3969440A681A2408885998500000024
1079  ;*';*";*/;
1080  proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
      session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
NOTE: The PROCEDURE CONTENTS printed page 4.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.01 seconds
      cpu time            0.02 seconds

1081
1082  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000025);
E3969440A681A2408885998500000025
1083  ;*';*";*/;
1084  data _null_; e = exist("cars9.'cars_latin9'n");
1085  v = exist("cars9.'cars_latin9'n", 'VIEW');
1086   if e or v then e = 1;
1087  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

1088
1089  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000026);
E3969440A681A2408885998500000026
1090  ;*';*";*/;
1091  proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
      session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
NOTE: The PROCEDURE CONTENTS printed page 5.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

1092
1093  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000027);
E3969440A681A2408885998500000027
1094  ;*';*";*/;
1095  data _null_; e = exist("cars9.'cars_latin9'n");
1096  v = exist("cars9.'cars_latin9'n", 'VIEW');
1097   if e or v then e = 1;
1098  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

1099
1100  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000028);
E3969440A681A2408885998500000028
1101  ;*';*";*/;
1102  proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
      session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
NOTE: The PROCEDURE CONTENTS printed page 6.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

1103
1104  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000029);
E3969440A681A2408885998500000029
1105  ;*';*";*/;
1106  data _null_; e = exist("cars9.'cars_latin9'n");
1107  v = exist("cars9.'cars_latin9'n", 'VIEW');
1108   if e or v then e = 1;
1109  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

1110
1111  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000030);
E3969440A681A2408885998500000030
1112  ;*';*";*/;
1113  proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
      session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
NOTE: The PROCEDURE CONTENTS printed page 7.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

1114
1115  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000031);
E3969440A681A2408885998500000031
1116  ;*';*";*/;
1117  data _null_; e = exist("cars9.'cars_latin9'n");
1118  v = exist("cars9.'cars_latin9'n", 'VIEW');
1119   if e or v then e = 1;
1120  put 'TABLE_EXISTS=' e 'TAB_EXTEND=';run;
TABLE_EXISTS=1 TAB_EXTEND=
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

1121
1122  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000032);
E3969440A681A2408885998500000032
1123  ;*';*";*/;
1124  proc contents data=cars9.'cars_latin9'n (encoding="latin9" );run;
NOTE: Data file CARS9.CARS_LATIN9.DATA is in a format that is native to another host, or the file encoding does not match the
      session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce
      performance.
NOTE: The PROCEDURE CONTENTS printed page 8.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

1125
1126  ;*';*";*/;%put %upcase(e3969440a681a2408885998500000033);
E3969440A681A2408885998500000033
tomweber-sas commented 4 years ago

Alright! You found a bug. First, you are running the new code, so that's good. I missed one little thing in the stdio access method; I was running IOM. I forgot to delete the ';\n' in the line before the new code that adds the option. That's why you have the error above (the fix is pushed; pull this change and you should be good to go!):

94   data 'cars9'n;
95   (encoding="latin9");
     -
     180
ERROR 180-322: Statement is not valid or it is used out of proper order.

diff --git a/saspy/sasiostdio.py b/saspy/sasiostdio.py
index fdb98e2..4205a30 100644
--- a/saspy/sasiostdio.py
+++ b/saspy/sasiostdio.py
@@ -1506,7 +1506,7 @@ Will use HTML5 for this SASsession.""")
       code = "data "
       if len(libref):
          code += libref+"."
-      code += "'"+table.strip()+"'n;\n"
+      code += "'"+table.strip()+"'n"
       if len(outencoding):
          code += '(encoding="'+outencoding+'");\n'
       else:
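
The effect of this one-line change can be seen in isolation with a small sketch. It mirrors the string building shown in the diff but is not the saspy source: `build_data_statement` is a hypothetical wrapper, and the body of the `else` branch is an assumption, since the diff excerpt ends before it.

```python
def build_data_statement(table, libref="", outencoding=""):
    """Build the opening DATA statement with the data set option before the semicolon."""
    code = "data "
    if len(libref):
        code += libref + "."
    # Fixed line: no ';\n' here, so an option can still follow the table name.
    code += "'" + table.strip() + "'n"
    if len(outencoding):
        code += '(encoding="' + outencoding + '");\n'
    else:
        code += ";\n"  # assumed: this branch is not shown in the diff excerpt
    return code


print(build_data_statement("cars_latin9", "cars9", "latin9"), end="")
print(build_data_statement("cars9"), end="")
```

With the old line, the semicolon closed the DATA statement first, so the `(encoding=...)` text was submitted as a separate statement. That is what SAS rejected with ERROR 180-322.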
biojerm commented 4 years ago

excellent I will give it a try.

biojerm commented 4 years ago

Works like a charm! Thanks, Tom. As always, your help, patience, and responsiveness are greatly appreciated.


                                                   The SAS System                            16:18 Tuesday, September 15, 2020   1

                                                            The CONTENTS Procedure

                    Data Set Name        JLABARGE.CARS3_LATIN9                                    Observations          428
                    Member Type          DATA                                                     Variables             15
                    Engine               V9                                                       Indexes               0
                    Created              09/15/2020 16:18:23                                      Observation Length    152
                    Last Modified        09/15/2020 16:18:23                                      Deleted Observations  0
                    Protection                                                                    Compressed            NO
                    Data Set Type                                                                 Sorted                NO
                    Label
                    Data Representation  SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64
                    Encoding             latin9  European (ISO)
biojerm commented 4 years ago

I am happy and OK with closing the issue. Do you know when you might merge this into main?

tomweber-sas commented 4 years ago

Sweet! Yes, I've run regressions and, now, also the new code with all 3 access methods. I'll merge it in tomorrow. I can build a new release too, as I expect you really want it in a pypi release, not just in the repo. I have a few other things at main that can go into a new release. After I do that tomorrow, I'll post back here and we can close this then. Have a great evening! More hockey to watch, haha😎

tomweber-sas commented 4 years ago

Ok, Jeremy, this is merged, pushed, built and out on PyPI as V3.5.1, the current production version. I'll close this now; let me know when you need something else!

Thanks, Tom

AnandReddy23 commented 2 years ago

Hi Tom, I am working with the saspy package and encountered the error below. Could you help me with a resolution, please? My underlying SAS data set is in spds format.

Code:

import saspy
import pandas as pd
from datetime import datetime
sas = saspy.SASsession()
sas.saslib('SS','SPDE','hdfs path which stores spds data sets','hdfshost=default')
db=sas.sasdata('table','SS').to_df()

Error:

'utf-8' codec can't decode byte 0xc3 in position 254179: invalid continuation byte
sasdata2dataframe was interupted. Trying to return the saslog instead of a data frame.

sas
Access Method = SSH
SAS Config name = ssh
SAS Config file = /python3.6/site-packages/saspy/sascfg.py
WORK Path = <>
SAS Version = 9.04.01M7P08052020
SASPy Version = 3.7.2
Teach me SAS = False
Batch = False
Results = Pandas
SAS Session Encoding = utf-8
Python Encoding value = utf-8
SAS process Pid value = 1101
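
For readers who hit the same message, here is a minimal sketch of what "invalid continuation byte" means in plain Python, independent of saspy. It does not diagnose why the data in this report contained such bytes. The sample bytes are made up for illustration, and `try_decode` is a hypothetical helper.

```python
def try_decode(raw, encoding="utf-8"):
    """Decode bytes, returning (text, None) on success or (None, message) on failure."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


good = "é".encode("utf-8")   # b'\xc3\xa9': lead byte 0xc3 plus one continuation byte
broken = good[:1] + b"A"     # lead byte 0xc3 followed by a byte that cannot continue it

print(try_decode(good))
print(try_decode(broken))

# errors="replace" substitutes U+FFFD for the undecodable byte instead of raising.
print(broken.decode("utf-8", errors="replace"))
```

In UTF-8, 0xc3 announces a two-byte character, so the decoder raises when the next byte is not a valid second half. Bytes like that can arise when a multi-byte character is cut off or when the data is not actually UTF-8.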

tomweber-sas commented 2 years ago

Hey @AnandReddy23, this is an old closed issue. Can you open a new issue for this problem you're having? Thanks! Tom