sassoftware / saspy

A Python interface module to the SAS System. It works with Linux, Windows, and Mainframe SAS as well as with SAS in Viya.
https://sassoftware.github.io/saspy

where can I find pid when I start my ? #144

Closed PistonIntJack closed 5 years ago

PistonIntJack commented 6 years ago

Hello, I have two questions. Thank you.

  1. When I start my saspy.SASsession(), I find that my computer starts two processes (Java and SAS). I can find the Java pid, but I cannot find the SAS pid. Where can I find it? I need the SAS pid to do other things.

Code for the Java pid:

sas = saspy.SASsession()
java_pid = sas._io.pid.pid

  2. I have two solid-state drives and want to start two saspy.SASsession() instances with two cfg files. Can I change which sasv9.cfg is used when I start my saspy.SASsession()?

My old code:

sas.exe -CONFIG sasv9.cfg -SYSIN test.sas
sas.exe -CONFIG sasv91.cfg -SYSIN test1.sas

tomweber-sas commented 6 years ago

@PistonIntJack 1) I don't have the pid for the SAS process in saspy. I start the Java process, so I have that pid, which is the one you see. But the Java IOM client code starts the SAS process, and I don't have access to the SAS pid.

2) You're just trying to start 2 sessions with different config files, concurrently? You can do that by having 2 different configuration definitions in your sascfg_personal.py file and specifying each on the 2 SASsession(cfgname='...') calls. The trick with the IOM access method is that you have to use the javaparms key in your configuration definitions to specify the whole startup command and parameters. IOM uses the Windows registry to find the command and parameters to start the local session. You can look at that registry key to see what it's using. The key is:

HKEY_CLASSES_ROOT\CLSID\{440196D4-90F0-11D0-9F41-00A024BB830C}\LocalServer32

Then you can code it up yourself in the javaparms key, specifying your different config files like below, and start each session with its own config. The following shows 2 config definitions from my config file, and the SASsession() calls that start each of them with the different SAS config files.

cfg1     = {'java'      : 'java',
            'encoding'  : 'cp1252',
            'classpath' : cpl,
            'javaparms' : ['-Dcom.sas.iom.orb.brg.zeroConfigWorkspaceServer.sascmd=C:\\PROGRA~1\\SASHome\\SASFOU~1\\9.4\\SAS.EXE -config "C:\\PROGRA~1\\SASHome\\SASFOU~1\\9.4\\sasv9.cfg" -objectserver -nologo -noterminal -noprngetlist']
            }

cfg2     = {'java'      : 'java',
            'encoding'  : 'cp1252',
            'classpath' : cpl,
            'javaparms' : ['-Dcom.sas.iom.orb.brg.zeroConfigWorkspaceServer.sascmd=C:\\PROGRA~1\\SASHome\\SASFOU~1\\9.4\\SAS.EXE -config "C:\\Users\\sastpw\\sasv9V2.cfg" -objectserver -nologo -noterminal -noprngetlist']
            }

sas1 = saspy.SASsession(cfgname='cfg1')
sas2 = saspy.SASsession(cfgname='cfg2')
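
If you have more than two SAS config files, the definitions above can be generated instead of typed out. The following is only a sketch: sascmd_javaparm and make_cfg are hypothetical helper names (not part of saspy), and the classpath value is a placeholder for the real classpath string from your own config file.

```python
def sascmd_javaparm(sas_exe, config_file):
    """Build the -D option that carries the command used to launch local SAS.

    The config path is wrapped in double quotes, as in the definitions above.
    """
    return (
        '-Dcom.sas.iom.orb.brg.zeroConfigWorkspaceServer.sascmd='
        f'{sas_exe} -config "{config_file}" '
        '-objectserver -nologo -noterminal -noprngetlist'
    )


def make_cfg(sas_exe, config_file, classpath, encoding='cp1252'):
    """Return a configuration definition dict in the same shape as cfg1/cfg2."""
    return {
        'java': 'java',
        'encoding': encoding,
        'classpath': classpath,
        'javaparms': [sascmd_javaparm(sas_exe, config_file)],
    }


# Raw strings keep the Windows backslashes literal.
SAS_EXE = r'C:\PROGRA~1\SASHome\SASFOU~1\9.4\SAS.EXE'
cpl = 'placeholder-classpath'  # assumption: stands in for the real classpath

cfg1 = make_cfg(SAS_EXE, r'C:\PROGRA~1\SASHome\SASFOU~1\9.4\sasv9.cfg', cpl)
cfg2 = make_cfg(SAS_EXE, r'C:\Users\sastpw\sasv9V2.cfg', cpl)

print(cfg2['javaparms'][0])
```

The generated dicts would go in sascfg_personal.py just like hand-written ones.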

Tom

PistonIntJack commented 6 years ago

@tomweber-sas, thank you, Tom. Sorry, my email did not notify me of your answer. Thank you very much.

PistonIntJack commented 6 years ago

@tomweber-sas, hi, I found two errors.

  1. After I modify my config files, starting my SASsession raises an error.

cfg1 = {'java'      : 'java',
        'encoding'  : 'utf8',
        'classpath' : cpW,
        'javaparms' : ['-Dcom.sas.iom.orb.brg.zeroConfigWorkspaceServer.sascmd="C:\\Program Files\\SASHome\\SASFoundation\\9.4\\SAS.EXE" -config "C:\\Program Files\\SASHome\\SASFoundation\\9.4\\nls\\u8\\sasv9.cfg" -objectserver -nologo -noterminal -noprngetlist']
        }

or

cfg1 = {'java'      : 'java',
        'encoding'  : 'utf8',
        'classpath' : cpL,
        'javaparms' : ['-Dcom.sas.iom.orb.brg.zeroConfigWorkspaceServer.sascmd="C:\\Program Files\\SASHome\\SASFoundation\\9.4\\SAS.EXE" -config "C:\\Program Files\\SASHome\\SASFoundation\\9.4\\nls\\u8\\sasv9.cfg" -objectserver -nologo -noterminal -noprngetlist']
        }

[screenshot]

  2. The data is large (30,000 records). I get an encoding error when converting the SASdata object to a DataFrame. In the following figure, trimid is the unique key.

[screenshot]

tomweber-sas commented 6 years ago

@PistonIntJack For the first issue, simply add your new configuration definition name(s) to the 'SAS_config_names' list. If you only want to use one of the definitions you have coded, put just that one in the list and you won't be prompted for which one to use:

SAS_config_names=['default', 'cfg1', 'cfg2']
or
SAS_config_names=['cfg1']

For the second, what encoding is SAS running with? If you just submit your 'sas' object, it will print out that information. Can you also try the CSV version of the to_df() method, to_df_CSV()? Is all the data in your SAS dataset in the encoding SAS is running in?
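
One rough way to probe that last question, assuming you can get at the raw bytes of the character values (for example from an exported file), is to try decoding each value in the session encoding and list the ones that fail. This is a plain-Python sketch; find_undecodable is a hypothetical helper, and the sample values are made up.

```python
def find_undecodable(values, encoding='utf-8'):
    """Return (index, raw_bytes) pairs for byte strings that fail to decode."""
    bad = []
    for index, raw in enumerate(values):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            bad.append((index, raw))
    return bad


# Made-up sample: the same character stored in two encodings, plus ASCII.
samples = ['无'.encode('utf-8'), '无'.encode('gb2312'), b'abc']

for index, raw in find_undecodable(samples, 'utf-8'):
    print(index, raw)
```

Only the value stored in a different encoding is reported; values that are valid in the target encoding pass through silently.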

Thanks, Tom

PistonIntJack commented 6 years ago

@tomweber-sas
Thank you, Tom. The first problem was solved after I used your method. For the second problem, the encoding is utf8. If I use to_df_CSV(), there is no encoding error, only a warning.
But I find some data in the wrong field. For example:

   field1 field2
0       a      d
1       b      d
2       c      d

I should see a, b, c in field1, but I see d in field1. (My data is Chinese, so I give a made-up example.)

[3 screenshots]

tomweber-sas commented 6 years ago

@PistonIntJack There were some fixes having to do with transcoding with the CSV method in more recent versions. I wonder if you would see the same behavior in the current version, 2.2.6? I would be curious to see what the contents() method shows, so I can see what the columns and data types are. Given the data is in the wrong columns, I suspect the warning is a secondary issue, not the root problem. Can you run the following to see what the dataset looks like, and can you try this with 2.2.6?

x = sas.sasdata('spec', 'useddata', results='html')
x.contents()

Thanks, Tom

PistonIntJack commented 6 years ago

@tomweber-sas I ran the code with saspy 2.2.6. The encoding error still exists. "drive_mode" is a string column.

[2 screenshots]

Another question: if x is a SASdata object (as in your code), can I modify x's dsopts?

tomweber-sas commented 6 years ago

Yes, 'x' is a SASdata object, so all methods and attributes of SASdata exist and apply to 'x' in this example.

tomweber-sas commented 6 years ago

Do you see the same behavior for both to_df() and to_df_CSV() in 2.2.6 as you did in 2.2.1?

PistonIntJack commented 6 years ago

@tomweber-sas

  1. About x: I want to modify its dsopts, but I cannot do so successfully. With something like sas.sasdata(x, "useddata", dsopts={"keep": "TrimId id ProvId"}) I get a new SASdata object, but can I modify x itself?

  2. With the spec data, to_df_CSV is right, but to_df is wrong.

[screenshot]

In the new data, trimall, I find that when there are many fields the code raises an error. a3 and a2 are right, but a1 is wrong.

[2 screenshots]

tomweber-sas commented 6 years ago

@PistonIntJack for the first question about modifying dsopts, yes, you can modify it. There's nothing special about it, it's just basic python language. The SASdata object has methods and attributes. Dsopts is an attribute which is a dictionary. So you can simply assign an appropriate dictionary to that attribute of the object, like this:

cars = sas.sasdata('cars', libref='sashelp', dsopts={"keep": "horsepower"})
# gives the same result as the following
cars = sas.sasdata('cars', libref='sashelp')
cars.dsopts={"keep": "horsepower"}

Here's an example showing this

tom64-3> python3.5
Python 3.5.1 (default, Jan 19 2016, 21:32:20)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import saspy
>>> sas = saspy.SASsession()
Please enter the name of the SAS Config you wish to run. Available Configs are: ['default', 'SASgrid', 'http', 'httptest', 'ssh', 'sshtun', 'httpfred', 'grid', 'tdi', 'tdilat', 'iomj', 'iomc', 'iomjwin', 'winiomj', 'winiomjwin', 'winlocal', 'gridiom', 'wingridiom', 'zos', 'zos2', 'winzos', 'winzos2', 'sshtest', 'sshloc', 'sdssas', 'saskr', 'vb010', 'vb015', 'pune', 'notpune', 'iomkr', 'httpviya'] sdssas
SAS Connection established. Subprocess id is 7213

>>> cars = sas.sasdata('cars', libref='sashelp')
>>> cars
Libref  = sashelp
Table   = cars
Dsopts  = {}
Results = Pandas

>>> cars.dsopts
{}
>>> cars.head()
    Make           Model   Type Origin DriveTrain   MSRP  Invoice  EngineSize  \
0  Acura             MDX    SUV   Asia        All  36945    33337         3.5
1  Acura  RSX Type S 2dr  Sedan   Asia      Front  23820    21761         2.0
2  Acura         TSX 4dr  Sedan   Asia      Front  26990    24647         2.4
3  Acura          TL 4dr  Sedan   Asia      Front  33195    30299         3.2
4  Acura      3.5 RL 4dr  Sedan   Asia      Front  43755    39014         3.5

   Cylinders  Horsepower  MPG_City  MPG_Highway  Weight  Wheelbase  Length
0          6         265        17           23    4451        106     189
1          4         200        24           31    2778        101     172
2          4         200        22           29    3230        105     183
3          6         270        20           28    3575        108     186
4          6         225        18           24    3880        115     197
>>> cars.dsopts={"keep": "horsepower"}
>>> cars
Libref  = sashelp
Table   = cars
Dsopts  = {'keep': 'horsepower'}
Results = Pandas

>>> cars.dsopts
{'keep': 'horsepower'}
>>> cars.head()
   Horsepower
0         265
1         200
2         200
3         270
4         225
>>>

tomweber-sas commented 6 years ago

Back to the transcoding issues. Can you show me the saslog for these cases you are running? That may help me see what is happening. If we can narrow it down to certain values in certain rows, perhaps I can reproduce it here and see what's actually causing the problem.
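
For narrowing things down to particular rows, one option is to scan a CSV export of the data for cells that should be numeric but are not. The sketch below uses only the standard library; find_non_numeric is a hypothetical helper, "price" is a made-up column name, and the sample text is invented for illustration.

```python
import csv
import io


def find_non_numeric(csv_text, column):
    """Return (row_number, value) for rows whose `column` is not a float.

    Row numbers count data rows starting at 1. Empty or missing cells are
    treated as missing values and skipped.
    """
    bad = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        value = row[column]
        if value is None or value.strip() == '':
            continue
        try:
            float(value)
        except ValueError:
            bad.append((row_number, value))
    return bad


# Invented sample: row 2 carries text in a numeric column, row 3 is empty.
sample = "trimid,price\n1,10.5\n2,无\n3,\n4,7\n"

print(find_non_numeric(sample, 'price'))
```

Rows flagged this way are good candidates for a small reproducible test case.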

PistonIntJack commented 6 years ago

Thank you, Tom. This is the log. I find that when I use only a few fields, the code works (for every field). But when I use all the fields, the code fails.

TypeError                                 Traceback (most recent call last)
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens()

TypeError: Cannot cast array from dtype('O') to dtype('float64') according to the rule 'safe'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
in ()
      1 print(sas.sasdata("a3", "work").to_df_CSV().head())
      2 print(sas.sasdata("a2", "work").to_df_CSV().head())
----> 3 print(sas.sasdata("a1", "work").to_df_CSV().head())

~\Anaconda3\lib\site-packages\saspy\sasbase.py in to_df_CSV(self, tempfile, tempkeep, **kwargs)
   1944 :rtype: 'pd.DataFrame'
   1945 """
-> 1946 return self.to_df(method='CSV', tempfile=tempfile, tempkeep=tempkeep, **kwargs)
   1947
   1948 def heatmap(self, x: str, y: str, options: str = '', title: str = '',

~\Anaconda3\lib\site-packages\saspy\sasbase.py in to_df(self, method, **kwargs)
   1932 return None
   1933 else:
-> 1934 return self.sas.sasdata2dataframe(self.table, self.libref, self.dsopts, method, **kwargs)
   1935
   1936 def to_df_CSV(self, tempfile: str=None, tempkeep: bool=False, **kwargs) -> 'pd.DataFrame':

~\Anaconda3\lib\site-packages\saspy\sasbase.py in sasdata2dataframe(self, table, libref, dsopts, method, **kwargs)
    788 return None
    789 else:
--> 790 return self._io.sasdata2dataframe(table, libref, dsopts, method=method, **kwargs)
    791
    792 def _dsopts(self, dsopts):

~\Anaconda3\lib\site-packages\saspy\sasioiom.py in sasdata2dataframe(self, table, libref, dsopts, rowsep, colsep, **kwargs)
   1151 method = kwargs.pop('method', None)
   1152 if method and method.lower() == 'csv':
-> 1153 return self.sasdata2dataframeCSV(table, libref, dsopts, **kwargs)
   1154
   1155 logf = ''

~\Anaconda3\lib\site-packages\saspy\sasioiom.py in sasdata2dataframeCSV(self, table, libref, dsopts, tempfile, tempkeep, **kwargs)
   1552 break
   1553
-> 1554 df = pd.read_csv(tmpcsv, index_col=False, engine='c', dtype=dts, **kwargs)
   1555
   1556 if tmpdir:

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
    707 skip_blank_lines=skip_blank_lines)
    708
--> 709 return _read(filepath_or_buffer, kwds)
    710
    711 parser_f.__name__ = name

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
    453
    454 try:
--> 455 data = parser.read(nrows)
    456 finally:
    457 parser.close()

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
   1067 raise ValueError('skipfooter not supported for iteration')
   1068
-> 1069 ret = self._engine.read(nrows)
   1070
   1071 if self.options.get('as_recarray'):

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
   1837 def read(self, nrows=None):
   1838 try:
-> 1839 data = self._reader.read(nrows)
   1840 except StopIteration:
   1841 if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_column_data()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens()

ValueError: could not convert string to float: '无'

tomweber-sas commented 6 years ago

I'm sorry, I'm looking for the SAS log. If you start a fresh session, and run these, then you can submit

print(sas.saslog())

after to get the SASLOG which I would like to see. You can post it here or just email it to me directly if you like. Thanks! Tom

FriedEgg commented 6 years ago

It's a bit off-topic at this point in the thread, but you can get the PID of the SAS process by resolving the SAS automatic macro variable SYSJOBID. You can get this value using the saspy.SASsession.symget method.
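
A minimal sketch of that approach follows. It assumes only that the session object has a symget(name) method, as described above; sas_pid is a hypothetical helper, and FakeSession is a stand-in so the example runs without a SAS installation. Because the returned value may be a string or a number depending on the saspy version, the helper normalizes it to an int.

```python
def sas_pid(session):
    """Return the value of SYSJOBID as an int, or None if it is unavailable."""
    value = session.symget('SYSJOBID')
    try:
        return int(str(value).strip())
    except (TypeError, ValueError):
        return None


class FakeSession:
    """Hypothetical stand-in for saspy.SASsession (illustration only)."""

    def __init__(self, macro_vars):
        self._macro_vars = macro_vars

    def symget(self, name):
        # Mimics looking up a SAS macro variable by name.
        return self._macro_vars.get(name)


print(sas_pid(FakeSession({'SYSJOBID': ' 7213 '})))
```

With a real connection you would pass the object returned by saspy.SASsession() instead of FakeSession.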

PistonIntJack commented 6 years ago

  1. @tomweber-sas, hi, I find that my log is in Chinese. When I modify my config, I see the English saslog in SAS, but when I use Python the saslog is still Chinese. I can send the Chinese log to your email; can you read it? Or may I change my encoding to latin1 so that the saslog is in English?
  2. @FriedEgg, thank you for your help. I can find the SAS pid when I use your method.

tomweber-sas commented 6 years ago

@PistonIntJack Yes, go ahead and send it as is. And, if you run in an English encoding, does the behavior change? You can send both logs if you can run that way too. Thanks, Tom

@FriedEgg, cool, thanks for that help, as always :)

tomweber-sas commented 6 years ago

BTW, can I see both to_df() and to_df_CSV() cases in the logs. Thanks!

PistonIntJack commented 6 years ago

@tomweber-sas, hi Tom. I tried changing my data from utf8 to latin1, and changing my saslog to English (the log is English in SAS, but still Chinese in Python). It still reports the error. I will try some other method. Sorry.

tomweber-sas commented 6 years ago

@PistonIntJack Just trying to get caught up. I haven't received any saslogs. Have you sent any to me? I'm not really sure what state this is in regarding to_df() and to_df_CSV() with your data.

I've been running tests here with SAS running in Chinese and moving data that contains Chinese over using both methods. Our NLS division has some of the SASHELP data sets translated to Chinese in both Simplified Chinese and utf8 encodings. I'm not running into any errors. I've done the same with SAS and the data in utf8 also, with no issues either.

I'm attaching an html of a notebook showing some of this (remove the .txt extension to view it). Can you run this same sequence and send it to me; change the config to what you're using, change the path for the libname and the table, of course, but those same cells.

Thanks, Tom

NLS_EUC_CN_1.html.txt

tomweber-sas commented 6 years ago

@PistonIntJack I have found some bugs in my code which I've fixed in a new branch: nls2. I can't be sure if they will fix what you're seeing, but I believe it can only be better. The test case I showed in the previous post, still works, but it didn't show the problems I found and fixed either. So I'm hopeful you will see much improvement with the new branch and the data you're trying to access.

If you need any help downloading and installing that branch, just let me know.

Thanks! Tom

ghost commented 6 years ago

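@PistonIntJack Hello, I just found out that there are other Chinese users of this library. Are you able to connect to a remote SAS server with saspy? Thanks.

PistonIntJack commented 6 years ago

> @PistonIntJack Hello, I just found out that there are other Chinese users of this library. Are you able to connect to a remote SAS server with saspy? Thanks.

Hello, I use it locally; I have not used a remote connection.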

tomweber-sas commented 6 years ago

Just checking in. Have you had a chance to try a newer release, 2.2.8 or 2.2.9? There were some fixes I made for problems that only showed up with translated versions of SAS, which I hadn't previously been able to reproduce. Also, regarding the initial question (and title) of this issue: I now (at master) have the SASpid as an attribute of the SASsession object. And you can programmatically check whether the SASsession is valid by checking that SASpid is not None:

>>> sas
Access Method         = STDIO
SAS Config name       = sdssas
WORK Path             = /sastmp/SAS_workA98D0000450D_tom64-3/
SAS Version           = 9.04.01M4D11092016
SASPy Version         = 2.2.9
Teach me SAS          = False
Batch                 = False
Results               = Pandas
SAS Session Encoding  = WLATIN1
Python Encoding value = cp1252
SAS process Pid value = 17677

>>> sas._endsas()
SAS Connection terminated. Subprocess id was 17647
>>> sas
Access Method         = STDIO
SAS Config name       = sdssas
WORK Path             = /sastmp/SAS_workA98D0000450D_tom64-3/
SAS Version           = 9.04.01M4D11092016
SASPy Version         = 2.2.9
Teach me SAS          = False
Batch                 = False
Results               = Pandas
SAS Session Encoding  = WLATIN1
Python Encoding value = cp1252
SAS process Pid value = None

if sas.SASpid:
    print(sas.SASpid)
else:
    print("SASsession object is invalid")

tomweber-sas commented 5 years ago

Closing this as no response. If there are further questions on this, just reopen.

Thanks! Tom