sassoftware / saspy

A Python interface module to the SAS System. It works with Linux, Windows, and Mainframe SAS as well as with SAS in Viya.
https://sassoftware.github.io/saspy

saspy hangs after creating a SASsession #78

Closed wligtenberg closed 6 years ago

wligtenberg commented 6 years ago

I am trying to use saspy in our environment, which is a SAS grid. I have made changes to the config file in _personal.
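For readers who cannot open the attachments, here is a minimal sketch of what an IOM configuration in sascfg_personal.py might look like. It is not the poster's actual file: the host, port, app server name, and jar directory are taken from the java command line shown later in this thread, and the encoding value is purely an assumption.

```python
# Hypothetical sascfg_personal.py sketch; placeholder values, not the poster's real config.

# Directory holding the SAS client jars plus saspyiom.jar (path as seen in the
# java command line later in this thread).
jar_dir = "/home/USERNAME/test/saspy/lib"
jars = [
    "sas.svc.connection.jar",
    "log4j.jar",
    "sas.security.sspi.jar",
    "sas.core.jar",
    "saspyiom.jar",
    "sas.rutil.jar",
    "sas.rutil.nls.jar",
    "sastpj.rutil.jar",
]
# Absolute paths joined with ':' because the client side runs on Linux.
classpath = ":".join(f"{jar_dir}/{jar}" for jar in jars)

SAS_config_names = ["iomwin"]

iomwin = {
    "java": "/usr/bin/java",
    "iomhost": "app0844",
    "iomport": 28591,
    "appserver": "SASEtl - Workspace Server",
    "encoding": "windows-1252",  # assumption: depends on the SAS session encoding
    "classpath": classpath,
}
```

The config name listed in SAS_config_names is the one passed as cfgname when creating the session.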

I then try to use it with the following code:

import saspy
sas = saspy.SASsession(cfgname='iomwin')

The process then hangs. If I interrupt it, it seems to be working on self.sockin.accept()

When I change the path to not contain any jar files it fails with a Java error. So it seems that it is able to find the jar files and the java environment.

Any help is greatly appreciated.

tomweber-sas commented 6 years ago

Sure, happy to help. Can you show me your config file? And the whole log with errors you're getting? Are you running on Windows? I'm guessing so.

Thanks, Tom

wligtenberg commented 6 years ago

I am running Linux. Log and config (renamed s/.py/_py.txt/) attached. Just to be clear, I only get the log when I press Control-C; if I don't do anything, nothing happens.

sascfg_personal_py.txt keyboard_interrupt.log

PS: you are quick!

tomweber-sas commented 6 years ago

Ok, that all looks reasonable. So saspy is waiting on the java process to connect back (over the socket), which doesn't appear to be happening; that's where it's hanging. But, the java process seems to have started; guessing so only because if it didn't it should have produced an error and not hung. But, I can't be sure of that yet.

Since you're on linux, when you submit this and it hangs, can you issue: 'ps -ef | more' and see the saspy process as well as the java subprocess? I'm a little suspicious of the relative paths in the classpath (that's the only thing that looks out of the ordinary). Can you run the java command (from ps -ef), from a shell and see if it provides any better error that might not be getting back through to saspy?
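If a shell is not handy, the same check can be sketched from Python: run the command line copied from ps -ef and capture whatever it writes to stderr. The helper name run_and_capture is hypothetical and not part of saspy.

```python
import shlex
import subprocess


def run_and_capture(command_line, timeout=30):
    """Run a command line and return (returncode, stdout, stderr).

    returncode is None if the command did not finish within `timeout` seconds.
    """
    args = shlex.split(command_line)  # split the way a POSIX shell would
    try:
        done = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None, "", f"timed out after {timeout} seconds"
    return done.returncode, done.stdout, done.stderr
```

Passing the full java command line as the string argument should surface a Java stack trace in the third element of the result, if Java prints one.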

BTW, what version of saspy? And, it's your Grid servers that are Windows I take it? That should all be fine, just getting the whole picture.

Thanks, Tom

tomweber-sas commented 6 years ago

I think what's going on is that saspy forks/execs java, which works, so there's not a failure at that point, but Java is subsequently failing for some reason, which happens to leave saspy waiting on the socket accept (which has no timeout parameter, via the api), so saspy's hung at that point. I recently added a check for that which is at master, but not in the latest pip release (PyPI 2.1.7). If you grab the latest code from master, that might catch this case and produce some diagnostic errors. But, you should be able to see the same thing (probably) if you can submit the same java command line from a shell yourself.
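To illustrate the general idea of such a check (this is a sketch under stated assumptions, not the code that is in saspy's master), a listener can poll with a short timeout and look at the subprocess between attempts, so a dead child produces an error instead of a hang. The function name and parameters are hypothetical.

```python
import socket


def accept_or_report(listener, child, poll_seconds=0.5, max_wait=30.0):
    """Wait for `child` (a subprocess.Popen) to connect to `listener`.

    Raises RuntimeError if the child exits first, TimeoutError if nothing
    connects within roughly `max_wait` seconds.
    """
    listener.settimeout(poll_seconds)
    waited = 0.0
    while waited < max_wait:
        try:
            conn, _addr = listener.accept()
            conn.settimeout(None)  # hand back an ordinary blocking socket
            return conn
        except socket.timeout:
            waited += poll_seconds
            rc = child.poll()  # None while the child is still running
            if rc is not None:
                raise RuntimeError(f"subprocess exited with code {rc} before connecting")
    raise TimeoutError(f"no connection after {max_wait} seconds")
```

The short poll interval is the design choice here: accept() itself takes no timeout argument, so the timeout is set on the listening socket and the loop decides when to give up.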

wligtenberg commented 6 years ago

I am seeing a defunct java process:

350 32681 0 16:23 ? 00:00:00 [java]

If I do this again (with full path), after retrying I see this process:

USERNAME 6436 0.4 0.0 35448932 36772 pts/3 Ssl+ 17:03 0:00 /usr/bin/java -classpath /home/USERNAME/test/saspy/lib/sas.svc.connection.jar:/home/USERNAME/test/saspy/lib/log4j.jar:/home/USERNAME/test/saspy/lib/sas.security.sspi.jar:/home/USERNAME/test/saspy/lib/sas.core.jar:/home/USERNAME/test/saspy/lib/saspyiom.jar:/home/USERNAME/test/saspy/lib/sas.rutil.jar:/home/USERNAME/test/saspy/lib/sas.rutil.nls.jar:/home/USERNAME/test/saspy/lib/sastpj.rutil.jar pyiom.saspy2j -host localhost -stdinport 55402 -stdoutport 53998 -stderrport 60520 -appname 'SASEtl - Workspace Server' -iomhost app0844 -iomport 28591 -user USERNAME

and the saspy process:

32681 9817 0 16:23 pts/7 00:00:01 /home//.local/share/virtualenvs/saspy-Unk2mtVe/bin/python3.5 /home//.local/share/virtualenvs/saspy-Unk2mtVe/bin/ipython

As you can see I am using virtual environments. `saspy.__version__` shows: '2.1.7'

The grid servers are indeed Windows based.

tomweber-sas commented 6 years ago

OK, when you use the full paths it still hangs? Same traceback if you interrupt? And what do you get if you just submit this from a shell?

/usr/bin/java -classpath /home//test/saspy/lib/sas.svc.connection.jar:/home//test/saspy/lib/log4j.jar:/home//test/saspy/lib/sas.security.sspi.jar:/home//test/saspy/lib/sas.core.jar:/home//test/saspy/lib/saspyiom.jar:/home//test/saspy/lib/sas.rutil.jar:/home//test/saspy/lib/sas.rutil.nls.jar:/home//test/saspy/lib/sastpj.rutil.jar pyiom.saspy2j -host localhost -stdinport 55402 -stdoutport 53998 -stderrport 60520 -appname 'SASEtl - Workspace Server' -iomhost app0844 -iomport 28591 -user your-id

wligtenberg commented 6 years ago

Yep, full paths, still hangs. I just got the result from the command and it is the following:

java.net.ConnectException: Connection timed out (Connection timed out)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at java.net.Socket.<init>(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:211)
        at pyiom.saspy2j.main(saspy2j.java:123)
Exception in thread "main" java.lang.NullPointerException
        at pyiom.saspy2j.main(saspy2j.java:130)

tomweber-sas commented 6 years ago

Hmmm. That's almost what I would have expected, though it might be the same. I get:

java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at java.net.Socket.<init>(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:211)
        at pyiom.saspy2j.main(saspy2j.java:124)
Exception in thread "main" java.lang.NullPointerException
        at pyiom.saspy2j.main(saspy2j.java:131)

This is to be expected when your classpaths are ok and it's all working, but you don't have saspy waiting on the sockets for java to connect back to.

The difference is yours shows: java.net.ConnectException: Connection timed out (Connection timed out) while mine shows: java.net.ConnectException: Connection refused
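In general TCP terms, "refused" means the target host answered but nothing was listening on that port, while "timed out" means no answer came back at all (for example, packets being dropped somewhere). A small sketch that tells the two apart for any host and port; the function name probe is hypothetical:

```python
import socket


def probe(host, port, timeout=3.0):
    """Return 'connected', 'refused', or 'timed out' for a TCP connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        return "refused"      # host reachable, no listener on that port
    except socket.timeout:
        return "timed out"    # no reply within `timeout` seconds
```

Running probe("localhost", port) against one of the ports from the java command line, while nothing is listening there, would be expected to report "refused" on a host whose loopback networking is healthy.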

I'm not sure whether that is really different, meaning it's showing some other problem. But it looks like your classpath and java are working. Can you try that again with the relative paths in your classpath, just to see whether that was causing a problem before? Also, do you get the same python traceback when you try this with full classpaths and you interrupt? If that's a different traceback, then we are on to a different problem.

The classpath could have been the first problem, and we may now be on to another.

Thanks, Tom

wligtenberg commented 6 years ago

Error after keyboard interrupt is the same with the full paths.

The local paths also give the time out error when calling java directly. So it seems that that does not matter. But to be sure, I will use full paths for now.

tomweber-sas commented 6 years ago

ok. So that's really odd. Can you see what your /etc/hosts looks like? It appears that we are at the point where saspy started the java subprocess. The java process is up and running. saspy is waiting on java to connect to the sockets and java is trying to connect but not actually connecting. This is all on the same host, so there shouldn't really be a reason why the sockets aren't being hooked up between the processes. saspy leaves the hostname null when it creates the socket. This should allow all adapters, though you probably only have one anyway. Java does specify 'localhost' and the port it was told, so I wonder what your /etc/hosts looks like w/ regard to 'localhost'. It appears java isn't connecting to the sockets saspy is waiting on.
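The handshake described here can be imitated in a few lines of Python to see whether loopback connections work on the machine at all. This is only a sketch of the pattern (listen with an empty host, connect via 'localhost'), not saspy's or the Java client's actual code, and the function name is hypothetical.

```python
import socket


def loopback_handshake(timeout=5.0):
    """Listen on all IPv4 interfaces, connect via 'localhost', and check they meet.

    Returns (data_arrived, peer_address).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("", 0))  # empty host = any interface, port 0 = any free port
        listener.listen(1)
        listener.settimeout(timeout)
        port = listener.getsockname()[1]
        # Client side: connect by name, as the java process is told to do.
        with socket.create_connection(("localhost", port), timeout=timeout) as client:
            server_side, peer = listener.accept()
            with server_side:
                server_side.settimeout(timeout)
                client.sendall(b"ping")
                return server_side.recv(4) == b"ping", peer[0]
    finally:
        listener.close()
```

If this raises a timeout or a connection error, the problem is likely in how 'localhost' resolves or how loopback traffic is handled on the host rather than in saspy itself.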

Also, while this is 'hung' you could issue another command from a shell to see what this looks like: netstat -tpa. That should show us the sockets saspy set up, so we can see if there's anything funny about host names or something.

tomweber-sas commented 6 years ago

For the netstat output, you can do the ps -ef | cat to find the java command and see the ports on that command line, like in your output from above: -host localhost -stdinport 55402 -stdoutport 53998 -stderrport 60520
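Pulling the port numbers out of that command line can also be scripted. A small sketch, assuming the flags appear exactly as in the output above; the function name saspy_ports is hypothetical:

```python
import shlex


def saspy_ports(command_line):
    """Extract the -host and -std*port values from a pyiom.saspy2j command line."""
    tokens = shlex.split(command_line)
    wanted = ("-host", "-stdinport", "-stdoutport", "-stderrport")
    found = {}
    # Walk over (flag, value) pairs of adjacent tokens.
    for flag, value in zip(tokens, tokens[1:]):
        if flag in wanted:
            found[flag.lstrip("-")] = int(value) if flag.endswith("port") else value
    return found
```

The resulting port numbers are the ones to look for in the LISTEN lines of the netstat output.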

tomweber-sas commented 6 years ago

These are vm's. I also use vm's. My next thought has to do with how the network is configured for them. I run mine with the networking set to Bridged Adapter. These are linux vm's off my pc. The default when setting these up tends to be NAT. So, that's the next thought I'm having, as this really does seem like a strange place to hang.

wligtenberg commented 6 years ago

/etc/hosts contains the following:

127.0.0.1       localhost
10.3.66.130     app1300.infra.local     app1300
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Here is the netstat output:

tcp        0      0 *:36441                 *:*                     LISTEN      25287/python3.5
tcp        0      0 *:45180                 *:*                     LISTEN      25287/python3.5
tcp        0      0 *:55518                 *:*                     LISTEN      25287/python3.5

which matches the java command arguments -host localhost -stdinport 55518 -stdoutport 36441 -stderrport 45180

After restarting docker, I now get a connection refused error with the java call (just like you do). Not sure what else changed, I also ran it in a new bash session...

However, saspy still just hangs.

Yes, this is a VM, but not a local one; it is managed with vSphere. The network settings should be fine. I am also running another web service on there, which uses websockets, so I am pretty sure that it is connected to the network correctly. I asked our IT guy, and he says it should be fine.

I even tried on a new VM, which has nothing on it except for this. And I get exactly the same problem... weird

tomweber-sas commented 6 years ago

Wow, that all looks completely right. The sockets are available on any adapter and from any host. Of course, it's the loopback that this should connect on; localhost/127.0.0.1. This is all same host; we haven't even tried connecting to the IOM server yet. I don't know how the java code isn't connecting. Do you have time for a webex today? Probably the fastest way for me to diagnose this further. If not, I can still give you some things to try to help diagnose this further.

wligtenberg commented 6 years ago

@tomweber-sas I can make some time now, if that is possible on your end. I am available until 22:00 hours CET. Otherwise, my day starts again at 08:00 CET.

tomweber-sas commented 6 years ago

Yes, let me set up webex, I have to re-figure it out each time :) Can you email me so I can send you the link to connect to? My email should be associated w/ my account.

wligtenberg commented 6 years ago

Commit https://github.com/sassoftware/saspy/commit/933d67bf81ca38721918c3a07e4ca3c7829630f2 fixes this issue.

kjnh10 commented 6 years ago

I've got the same error, hanging at self.stdin = self.sockin.accept() at line 431 in sasioiom.py. My saspy version is 2.2.7, and I think 933d67b has already been included.

Could you tell me what the resolution was? I don't know how to fix it, though I have read through this issue.

Thanks in advance.

tomweber-sas commented 6 years ago

@kjnh10 Would you mind opening a new issue? You can refer to this one in it. I'd like to start fresh with your problem and see what is happening. Can you show me the configuration definition you're using (from your sascfg[_personal].py)? Also, your /etc/hosts file. Are you on Windows or Linux? Thanks! Tom