sassoftware / saspy

A Python interface module to the SAS System. It works with Linux, Windows, and Mainframe SAS as well as with SAS in Viya.
https://sassoftware.github.io/saspy

SAS process has terminated unexpectedly #463

Closed rodrigoalatorre closed 2 years ago

rodrigoalatorre commented 2 years ago

We've built an app that uses SASPy to connect to a SAS workspace server and retrieve information from two Oracle DB tables. Most of the time we have no issues with SASPy, but sometimes, especially when the server seems to be under heavy load, SASPy throws an error which, upon inspecting the logs, reads: "We failed in getConnection No logical assign for filename _TOMODS1

SAS process has terminated unexpectedly. Pid State = (1596, 256)"

To connect SASPy to the workspace server I am using IOM (which, again, works fine most of the time).

Does anybody have any pointers on how to solve this?

tomweber-sas commented 2 years ago

Hey, glad you're able to use this for the app you've created. Sorry you're having this occasional trouble. Since the error happens when trying to create a connection (based on what you've shown so far), I wouldn't read too much into the other message. That said, can you show me the real output itself (the traceback, and the SASLOG if that's where you saw the "no logical assign" line), so I can really see what happened? Bits and pieces of the full output aren't as helpful. There's usually an actual error message when the "We failed in getConnection" happens, so the full output will help to see what the real problem is. Let's start with that. Thanks! Tom

rodrigoalatorre commented 2 years ago

Hey, thanks for your reply, Tom. I hadn't been able to replicate the behaviour as described, until today. Actually, I am not getting much else aside from the Pid State and the "No logical assign for filename" errors. This is the complete traceback I am getting:

We failed in getConnection
No logical assign for filename _TOMODS1.
We failed in getConnection
No logical assign for filename _TOMODS1.

SAS process has terminated unexpectedly. Pid State= (5076, 256)
No SAS process attached. SAS process has terminated unexpectedly.
No SAS process attached. SAS process has terminated unexpectedly.
INFO:werkzeug:172.17.0.1 - - [24/May/2022 13:55:40] "GET /api_route HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2095, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2080, in wsgi_app
    response = self.handle_exception(e)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2077, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1525, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1523, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1509, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/var/dash/dash/API_CI_Flask/server.py", line 30, in query_ci
    res = query.sas_query()
  File "/var/dash/dash/API_CI_Flask/consultar.py", line 289, in sas_query
    datos_patrones = self.sas.sasdata(archivo_patrones, "home").to_df()
  File "/usr/local/lib/python3.8/site-packages/saspy/sasbase.py", line 1051, in sasdata
    if not self.exist(sd.table, sd.libref):
  File "/usr/local/lib/python3.8/site-packages/saspy/sasbase.py", line 915, in exist
    return self._io.exist(table, libref)
  File "/usr/local/lib/python3.8/site-packages/saspy/sasioiom.py", line 1160, in exist
    exists = int(l2[0])
ValueError: invalid literal for int() with base 10: 'No SAS process attached. SAS process has terminated unexpectedly.'

What weirds me out the most is that I am not even running a very complex query on the server, just a regular proc sql with a bunch of where clauses on an indexed field. Even stranger, if I copy and paste the code into my Enterprise Guide client and submit it, it only takes a couple of seconds to get the results. Any pointers?

tomweber-sas commented 2 years ago

Well, that's strange. The traceback shows that you're trying to create a SASdata object (sas.sasdata()), yet the error is that we failed in getConnection; that's in the Java IOM client code trying to establish a connection to the workspace server. To be executing sasdata(), there would already have to have been a connection in the first place; you couldn't really get there otherwise. So, is this something where you have a long-running session that could have gone away out from under you by the time you're trying to submit this request? It smells like that from this traceback.
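As a rough illustration of why the traceback ends in a ValueError: the dead-session message comes back as ordinary text, and exist() then tries to parse it with int(). A minimal sketch of an app-side guard follows, assuming the app has the returned log text in hand (for example the 'LOG' entry of the dict that submit() returns). The names `looks_like_dead_session`, `check_log`, and `SASSessionLost` are hypothetical helpers for this example, not part of SASPy.

```python
# Marker strings copied from the output quoted in this thread.
DEAD_SESSION_MARKERS = (
    "No SAS process attached",
    "SAS process has terminated unexpectedly",
)


class SASSessionLost(RuntimeError):
    """Hypothetical app-level error: the SAS process behind the session is gone."""


def looks_like_dead_session(text):
    """Return True if the text contains one of the dead-session messages."""
    return any(marker in text for marker in DEAD_SESSION_MARKERS)


def check_log(log_text):
    """Raise SASSessionLost early instead of letting a later parse fail obscurely."""
    if looks_like_dead_session(log_text):
        raise SASSessionLost(log_text.strip())
    return log_text
```

With a guard like this, the Flask route could return a clear "session lost" error (or reconnect) rather than a ValueError from deep inside sasioiom.py.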

Can you show me your configuration, just for the sake of argument, and the code you're running? Or maybe a description of what you're doing, in case it could be something like your connection timing out before you get this failure?

thanks, Tom

rodrigoalatorre commented 2 years ago

Sure, Tom (thanks again), this is my configuration file:

SAS_config_names = ['iom_com']
iom_com = {
    'java': "/usr/local/openjdk-17/bin/java",
    'iomhost': 'server_ip_address',
    'iomport': 8591,
    'encoding': 'windows-1252',
    'appserver': 'SASApp - Workspace Server',
    'omruser': username,
    'omrpw': pwd
}

This is basically the code we're running:

options nocenter;
options threads cpucount=16;
options sortsize=64G;
options SUMSIZE=64G;
proc sql;
    create table work.patrones as
    select EMPLOYER_ID, count(distinct PERSON_ID) as tamaño_patron
    from oracle.oracle_table
    where (EMPLOYER_ID = '19180837' or EMPLOYER_ID = 'D9231195'
           or EMPLOYER_ID = 'Z8770294')
      and start_date <= '24May2022:00:00:00'dt
      and end_date >= '27Aug2020:00:00:00'dt
    group by EMPLOYER_ID;

tomweber-sas commented 2 years ago

OK, thanks. I see you're using IOM (not COM, which is good), and you're not providing your own classpath (which you shouldn't). That confirms you're not using an old jar file with newer code or something strange like that, which is something I've run into a couple of times a while back and can be really strange to try to diagnose. The SAS code isn't so much the thing as the SASPy code I was wondering about. Do you create a SASsession object and then use it for whatever request you're trying to submit, or do you have a long-running session you try to use across multiple requests, such that it could have timed out and gone away out from under your app?
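For illustration, the per-request pattern asked about here could look something like the sketch below. It assumes the app passes in a zero-argument factory that builds the session (for instance a lambda wrapping saspy.SASsession with the 'iom_com' config from this thread); `sas_session` is a hypothetical helper name, and the only SASPy call it relies on is endsas().

```python
from contextlib import contextmanager


@contextmanager
def sas_session(factory):
    """Open one session via factory() and always close it afterwards.

    `factory` is any zero-argument callable returning an object with an
    endsas() method, e.g. lambda: saspy.SASsession(cfgname="iom_com").
    """
    session = factory()
    try:
        yield session
    finally:
        # endsas() terminates the SAS process, so nothing lingers
        # on the workspace server after the request finishes.
        session.endsas()


# Intended use inside a request handler (not executed here):
#
#     import saspy
#     with sas_session(lambda: saspy.SASsession(cfgname="iom_com")) as sas:
#         df = sas.sasdata("patrones", "work").to_df()
```

Scoping the session to one request this way means a request can never pick up a session object that another request created or that has since gone away.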

rodrigoalatorre commented 2 years ago

Oh, sorry for the misunderstanding. Actually, I create a SAS Session for each request (I also tried creating a new session for this section, but it didn't seem to make much difference). Now that you've said it, I think that this might be related to the session timing out before the code finishes running. Is there any way to increase the timeout?

tomweber-sas commented 2 years ago

Well, IOM itself (configured in metadata for your workspace server definition) has an inactivity timeout, which will shut down the server if that's hit. But if you're actively running code, that timeout won't happen. Do you have SAS Admins that can look at the object spawner logs to see what is happening to your sessions? If it's just gone, then I don't really have anything on the Python side as far as why it went away or things like that, I'm afraid.

rodrigoalatorre commented 2 years ago

I'll contact the SAS admins, ask for the logs and get back to you. Thanks a lot, Tom, I really appreciate it.

R.

tomweber-sas commented 2 years ago

Cool, I wish I had a better idea of what could be happening. The other thought I'm having is this: since you said you're creating multiple connections, could your app be making a request on a session variable that's not a valid one anymore? Or something like that?
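One way to test that theory from the app side is to check a session before using it. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that the session object's SASpid attribute is None once the SAS process is gone. `is_alive` and `ensure_live` are hypothetical helper names, and `factory` stands for whatever zero-argument callable the app uses to build a new session.

```python
def is_alive(session):
    """Assumption: a live session has a non-None SASpid attribute."""
    return session is not None and getattr(session, "SASpid", None) is not None


def ensure_live(session, factory):
    """Return the session if it still looks alive, otherwise build a fresh one."""
    if is_alive(session):
        return session
    return factory()
```

Logging whenever ensure_live() has to fall back to the factory would also show how often, and on which requests, a stale session was about to be used.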

tomweber-sas commented 2 years ago

Checking back in. Were you able to get any more info on this? Logs, or double-checking the logic in your app regarding which SASsession you were trying to use?

tomweber-sas commented 2 years ago

Hey, this one is pretty old and I don't have the info from the server side to know what's going on. I'm looking to clean up old issues, so is this still an issue, and if so, can we get object spawner logs and workspace server logs from a failure case to look at? You might need tech support's help on that side if that's not something you can easily get. Or did you look at the logic of your code and find anything where you might be using the wrong SASsession object for the case where it fails? That was a possibility, especially given where it was failing (from that first traceback). Thanks, Tom

rodrigoalatorre commented 2 years ago

Hey, sorry for the lack of reply, Tom. We actually migrated to a new workspace server running Linux and the problem hasn't come up again. Thank you for your support, I'll mark this as closed :)) Thank you again, Rodrigo

tomweber-sas commented 2 years ago

Hey, that's great news! I really rather prefer knowing what was going on, but we don't always get what we want :) Glad it's working now as expected! Let me know if you need anything else! Thanks, Tom