sassoftware / saspy

A Python interface module to the SAS System. It works with Linux, Windows, and Mainframe SAS as well as with SAS in Viya.
https://sassoftware.github.io/saspy
Other
375 stars 149 forks source link

Proper handling of the SAS %abort macro #294

Closed mtoma closed 4 years ago

mtoma commented 4 years ago

Describe the bug I can't seem to find a properly supported way on interrupting a sas code submited with the sibmit() method. The log is overwritten by the following message: SAS process has terminated unexpectedly. Pid State= (30851, 256)

To Reproduce

sas = saspy.SASsession(cfgname='b2b_sas_server')
res = sas.submit('''

%macro runquit;
    %abort RETURN;
%mend runquit;

%put ********************** SOME LOG I WANT TO ACCESS ******************************;

%runquit;
'''
);
print(res['LOG'])

Expected behavior Be able to access the log of the execution until the %abort statement

Desktop (please complete the following information):

Additional context %abort ABEND has the same behaviour %abort CANCEL is the worst case scenario as the submit() method just never returns

tomweber-sas commented 4 years ago

Hey, well, yeah, the %abort macro kills SAS right then and there. There won't be a log returned from that code, as the SAS process terminates and saspy no longer is connected to a SAS session. It's the same as if you were running SAS interactively, in DMS and you submitted that, The DMS process would go away, and it wouldn't be there for you to try to see the log, or anything, because the entire process is not there anymore and the DMS window are gone.

With the STDIO and IOM access methods (at least), you should still be able to see the session log using the saslog method (the log up to before you killed SAS at least): print(sas.saslog()) because that has been cached in the SASsession objject. I don't think I cache it in the HTTP access method. Out of curiosity, why would you be submitting %abort in an interactive SAS session, then expecting to still be able to interact with it? If you're trying to provide some kind of return code back to python for the code being submitted, you can easily do that be setting a macro variable and using the symget() method to see what it was after your submit(). instead of %abort CODE; use %let rc=CODE;

Then after

res = submit()
sas_rc = sas.symget('rc')
# and res has the LOG and LST in it now too

Then if you want to kill SAS,

f sas_rc == 'whatever_rc_to_kill_SAS':
   sas.endsas()

Does that make sense, and is the above what you were trying to do in the first place? Tom

mtoma commented 4 years ago

Hi Tom, Thanks for the quick answer! My use case is the following. I'm working on automation of third party .sas files (thousands of lines of sas code in each file). Those files are coded in a rather strange way where they have dozens of those runquit macros all over the place. The full version is this:

%macro runquit;
  ; run; quit;
  %if &syserr. ne 0 %then %do;
     %abort;
  %end;
%mend runquit;

The idea is to stop the processing when an error occurs.

I was kind of hoping to be able to just read an submit those files and write the log back to the sas server after reading it from SASPy.

So yes I can split the code on those %runquit macro calls and submit the pieces one by one and check the log in the python code, but I wanted to avoid to alter the sas code as much as possible.

My fist attempt was to change the %abort CANCEL by som other type of %abort call because I saw in the sas docs that %abort CANCEL does a "terminates the entire SAS program and SAS system.", but this doesn't seem to be the case for %abort ABEND nor %abort RETURN, so the fact that I'm not getting the log came as a surprise.

I forgot to mention that I'm using the Java IOM access method.

tomweber-sas commented 4 years ago

Ah, IOM, yes that helps. I just finished reading the doc on this too, and it behaves slightly differently in various case. I'll have to actually play around with this to see what is happening in the cases where it isn't supposed to terminate SAS. There are a couple cases where it says it should just stop the current processing and still be able to continue running. Are you connected to a remote workspace server or just running a local SAS install? I don't know if that behave identically in all cases. Let me see what I see when trying all of these various combinations. For anything that terminates SAS out from under me, I can't do much about that. But, for these cases that say they are supposed to keep going, maybe there is something I can do if they are hanging; need to try and debug it to see what's going on. Tom

mtoma commented 4 years ago

Yes I agree with you that %abort CANCEL does what it does, kills the SAS session and that's it. In this case the only serious issue is that the submit() method never returns and just hangs forever. If his behavior is impossible to avoid it should be added to the docs of the method I think.

For the other %abort types if I read the docs right it might be possible to extract the log but they might as well be lost by the Java IOM layer.

I'm connecting to a remote workspace sas server.

tomweber-sas commented 4 years ago

Well, I tried the last macro you send that just does %abort; With no other command. That works just fine both IOM remote and local.

sas.submitLOG('''
%macro runquit;
  ; run; quit;
  %if &syserr. eq 0 %then %do;
     %abort;
  %end;
%mend runquit;
''')
11                                                         The SAS System                              17:00 Tuesday, April 21, 2020

105        ods listing close;ods html5 (id=saspy_internal) file=_tomods1 options(bitmap_mode='inline') device=svg style=HTMLBlue;
105      ! ods graphics on / outputfmt=png;
NOTE: Writing HTML5(SASPY_INTERNAL) Body file: _TOMODS1
106        
107        
108        %macro runquit;
109          ; run; quit;
110          %if &syserr. eq 0 %then %do;
111             %abort;
112          %end;
113        %mend runquit;
114        
115        
116        ods html5 (id=saspy_internal) close;ods listing;
117        

sas.submitLOG('''
data a;x=1;output;x=1/0;output;

%runquit;
''')
15                                                         The SAS System                              17:00 Tuesday, April 21, 2020

151        ods listing close;ods html5 (id=saspy_internal) file=_tomods1 options(bitmap_mode='inline') device=svg style=HTMLBlue;
151      ! ods graphics on / outputfmt=png;
NOTE: Writing HTML5(SASPY_INTERNAL) Body file: _TOMODS1
152        
153        
154        data a;x=1;output;x=1/0;output;
NOTE: Division by zero detected during the compilation phase, detected at line 154 column 22.
155        
156        %runquit;

NOTE: The data set WORK.A has 2 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

ERROR: Execution terminated by an %ABORT statement.
157        
158        
159        ods html5 (id=saspy_internal) close;ods listing;
160        

And I'm seeing that RETURN is the same as ABORT; both terminate SAS out from under you. Looking at CANCEL next, as that says it's supposed to return and I see that it hangs. I may be able to do something there, but that depends upon what's actually going on. I think it cleared the rest of the code I submitted following what you had in the submit method, which would cause a problem. I can hit attention, and I get my message that interrupt handling isn't in the IOM access method, like it is in STDIO. It's at that point if you choose Continue, I ought to re-submit my tail code which should keep this from hanging, but I 'll have to actually debug that as it's in the java code too. I'll look at this tomorrow.

For now, can you confirm you see the same thing? %abort; works and you get the log back %abort ABEND; you're toast; no luck %abort RETURN; you're toast; no luck %abort CONTINUE; it hangs, but shouldn't; I need to debug

Thanks! Tom

mtoma commented 4 years ago

Sorry I made a too quick cut&paste. The original macro is:

%macro runquit;
  ; run; quit;
  %if &syserr. ne 0 %then %do;
     %abort CANCEL;
  %end;
%mend runquit;

and not juste %abort;

So the istuation is as you describe yes: %abort ABEND; stops as it should but I loose the log %abort RETURN; stops as it should but I loose the log %abort CANCEL; it hangs and it shouldn't

%abort; does exactly what I would like to have it do but here it is a SAS problem because %abort; without parameters just aborts the current code unit (here the macro), emits ERROR in the log and continues execution. So it basically completely misses the point of stopping the whole script when an error is triggered (can be observed by adding a %put some log after the macro call). I can't use it for my use case.

tomweber-sas commented 4 years ago

ok, I typed CONTINUE, but it's really CANCEL, so but that doesn't change the behavior. So, CANCEL states that, for this interactive session, it only stops the current submit (I guess that's an IOM submit, which would be all the code you submitted in saspy submit) then continues (guess that's where I got that in my head), and the rest is unaffected. But, I have to debug the java code to see what I'm actually getting. I do resubmit my tail code after the interupt, but it's still hanging anyway. So, I'll debug that tomorrow.

If the method of operation is workspace server and stored process server, use the CANCEL option to do the following:

It only clears currently submitted program.
Other subsequent submit calls are not affected.
The error message is written to the SAS log.

If it was a batch job, it would have terminated SAS, same as ABORT and RETURN. I'm pretty sure these programs you're trying to run were submitted as batch jobs originally. The log would be written out to a file, so it would have that last bit before SAS off'ed itself.

So, just to come back to what you want. You want to use CANCEL, have it stop processing everything right then, but come back to saspy with the log from that and be able to keep running or issue endsas() then yourself if you want? Is that correct?

Thanks! Tom

mtoma commented 4 years ago

Yes its CANCEL I didn't notice it either. Fixed in my previous comment. Well the original problem is even more complicated. The original code redirects the whole log to a file with a proc printto log=PathToLogFileOnServerStorage ; This line of code makes SASPy hang. I tough it might be normal as it will never get any log back so was trying to handle the log storage using SASPy's submit() method and I was planning to upload it back to the remote SAS server.

The get you the whole picture the programs are part of .egp files where they are basically just run one after the other, I'm trying to port all the cumbersome scheduling part to Apache Airflow.

So, just to come back to what you want. You want to use CANCEL, have it stop processing everything right then, but come back to saspy with the log from that and be able to keep running or issue endsas() then yourself if you want? Is that correct?

Yes that is exactly what I was planning to use.

But the best of all would be to just not bother with the log in SASPy at all and have the following snippet not hang and just return a status indicating it didn't crash and let me proceed with a submit of another sas file:

filename LogFilePath "/SERVER/FILESYSTEM/PATH/LOG_FILENAME.log" new ;
proc printto log=LogFilePath ;
run;
tomweber-sas commented 4 years ago

Ah, yes, that is kinda the worst case. If you redirect the log, that's like cutting the legs out from under saspy. It needs the log to be able to work as, other then the submit method, it generates SAS code and runs it to get the results and it parses things from the log to know what's going on, and for submit(), it doesn't know what you ran, so it also has to wrap your code with other code the know what happened and do what it does.

These .sas programs you've got are meant to be run as batch scripts, where doing all of these things (writing the log to some file, killing SAS conditionally, ...) makes sense. saspy isn't a batch script submission tool though, rather an interactive interface, which wasn't designed to submit batch jobs. But, enough whining about that :) Let's see what I can come up with so you can, hopefully, accomplish what you are trying to do. No guarantees, but I'll see if we can figure something out.

I need to debug and see why the CANCEL hangs. I'll work on that today. As for the proc printto, that's no good, but I wonder if you add the 'undo' for that as the last like of your code you submit, if it might happen to work.

proc printto log=... ;run;
sas code ...
proc printto;run;  /* this sets it back to normal, before my wrapper code then gets run, so I may get what I need */

I haven't tried this yet either but will play around with it while I look at the other problem.

Tom

mtoma commented 4 years ago

Yes I was suspecting that the log might be required by SASPy to work, that's the reason I didn't submit the issue with the "printto log=..." problem. For the "proc printto;" at the end of the script this won't work for me because I have one script to redirect the log another to run the program and a third one to redirect the log to STDOUT again at the end. That's very ugly and I prefer to handle the whole log properly from SASPy . I perfectly understand that SASPy is not a batch submission framework, so any help of any kind will be greatly appreciated. For now I have a kind of a fragile workaround by replacing the %abort CANCEL; with a simple %abort; and splitting the big file on every call to %runquit; I can reasonably identify as being called from free code outside of any macro. If any kind of real %abort could just interrupt and get me back the log that would be perfectly sufficient.

Michal

tomweber-sas commented 4 years ago

ok, I debugged the java code and see why it hangs, even w/ an interrupt + Continue. So, I made a tweak to work around this problem, which is that the ABORT CANCEL terminates the rest of what was submitted (the behavior you want), but that includes my tail end wrapper code so I know the what happened (the end of the log I need to know I got everything). I just added that code to a second (IOM) submit, duplicating that, but that won't be a problem in the normal case. It is however, a different IOM submit from the first, so, because CANCEL terminates everything else in that first submit, but allows subsequent submit to continue to work, I get my tail end of the log from that second submit, and this appears to work seamlessly.

I've build a new saspyiom.jar, which I've pushed to a new branch: abort. Can you grab that jar and swap it out with the one you have (or uninstall and install from that branch: pip install git+https://git@github.com/sassoftware/saspy.git@abort) and see if this solves the abort cancel problem for you.

AND, guess what. I just added in a proc printto;run; into that second submit, before my tail code, so I think this jar may allow you to do a saspy submit with your whole program, including the proc printto and the abort cancel and actually do what you want!

I'm not cool yet with making this change in the real saspy jar yet. But give it a try and lets see how it works for your situation. If it solves all of this for you, that's great, and I'll see where we go from there. If not, then we'll see what's next and go from there also.

Crossing my fingers on this one. :) Tom

mtoma commented 4 years ago

Hi Tom, Well it works perfectly! Thanks a lot! I tested with %abort CANCEL; got the code interrupted at the right place and got all the log until the %abort call exactly as I was hoping to have it. Also tested the "proc printto log=... ;run;" statement with your fix. Also worked perfectly and didn't hang anymore and I also got all the log until that statement returned byt the submit() method.

So with this fix I would say I can just submit my whole code without changes and get it executed without problems. That's great!

I understand pushing this to the main branch might require some thinking. My opinion on this is that having the submit() method always return, even if it requires some undercover code executed, is better than having it just hang indefinitely but I can perfectly make a fork or use the github branch to install SASPy so take your time to ponder the consequences of making this part of the main branch.

tomweber-sas commented 4 years ago

Wow, that is awesome! One truth I've learned in over 3 decades designing and implementing many parts of SAS is that it doesn't matter how you thought users were supposed to use what you built, they will inevitably use it to accomplish something they are trying to do, in a way other than how you designed it to be used! That's ok. I always try to give users a way to accomplish their goals (within reason :) ) if it something that can be done. This is another one of those cases.

As for getting this into master, there a few things I need to investigate. First is that it would cause no regression for anyone else using saspy; as intended or other :)

The other thing is that I want to have all of my access methods be able to support this in the same way. STDIO (over SSH or not), IOM (local and remote) and HTTP all support everything saspy can do, each in their own way, but equivalently. That's no small task sometimes. I've actually been trying this out in STDIO and I'm seeing that %abort CANCEL; behaves differently in that access method. It isn't processing anything I submit after, but it didn't kill SAS either. That's probably because of the 17 different ways this statement behaves depending on how SAS is running and which way you code it: https://go.documentation.sas.com/?docsetId=mcrolref&docsetTarget=p0f7j2zr6z71nqn1fpefnmulzazf.htm&docsetVersion=9.4&locale=en Still, that's not how this works in the IOM case. Haven't even looked at HTTP yet.

So, yeah, for now, just use that jar, it's good to go for you. I'll follow up on the rest of these concerns for me to figure out. I'll leave this issue open till I decide how to proceed and I'll keep this branch around to till I have the final answer too.

Really glad this works for what you are trying to do! Tom

mtoma commented 4 years ago

Yes, the client I'm working for uses SAS in a catastrophically wrong way, as en ETL tool. Not that I can easily use SASPy to run their SAS scripts there his hope that I'll be able to prevent the situation to become even worse. As far as I'm concerned I have exactly what I need. Shall I close the issue or do you prefer me to keep it open to keep track of all this?

tomweber-sas commented 4 years ago

Sorry, been pulled in too many other directions, but I've investigated this:

So, I'm glad this has solved all of your cases, that is great. I have investigated my other two access methods and they are at either end of the spectrum. So, they won't all behave the same, like I would prefer. Here's a rundown, so it's document here (for me if nothing else :) ):

HTTP - all API, so it doesn't hang in either proc printto or %abort case.

IOM - Hybrid of API and my own interface

STDIO - no API at all; completely my own interface

So, %abort CANCEL is ok for HTTP and IOM (with my fix, which is fine), but won't be able to work in STDIO due to the SAS implementation of the feature.

using proc printto in saspy is a problem in all cases, other than it doesn't cause a hang in HTTP; still breaks saspy though, in general, if you submit anything other than only submit().

I have 2 thoughts on how to proceed. As for proc printto, I don't like having to submit the undo (proc printto;run;) on every submission of code ever sent to SAS, just so that in the rare case a user redirected the log in code they submitted, I would undo that and give it back to saspy. I feel that's like carrying a sledge hammer around with you everywhere you go, every day, in case you came across a finishing nail sticking out some, so you can tap it back in all the way.

You can't use printto to redirect the log out from saspy, beyond the scope of one submit(). No matter the access method. So, the fact is you CAN use that in a submit(), but then you need to add the undo yourself at the end of the code you submit. Then it works fine in all access methods, and you aren't surprised on your next submit that the log isn't still redirected because I undid it out from under you, behind the scenes, like I don't like having to do. Hope that sentence makes sense. I know I haven't documented this, but that seems like the best way to go; document these, and any other limitations and how to work around them?

Since you can now submit an entire program, assuming your reading it in and submitting it? Can you do the following and have that solve the pritto issue?

for file in list_of_code_files:
   fd = open(file); code = fd.read(); fd.close()
   sas.submit(code+"\nproc printto;run;")

For the %abort CANCEL;, that is fixed and works in IOM and HTTP, so you don't have to do anything about that. STDIO just won't support that, since SAS doesn't support that, even though the dac says it should. I can't fix that. But I can document it.

The second thought would be to create a new method for this; submitBATCH(sasfile=) which would be used for this kind of thing, and could take a file and run it, and that method would add the proc printto;run;, and any other extra things that we might find in the future that can't be supported quite right, to that one method so it can be used specifically for this kind of thing, leaving the existing code paths unchanged.

What are your thoughts on either of these two? Guess they really boil down to the printto issue really, since abort is fixed for HTTP and IOM and nothing can be done for STDIO.

Thanks! Tom

mtoma commented 4 years ago

Hi Tom,

All this seems perfectly reasonable to me. The %abort CANCEL was the main problem and this one is fixed and it seems it is possible to fix it in a reasonably generic way.

For the "proc printto log=;" issue in fact I'm already working around this by concatenating 3 files in python to make sure I end up executing the "proc printto;" at the end.

It's not easy to guess what can happen in all possible combinations of "proc printo" and "%abort" calls or some other undiscovered obscure use cases, so I would say having a nice transparent submit() method that serves to submit pieces of code where you are supposed to know what you are doing and accept consequences of a "proc printto log=;" is the wqy to go. With a clear note in the documentation about this issue for the submit() method I think even a hang is acceptable here. If this note points to a submitBATCH() method that makes all that is reasonably feasible to keep control of the session and not get killed off by "proc printo log=;" or some other log manipulating calls by ensuring that the log goes to STDIO at the end of the execution of the call that would be awesome.

As a bonus I think it would make the API very usable:

  1. You know what your code does: use submit()
  2. You have a large batch statement with unknown code possibly manipulating the SAS log: use submitBATCH()

Nice and clean!

Michal

tomweber-sas commented 4 years ago

thanks Michal! My plan (at the moment) is to add a section in my documentation, not unlike the troubleshooting guide, that is a limitations section. addressing the requirement of saspy getting the log being the first item. Stating that you are free to route the log to some file in a submit() invocation; that is completely fine in all access methods, as long as you have 'proc printto;run;' (undo) in that code (at the end), so that the log is given back to saspy before returning from that submit. I don't think it's unreasonable to say that if you code up printto in your code you submit, you add the undo at the end and we're all good.

Addressing the %abort CANCEL; for the STDIO access method (works in the others w/ no intervention with my fixes), doesn't work as documented, but that's a SAS bug which isn't going to get fixed, so don't run that in your code if you're using STDIO.

Adding things to this as they come up is just a good, one place to look, thing I should have had since the beginning.

As for a submitBATCH,(), I'l keep that in mind, but as for now, it would be a lot of effort for noting more than adding +'proc printto;run' to the code in your submit(). If more things come up to make that seem worthwhile, I'll keep it as an option.

Again, I'm glad we've got this all working for you, and it will be helpful for others. Thanks for helping me make saspy better and more useful!

Thanks, Tom

tomweber-sas commented 4 years ago

Cripes, I think I'm losing my mojo being locked up at home! I wrote that whole section of doc and even pushed it out there, only to realize there was still one case where I have to issue the printto undo. internally, to support the case where you don't have to modify the code you're submitting at all. I thought I could document my way out of it, and I did/can, but it would require you to modify code in the case of printo and abort cancel - requiring you to submit the undo right before the cancel. But, to allow no .sas code changes on your end, I have to do the undo on my end to cover that case.

Guess I'm gonna get over carrying around that sledge :)

Soooo, I pulled that doc, left my undo in my saspyiom.jar (put it back after removing it) and am even going to add it in to the other access methods so this is handled for all cases it can be. Then I'll rework the doc and put it back out there (it'll be shorter with less restrictions :) ).

I've merged in the 'abort' branch to master, and I'm gonna finish the rest of this in the other access methods, before building a new release.

I've been on other things a bit; other issue w/ append and another 'bug' in SAS itself I have to work around for some cases in STDIO, and some other things here I have to attend to. I'm gonna blame that, as well as being sequestered and not being allowed to go play ice hockey for this :), but it all just falls back on me anyway lol.

Luckily, what I did already fixed your cases , so all of this is for other cases and access methods. Also, I'm not going to add a submitBATCH() like was another option, as I'll just make the existing submit handle this. So that's just easier from a user point of view.

Tom

tomweber-sas commented 4 years ago

@mtoma I've worked up a new branch; abort2. I've traded in the sledge for a compact ball-peen :)

I reworked the code the same as I would have for a dedicated submit (submitBATCH) like I threw out there. What that means is that none of my internal code submissions submit the 'undo' It is only ever done for a user issued submit(). So, that addresses my concern about running undo all the time. I've added an option on the submit methods (submit[LOG | LST] ) ', printto=[True | False] and the default is False - that is the currently existing behavior. I really don't want to change default behavior, even if it seems safe to do. So, to get the new behavior you just run submit(code, printto=True) and I'll internally generate and submit the undo following the code you provided; in a second API submit for those access methods with the API.

So, this implementation addresses my issues with doing this all the time, and has no changes to user code or saspy interface other than being able to speciy printto=True on any of the submit methods to address having a proc printo and even an abort CANCEL in the submitted code. Effectively what the submitBATCH() idea would have been; just chaged to be an option on submit instead of another method.

Can you pull this branch and try this out? Verify that it still fixes all of your cases? Yo should be able to submit your .sas files as is when setting printto=True. And, it's shouldn't ever hurt to set this to True for any code either, FWIW.

fd = open('MyBattchSript.sas'); code = fd.read(); fd.close()
sas.submitLOG(code, printto=True)
# the rest of your Python program ...  you are covered either way

I was going to name the option batch=, but I already have Batch mode for saspy, and that's something completely different, so using batch= (or even submitBATCH() ) would have been confusing. Since the prtintto undo is the only thing it does currently, I just named it printto=, If you have any idea of a better name, let me know.

If you can verify this still addresses your cases, and don't mind specifying an option for these cases so they work w/out issue, I think this will be the final solution for this. I'll need to test this more fully, along with other changes I have at master, before building a new version with all of this, but it's looking pretty clean.

Thanks! Tom

tomweber-sas commented 4 years ago

@mtoma I've merged this branch into master. Working on creating the next release with this and the other fixes at master. Setting printto=True on your submit() should give you the same behavior as the original change I built for you. I'll close this when I create this next version with this and everything else in it.

Thanks, Tom

mtoma commented 4 years ago

Hi Tom, sorry for the late answer but I'm rather super busy these times. The option solution seems perfectly ok. So where should I be testing it now? Still on the abort2 branch or should I take the .jar for master?

tomweber-sas commented 4 years ago

Hey, no problem at all. I tested it enough to believe it'll solve your cases. I just built a new version 3.3.6, so you can install that from pypi with a simple pip uninstall -y saspy pip install saspy and it'll get 3.3.6. The conda version will build in a little bit, it's got it's own trigger from my push to pypi and will CI/CD build itself later today. Of course, what's here at master is the same code, so any way you want to grab it. Do be sure you are using the new saspyiom.jar that is in 3.3.6, so it matches the pythoncode; as it did change from what I originally prototyped for you.

Yeah, this is as simple and clean as it can get. I think it's a good minimalistic way to go. Don't add that undo code unless you ask me to, then it's addressed fro you. I also added that section of doc, if you feel like it, you can let me know what you think of it if you get around to it. https://sassoftware.github.io/saspy/limitations.html

sas.submit(code, printto=True) sas.submitLOG(code, printto=True) sas.submitLST(code, printto=True)

I'll close this now, and just reopen or open a new one if you have any problem, questions, ...

Thanks again! Tom