Can only create PySaxonProcessor once per process (Issue #5)
Open. elshimone opened 1 year ago
I recommend not using the processor as a context manager if you are running Saxon in a server setting where the application is kept alive. When the with block exits, the release() function is called, which prevents the processor from being used again in that process.
See note here: https://saxonica.plan.io/issues/4942#note-33
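A minimal sketch of that advice, assuming the saxonpy import path (from saxonpy import PySaxonProcessor) and the PySaxonProcessor(license=False) constructor used later in this thread. The helper names get_processor and _default_factory are hypothetical. The idea is to create the processor once, never enter a with block, and hand the same object to every request:

```python
import functools


def _default_factory():
    # Imported lazily so the module still loads where saxonpy is not installed.
    # Assumption: saxonpy exposes PySaxonProcessor at package level.
    from saxonpy import PySaxonProcessor
    return PySaxonProcessor(license=False)


@functools.lru_cache(maxsize=None)
def get_processor(factory=_default_factory):
    """Create the processor on first use and return the same object afterwards.

    The processor is deliberately not used in a `with` block, so release()
    is not triggered while the application is still running.
    """
    return factory()
```

Request handlers would then call get_processor() instead of constructing a new PySaxonProcessor each time. The factory parameter exists only so the caching behaviour can be checked without SaxonC installed.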
In SaxonC 11.4 we have made a number of improvements to handle the problem you have reported.
Hi @ond1 thanks for the note. Are there plans for Saxonica to release a Python wheel package for Saxon themselves?
Hi @elshimone Yes, we have plans to do so. We will be releasing official Python wheel packages for SaxonC 12 (Linux, macOS and Windows) in the near future. We have successfully completed a phase of testing the wheels for the next release. Also, in SaxonC 12 we have replaced support for the Excelsior JET JVM with GraalVM native-image.
We have moved away from the use of JNI, so you will not see the failure that you reported above (JNI_CreateJavaVM() failed with result).
That's great news - if you need any beta testers let me know.
Simon
Thanks for your offer. I will contact you soon.
@elshimone Thanks for reporting the problem. In our similar use case, we collected multiple inputs and ran them all in the same process instead of creating a separate process for each input.
@ond1 It's excellent news that you are testing the wheels for release. Are you going to have different wheels for the open-source and enterprise versions? Or will you have different access levels on the same wheel? Lower-level configuration through Python, such as JVM environment settings, would be great as well.
We have different wheels for the open-source and enterprise versions.
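As a rough illustration of running several inputs through one processor in the same process, here is a sketch that relies only on the calls appearing in the sample code later in this thread (new_xslt_processor, compile_stylesheet, parse_xml, set_source, transform_to_string). The function name transform_batch and its parameters are hypothetical, and it has not been tested against SaxonC itself:

```python
def transform_batch(proc, stylesheet_path, xml_inputs):
    """Transform every XML string in xml_inputs with one shared processor.

    proc is an already created PySaxonProcessor (or any object exposing the
    same methods); it is neither created nor released here.
    """
    xslt = proc.new_xslt_processor()
    # Compile the stylesheet once and reuse it for every input.
    xslt.compile_stylesheet(stylesheet_file=stylesheet_path)

    results = []
    for xml_text in xml_inputs:
        node = proc.parse_xml(xml_text=xml_text)
        xslt.set_source(xdm_node=node)
        results.append(xslt.transform_to_string())
    return results
```

The caller owns the processor's lifetime, so the batch can be as large as needed without creating a second processor in the process.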
@ond1 Great to hear you are working on releasing the wheel packages. We are using PySaxonProcessor in AWS Lambda and are facing almost the same problem as the one posted here: https://saxonica.plan.io/issues/4942. During load testing we keep getting "Error: No stylesheet found. Please compile stylsheet before calling transformToString or check exceptions" and then finally "JET RUNTIME HAS DETECTED UNRECOVERABLE ERROR: runtime error". Is there anything I can do in the interim? Also, do you by chance have a timeline for when the official wheel package will be ready?
Hi @gouripv is it possible for you to upgrade to SaxonC 11.4? I know with that release it is not built in a wheel, but you can build and install the Python extension.
We will try to push a beta release, but I would have to get back to you on when that will happen.
Hi @ond1 Thank you for the reply. Right, our preference is to use a wheel package; we are currently using saxonpy. But let me check 11.4 once. Can you please share the documentation for it if you have it handy?
Yes, I understand that installing the wheel is much easier. Building the extension takes a few extra steps; see the documentation on installing the Python extension here: https://www.saxonica.com/saxon-c/documentation11/index.html#!starting/installingpython
You will need to install SaxonC: https://www.saxonica.com/saxon-c/documentation11/index.html#!starting/installing
Hi @ond1 I have another question for you. Below is the sample code I currently use for the transformation in my Python-based AWS Lambda. The Lambda currently has 2048 MB of memory; as requests come in, memory usage keeps increasing, and once the threshold is reached I keep getting the error "Error: No stylesheet found. Please compile stylsheet before calling transformToString or check exceptions". Is there anything I can do for the time being without upgrading to 11.4?
def transform_xml(input_xml, xsl_path):
    try:
        proc = PySaxonProcessor(license=False)
        xsltproc = proc.new_xslt_processor()
        ROOT_PATH = os.path.abspath(os.path.dirname(__file__))
        file_path = os.path.join(ROOT_PATH, xsl_path)
        f2 = open(file_path, 'r')
        data_xsl = f2.read()
        document = proc.parse_xml(xml_text=input_xml)
        xsltproc.set_source(xdm_node=document)
        xsltproc.compile_stylesheet(stylesheet_text=data_xsl)
        final_xml = xsltproc.transform_to_string()
        if final_xml is not None:
            return final_xml
        else:
            raise Exception('Exception occured during transformation')
    except Exception as error:
        raise
    finally:
        f2.close()
        xsltproc = None
        proc = None
For me, a with statement is preferable to try, but let's stick with your way. Is it an option for you to simplify the code by specifying the file paths directly like this, so that you don't need to handle the file reading yourself?
proc = PySaxonProcessor(license=False)
xsltproc = proc.new_xslt_processor()
xsltproc.compile_stylesheet(stylesheet_file=xsl_path)
final_xml = xsltproc.transform_to_string(source_file=input_xml)
If you keep getting the error No stylesheet file, maybe check whether the file really exists at a path the code can access. Your files may need to be inside the container or runtime environment rather than on the host file system. Maybe check how you specify file paths in Lambda.
I appreciate your input, @tennom. I don't think the file paths or the file reading are causing the issue; the error happens after memory reaches some threshold. If it were a file path issue, it would fail on the first request, whereas the function served over 1k requests without any problems. My guess is that, because I am not using with, the proc resources are not cleaned up and memory keeps increasing. I cannot use with either, because I get the error JNI_CreateJavaVM() failed with result: -5. I tried defining proc and xsltproc at the global level, that is, before my handler in the Lambda, but even then memory keeps increasing.
With SaxonC 1.2.1, as @tennom suggested, passing the source as a file name argument to transform_to_string can help with memory:
final_xml = xsltproc.transform_to_string(source_file=input_xml)
Thank you @tennom and @ond1 for your input. I cannot really use xsltproc.transform_to_string(source_file=input_xml) because the input XML comes in as part of the event that the Lambda receives. I tried using xsltproc.transform_to_string(stylesheet_file="test1.xsl", xdm_node=node), but it gives me the error "'saxonc.PyXsltProcessor' object has no attribute 'setSourceFromXdmNode'".
Looks like a bug. You could try using the PyXslt30Processor class instead
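A possible shape for that workaround, as a sketch only. It assumes the processor exposes new_xslt30_processor() and that PyXslt30Processor.transform_to_string accepts the same stylesheet_file and xdm_node keyword arguments as the failing call above. The function name transform_with_xslt30 is hypothetical, and nothing here has been verified against a particular SaxonC release:

```python
def transform_with_xslt30(proc, stylesheet_path, xml_text):
    """Transform an in-memory XML string using the XSLT 3.0 processor class.

    proc is an existing PySaxonProcessor. Assumption: new_xslt30_processor()
    and the stylesheet_file / xdm_node keywords are available in the
    installed version.
    """
    xslt30 = proc.new_xslt30_processor()
    # Parse the XML received in the event instead of reading it from a file.
    node = proc.parse_xml(xml_text=xml_text)
    return xslt30.transform_to_string(
        stylesheet_file=stylesheet_path,
        xdm_node=node,
    )
```

In a Lambda handler, xml_text would be the XML taken from the incoming event, and proc would be a processor created once outside the handler.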
SaxonC 12.0 test release now available. See issue: https://github.com/tennom/saxonpy/issues/6
May be worth clarifying the usage of this class, as currently after releasing the processor you cannot recreate it in the same process. A contrived example:
Running this results in the following output:
This appears to be a limitation of the JNI API; see https://stackoverflow.com/a/66936249