Closed akshunj closed 5 years ago
On 06/23/2017 01:57 PM, akshunj wrote:
After pulling the latest CentOS kernel through yum (2.6.32-696.3.2.el6.x86_64) pljava crashes on any attempt to use it. I am able to deploy various versions of pljava including the latest 1.6.0-snapshot without any problem. I tried rebuilding against the latest updates to postgres and java, but the issue persists. I am wondering if anyone else has observed this behavior? If I roll back to the previous kernel the issue goes away.
Hi,
That's the first I've heard of it. I take it this is an oldish CentOS release, to be using the 2.6.32 kernel?
Usually when Java crashes, somewhere in the crash message it will give you the name and path of a file, named something like hs_err_pid followed by the process ID.
Would you be willing to attach that file (after first skimming through it to be sure nothing sensitive from your operation shows up in the data it includes)?
It sure sounds like something went sideways in preparing the -696.3.2 kernel changes, but the hs_err might give more info on exactly what.
Or, have you checked whether yum also has an update to your Java that might work with whatever was changed in the kernel?
-Chap
Hi Chap,
The hs_err is attached to the original post. I am using the latest JDK from Oracle (not OpenJDK).
On 06/23/2017 02:21 PM, akshunj wrote:
The hs_err is attached to the original post.
My apologies. I replied to an email notification of the post, which didn't include the attachment.
-Chap
Oh, my bad, I did not realize.
It certainly seems as if the new kernel changed something, but at the moment I have no clear idea what. I would try changing a few of the things that can be changed, just to see whether there is any way to avoid triggering whatever happens in this new kernel. These probably will not change anything, but they are worth a try.
How about starting PL/Java in a fresh session after doing an explicit
SET pljava.vmoptions TO '';
Looking at the VM options that are set, I get the impression they may have been set from some time ago and earlier Java or PL/Java versions. I would be interested to see if starting over fresh with empty pljava.vmoptions will make any difference.
I am also curious about all of the Java-related directories added both to PATH and to LD_LIBRARY_PATH. Are those there because something else in your environment needs them? PL/Java doesn't. I wonder what would happen in a new session with a plain vanilla PATH and no LD_LIBRARY_PATH. Of course that is more disruptive than just SET pljava.vmoptions in a new session; the usual way would be to stop postgres and restart it after setting a vanilla PATH and unsetting LD_LIBRARY_PATH.
Come to think of it, there might be a nondisruptive way. Start a new session, use something like PL/Perl to unset PATH and LD_LIBRARY_PATH in that backend's environment, then call a PL/Java function.
Again, not expecting either attempt to work a miracle, but might gather some information.
By the way, what is returned by
\sf sqlj.java_call_handler
?
-Chap
... the actual signal being raised:
si_signo: 7 (SIGBUS), si_code: 2 (BUS_ADRERR), si_addr: 0x00007ffd145b3f80
has caught an attempt to access memory in an unmapped region just south of the stack:
7ffd144bd000-7ffd144c0000 ---p 00000000 00:00 0
7ffd145c0000-7ffd145bd000 rw-p 00000000 00:00 0 [stack]
7ffd145df000-7ffd145e0000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Vendors have been recently hardening kernels against possible attacks that involve accesses near the gap between stack and other stuff, so the new kernel may well have rearranged some furniture in that area. The surprise is that Java would be doing something that the new arrangement would trip up. Have you checked for a recent newer build of Oracle JDK?
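To make the "unmapped region just south of the stack" reading concrete, here is a small sketch that checks the reported si_addr against the four map lines quoted above. The class and method names are made up for illustration, and each range is read with its smaller bound as the start, whichever order the two bounds are printed in.

```java
import java.util.Arrays;
import java.util.List;

public class FaultAddressCheck {
    // Memory map lines copied from the excerpt quoted in the thread.
    static final List<String> EXCERPT = Arrays.asList(
        "7ffd144bd000-7ffd144c0000 ---p 00000000 00:00 0",
        "7ffd145c0000-7ffd145bd000 rw-p 00000000 00:00 0 [stack]",
        "7ffd145df000-7ffd145e0000 r-xp 00000000 00:00 0 [vdso]",
        "ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]");

    // Returns {low, high} for one map line, taking the smaller bound as the start.
    static long[] bounds(String line) {
        String[] ends = line.trim().split("\\s+")[0].split("-");
        long a = Long.parseUnsignedLong(ends[0], 16);
        long b = Long.parseUnsignedLong(ends[1], 16);
        return Long.compareUnsigned(a, b) <= 0 ? new long[] {a, b} : new long[] {b, a};
    }

    // True if addr falls inside any listed region (addresses compared as unsigned).
    static boolean isMapped(long addr, List<String> lines) {
        for (String line : lines) {
            long[] r = bounds(line);
            if (Long.compareUnsigned(addr, r[0]) >= 0
                    && Long.compareUnsigned(addr, r[1]) < 0) {
                return true;
            }
        }
        return false;
    }

    // Bytes from addr up to the nearest region that starts above it, or -1 if none.
    static long gapToNextMapping(long addr, List<String> lines) {
        long best = 0;
        boolean found = false;
        for (String line : lines) {
            long low = bounds(line)[0];
            if (Long.compareUnsigned(low, addr) > 0
                    && (!found || Long.compareUnsigned(low, best) < 0)) {
                best = low;
                found = true;
            }
        }
        return found ? best - addr : -1;
    }

    public static void main(String[] args) {
        long siAddr = Long.parseUnsignedLong("00007ffd145b3f80", 16);
        System.out.println("mapped: " + isMapped(siAddr, EXCERPT));
        System.out.println("bytes below next region: " + gapToNextMapping(siAddr, EXCERPT));
    }
}
```

Run against the excerpt, the address is not inside any listed region and sits 0x9080 (36,992) bytes below the start of the region labelled [stack], which fits the description of an access just under the stack.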
Found this: https://access.redhat.com/solutions/3091371
Chap, thanks, I'll have to check this out!
I can't access the Red Hat solutions link, but other reports of the issue online suggest adding -Xss2M (or larger) to the VM options to make the per-thread stack at least 2 MB in size. What the new kernel does apparently is to increase the size of the "Stack Guard" region below the stack in a way that Java blunders into if the initial stack size isn't big enough.
According to the docs, this option is a hard stack size setting, not a minimum. Not only will the stack begin at that size, it also can't grow. So if whatever PL/Java is being used for might require more than 2 MB of stack, the option may need to be increased further to avoid stack overflow errors.
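As a rough way to see what a fixed stack size means in practice, the sketch below runs unbounded recursion in a thread created with a requested stack size and reports how deep it got before StackOverflowError. The class and method names are hypothetical, and the stackSize argument of the Thread constructor only stands in for -Xss here; the JVM is allowed to treat it as a hint, so the numbers will vary by platform.

```java
public class StackDepthProbe {
    // Recurses until the thread's stack is exhausted, counting frames as it goes.
    private static void descend(int[] counter) {
        counter[0]++;
        descend(counter);
    }

    // Returns the recursion depth reached in a thread that asked for stackBytes of stack.
    static int maxDepth(long stackBytes) throws InterruptedException {
        final int[] result = new int[1];
        Thread probe = new Thread(null, () -> {
            int[] counter = new int[1];
            try {
                descend(counter);
            } catch (StackOverflowError expected) {
                // The stack did not grow; the recursion simply ran out of room.
            }
            result[0] = counter[0];
        }, "stack-probe", stackBytes);
        probe.start();
        probe.join(); // join() makes the write to result visible to this thread
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("depth with 2 MB requested: " + maxDepth(2L * 1024 * 1024));
        System.out.println("depth with 8 MB requested: " + maxDepth(8L * 1024 * 1024));
    }
}
```

If PL/Java code recurses more deeply than the first figure suggests is safe, that is a sign the -Xss value may need to be raised beyond 2M.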
I assume this is an interim solution, and Oracle will eventually release a Java update that doesn't blunder into the stack guard, and then -Xss won't have to be explicitly set.
In your specific case, I would still be interested in the output from
\sf sqlj.java_call_handler
and in trying to simplify your pljava.vmoptions settings ... maybe starting with a simple
SET pljava.vmoptions TO '-Xss2M';
seeing if that works, then maybe adding back other tuning options as you need them, referring to the PL/Java VM options page for ideas. Turning on class data sharing is likely a win.
Thanks Chap, the workaround in the RHEL article you linked does indeed fix the problem. I guess we'll have to wait for a permanent fix in the JDK.
Thanks for the confirmation. If you were able to see the whole RHEL article, did it have any other suggestions, or just the -Xss2M option I saw in other posts I was able to read?
What does your
\sf sqlj.java_call_handler
say, by the way?
Yes, I'll paste the article in a follow-up reply. I haven't tried \sf yet, but I can try later.
I'm not sure how this cut and paste will look on the mailing list, so my apologies if it's rubbish:
JVM crashes after updating to kernel with patch for Stack Guard flaw.
SOLUTION UNVERIFIED - Updated Yesterday at 11:18 AM - English https://access.redhat.com/solutions/3091371
Environment
Issue
Resolution
The current workaround is to increase the Thread stack size of the JVM using -Xss2m. This will require you to restart the JVM.
Research is being performed on a permanent solution.
Product(s)
Red Hat Enterprise Linux https://access.redhat.com/taxonomy/products/red-hat-enterprise-linux
Component
tomcat7 https://access.redhat.com/taxonomy/components/tomcat7
tomcat8 https://access.redhat.com/taxonomy/components/tomcat8
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
Hi Chap,
I ran \sf sqlj.java_call_handler and get the following output:
myPgDb01=# \sf sqlj.java_call_handler
CREATE OR REPLACE FUNCTION sqlj.java_call_handler()
RETURNS language_handler
LANGUAGE c
AS 'pljava', $function$java_call_handler$function$
Hi,
I had a sneaking suspicion. It appears that, while you reported building/installing several different PL/Java versions, the configuration you've got inside PostgreSQL is somehow partially updated and doesn't reflect that. Starting with 1.5.0, the dynamic library has been named with a version, so the output would look something like this in the expected case:
CREATE OR REPLACE FUNCTION sqlj.java_call_handler()
RETURNS language_handler
LANGUAGE c
AS 'libpljava-so-1.5.1-BETA1', $function$java_call_handler$function$
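For anyone checking several databases, the comparison can be automated along these lines. This is only a sketch: the class and method names are invented, and the "libpljava-so-" prefix test is an assumption drawn from the single example above rather than a documented naming rule.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HandlerLibraryCheck {
    // Captures the library name from the AS '...' clause of \sf output.
    private static final Pattern AS_CLAUSE = Pattern.compile("\\bAS\\s+'([^']+)'");

    // Returns the library named in the function definition, or null if none is found.
    static String libraryName(String sfOutput) {
        Matcher m = AS_CLAUSE.matcher(sfOutput);
        return m.find() ? m.group(1) : null;
    }

    // Assumption: versioned libraries start with "libpljava-so-", as in the example above.
    static boolean looksVersioned(String library) {
        return library != null && library.startsWith("libpljava-so-");
    }

    public static void main(String[] args) {
        String reported = "CREATE OR REPLACE FUNCTION sqlj.java_call_handler() "
            + "RETURNS language_handler LANGUAGE c "
            + "AS 'pljava', $function$java_call_handler$function$";
        String lib = libraryName(reported);
        System.out.println(lib + " versioned? " + looksVersioned(lib));
    }
}
```

An unversioned name such as 'pljava' suggests the handler was declared by an older installation method and has not been updated since.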
Have you been using CREATE EXTENSION, or an older, pre-9.1 installation approach? What does
\dx pljava
say? For that matter, what does
SELECT * FROM pg_available_extension_versions WHERE name = 'pljava';
say?
I think at the moment this particular DB is using an older version, installed with the old deploy.jar method. I used the CREATE EXTENSION method earlier:
MyPgDb01=# \dx pljava
List of installed extensions
 Name | Version | Schema | Description
------+---------+--------+-------------
(0 rows)

MyPgDb01=# SELECT * FROM pg_available_extension_versions WHERE name = 'pljava';
  name  |    version     | installed | superuser | relocatable | schema | requires |                           comment
--------+----------------+-----------+-----------+-------------+--------+----------+--------------------------------------------------------------
 pljava | 1.6.0-SNAPSHOT | f         | t         | f           | sqlj   |          | PL/Java procedural language (https://tada.github.io/pljava/)
(1 row)
Got it. So, I'd suggest going to a nice extension installation of either 1.5.0 (if you like a stable, final release) or 1.5.1-BETA1 (if you like beta testing). You should be able to just build your choice of those, run the self-installer jar, and see it show up in pg_available_extension_versions, and then (making sure you're in a fresh session where no PL/Java code has run yet), run
CREATE EXTENSION pljava VERSION '1.5.0' FROM unpackaged;
(or '1.5.1-BETA1' if you prefer), and it should preserve all your existing PL/Java stuff and bring it all to a consistent, extension-packaged, released version. You should be able to see with \dx and \sf that it happened.
After that's done, it should be possible to look at pruning those Java-related entries in PATH and LD_LIBRARY_PATH that I suspect are vestiges of your old Deployer installation; current PL/Java works without them. (But maybe they are there for something else you're using.)
-Chap
Spot on, used to need that library path to do the old make install back in the day. It's survived hundreds of vm template builds :)
Java builds that do not require the -Xss hack have been out for many months now. Closing.
hs_err_pid2820.zip
Hi,
After pulling the latest CentOS kernel through yum (2.6.32-696.3.2.el6.x86_64) pljava crashes on any attempt to use it. I am able to deploy various versions of pljava including the latest 1.6.0-snapshot without any problem. I tried rebuilding against the latest updates to postgres and java, but the issue persists. I am wondering if anyone else has observed this behavior? If I roll back to the previous kernel the issue goes away. I attached the output from the crash.
Thanks.