Closed tornillo closed 8 years ago
Interesting. Can't reproduce the segfault here locally. Valgrind has plenty to complain about, though. I will have to enable debug symbols and see which of these reports could be related to this.
Can't reproduce the segfault here locally.
The above example does not cause a segfault in your environment?
It didn't with node 4.1.1 in release mode, but it does now with node 5.0.0 in debug mode. Not sure whether it's the version or the build mode which is to blame. Might also be my other CPU usage, since it's likely related to asynchronous execution. Running the thing through gdb, it seems as if some internal xml document data structures were no longer available. Which I'd take as an indication that the document has been cleaned up by garbage collection while libxslt was still using it.
The instance indicated by gdb just now was somewhere in ApplyWorker. Looking at that, we see that there is a pointer to some docSource which is referred to using a libxmljs::XmlDocument* pointer. Not a JavaScript reference of some kind. If you look at the documentation for Nan::AsyncWorker you'll notice it has a number of functions called SaveToPersistent which simply store an object to a given key on some persistent object. Why would one want to use that if one could just as well use C++ members in a derived class? Because having the JavaScript objects stored in a persistent JavaScript object tells the V8 garbage collector that you're still using them. node-libxslt doesn't do so.
I'd say it's up to you whether you store JavaScript objects to the persistent storage and their contained native objects to additional members, or whether you use the persistent storage exclusively and unwrap objects from there during execution only. Either approach should work in my opinion, and the former should be less work in the short run. So I'd say take all the arguments to ApplyAsync in their raw JavaScript form without any unwrapping, and simply store them persistently with the worker, in addition to the existing processing. Will open a pull request for this.
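To illustrate the same idea from the JavaScript side: the sketch below keeps the stylesheet and the source strongly reachable until the callback fires, which is roughly the user-land analogue of storing them persistently with the worker. The names applyPinned and pinned are hypothetical, not part of node-libxslt or Nan, and whether this alone avoids the crash is an assumption; the proper fix belongs in the C++ worker.

```javascript
// Hypothetical user-land helper, not part of node-libxslt or Nan.
// Anything stored in this set stays reachable, so V8 cannot collect it.
const pinned = new Set();

function applyPinned(stylesheet, source, callback) {
  // One token per call, so overlapping calls that share objects
  // do not release each other's references early.
  const token = { stylesheet: stylesheet, source: source };
  pinned.add(token);
  stylesheet.apply(source, function (err, result) {
    // Release only after the asynchronous work has reported back.
    pinned.delete(token);
    callback(err, result);
  });
}
```

It would be called like `applyPinned(xslObject, xml, function (err, result) { ... })` in place of a direct `xslObject.apply(...)`.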
With this PR my test example now works fine, but my real application with big xsl files keeps segfaulting under synthetic high-load tests...
/home/t/test/node_modules/segfault-handler/build/Release/segfault-handler.node(+0x1aca)[0x7f260c060aca]
/lib64/libpthread.so.0(+0xf130)[0x7f260fe47130]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltEvalAVT+0x32)[0x7f260d28c222]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltAttrListTemplateProcess+0xf5)[0x7f260d2a3a55]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x12123)[0x7f260d277123]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x11ef6)[0x7f260d276ef6]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x11ef6)[0x7f260d276ef6]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x15832)[0x7f260d27a832]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltProcessOneNode+0x78)[0x7f260d27af08]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltApplyTemplates+0x4cc)[0x7f260d27bc4c]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x11ef6)[0x7f260d276ef6]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x15832)[0x7f260d27a832]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltProcessOneNode+0x78)[0x7f260d27af08]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x1990b)[0x7f260d27e90b]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(_ZN11ApplyWorker7ExecuteEv+0x43)[0x7f260d2754d3]
PID 5588 received SIGSEGV for address: 0x78
/home/t/test/node_modules/segfault-handler/build/Release/segfault-handler.node(+0x1aca)[0x7f260c060aca]
/lib64/libpthread.so.0(+0xf130)[0x7f260fe47130]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltEvalAVT+0x32)[0x7f260d28c222]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltAttrListTemplateProcess+0xf5)[0x7f260d2a3a55]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x12123)[0x7f260d277123]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x11ef6)[0x7f260d276ef6]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x11ef6)[0x7f260d276ef6]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x15832)[0x7f260d27a832]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltProcessOneNode+0x78)[0x7f260d27af08]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltApplyTemplates+0x4cc)[0x7f260d27bc4c]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x11ef6)[0x7f260d276ef6]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x15832)[0x7f260d27a832]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(xsltProcessOneNode+0x78)[0x7f260d27af08]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(+0x1990b)[0x7f260d27e90b]
/home/t/test/node_modules/libxslt/build/Release/node-libxslt.node(_ZN11ApplyWorker7ExecuteEv+0x43)[0x7f260d2754d3]
node[0xfd11a1]
node[0xfdf449]
/lib64/libpthread.so.0(+0x7df5)[0x7f260fe3fdf5]
/lib64/libc.so.6(clone+0x6d)[0x7f260fb6d1ad]
node[0xfd11a1]
node[0xfdf449]
/lib64/libpthread.so.0(+0x7df5)[0x7f260fe3fdf5]
/lib64/libc.so.6(clone+0x6d)[0x7f260fb6d1ad]
Can you try some of this:
cd node_modules/libxmljs-mt
node-gyp rebuild -d
ln -s Debug build/Release
cd -
node-gyp rebuild -d
This should enable debug symbols for libxmljs and node-libxslt. Hopefully this will give us a line number for the segfault. It would be even better if you could attach gdb to the process, one way or another. And when the segfault occurs, see exactly what data structure is affected.
Of course, it would also be nice to have a small reproducing example, like the previous one, so I can try that out for myself.
@gagern thank you for clarifying what to debug! I'll try to attach gdb a bit later, but for now here's a new test example, which segfaults on every run in my development environment (NodeJS 5.0.0)
require('segfault-handler').registerHandler('crash.log');
var libxslt = require('libxslt');
var xml = '<?xml version="1.0" encoding="utf-8"?>\n<root></root>';
var transform = function () {
    libxslt.parseFile('./test.xsl', function (err, xslObject) {
        xslObject.apply(xml, function (err, result) {});
    });
};
var counter = 1e3;
while (counter--) {
    transform();
}
test.xsl
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <xsl:apply-templates select="root">
            <xsl:with-param name="total">1000</xsl:with-param>
        </xsl:apply-templates>
    </xsl:template>
    <xsl:template match="root">
        <xsl:param name="current">1</xsl:param>
        <xsl:param name="total" />
        <xsl:value-of select="generate-id()" />
        <xsl:if test="$current &lt; $total">
            <xsl:apply-templates select="../root">
                <xsl:with-param name="current" select="$current + 1" />
                <xsl:with-param name="total" select="$total" />
            </xsl:apply-templates>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
I can reproduce this.
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffefd87700 (LWP 23189)]
0x00007ffff43a098d in xmlStrEqual (
str1=0x7475223d676e6965 <error: Cannot access memory at address 0x7475223d676e6965>,
str2=0x7fffefdeb820 "http://www.w3.org/1999/XSL/Transform")
at ../vendor/libxml/xmlstring.c:162
162 if (*str1++ != *str2) return(0);
(gdb) bt
#0 0x00007ffff43a098d in xmlStrEqual (str1=0x7475223d676e6965 <error: Cannot access memory at address 0x7475223d676e6965>, str2=0x7fffefdeb820 "http://www.w3.org/1999/XSL/Transform") at ../vendor/libxml/xmlstring.c:162
#1 0x00007fffefda6eba in xsltApplySequenceConstructor (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, list=0x12d3138, templ=0x0) at ../deps/libxslt/libxslt/transform.c:2616
#2 0x00007fffefdad1d2 in xsltIf (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, inst=0x12d2ec8, castedComp=0x7fffd8001a38) at ../deps/libxslt/libxslt/transform.c:5464
#3 0x00007fffefda7033 in xsltApplySequenceConstructor (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, list=0x12cc478, templ=0x7fffd8003608) at ../deps/libxslt/libxslt/transform.c:2647
#4 0x00007fffefda80be in xsltApplyXSLTTemplate (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, list=0x12cc478, templ=0x7fffd8003608, withParams=0x7fffd4e31ee8) at ../deps/libxslt/libxslt/transform.c:3108
#5 0x00007fffefda6888 in xsltProcessOneNode (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, withParams=0x7fffd4e31ee8) at ../deps/libxslt/libxslt/transform.c:2097
#6 0x00007fffefdac910 in xsltApplyTemplates (ctxt=0x7fffd4de61b8, node=0x1d81fe8, inst=0x12d3138, castedComp=0x7fffd8001b98) at ../deps/libxslt/libxslt/transform.c:5141
#7 0x00007fffefda7033 in xsltApplySequenceConstructor (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, list=0x12d3138, templ=0x0) at ../deps/libxslt/libxslt/transform.c:2647
#8 0x00007fffefdad1d2 in xsltIf (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, inst=0x12d2ec8, castedComp=0x7fffd8001a38) at ../deps/libxslt/libxslt/transform.c:5464
#9 0x00007fffefda7033 in xsltApplySequenceConstructor (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, list=0x12cc478, templ=0x7fffd8003608) at ../deps/libxslt/libxslt/transform.c:2647
#10 0x00007fffefda80be in xsltApplyXSLTTemplate (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, list=0x12cc478, templ=0x7fffd8003608, withParams=0x7fffd4e31c38) at ../deps/libxslt/libxslt/transform.c:3108
#11 0x00007fffefda6888 in xsltProcessOneNode (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, withParams=0x7fffd4e31c38) at ../deps/libxslt/libxslt/transform.c:2097
#12 0x00007fffefdac910 in xsltApplyTemplates (ctxt=0x7fffd4de61b8, node=0x1d81fe8, inst=0x12d3138, castedComp=0x7fffd8001b98) at ../deps/libxslt/libxslt/transform.c:5141
#13 0x00007fffefda7033 in xsltApplySequenceConstructor (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, list=0x12d3138, templ=0x0) at ../deps/libxslt/libxslt/transform.c:2647
#14 0x00007fffefdad1d2 in xsltIf (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, inst=0x12d2ec8, castedComp=0x7fffd8001a38) at ../deps/libxslt/libxslt/transform.c:5464
#15 0x00007fffefda7033 in xsltApplySequenceConstructor (ctxt=0x7fffd4de61b8, contextNode=0x1d81fe8, list=0x12cc478, templ=0x7fffd8003608)
This is part of a 1500+ frame stack trace, whose depth might be contributing to the problem in some way I don't understand yet. Frame #1 (the second from the top) is at transform.c:2616, inside the expansion of IS_XSLT_ELEM. So I'd say the problem occurs while the application is looking at the XSLT document, not the XML input. For some reason, some part of it apparently has disappeared, again. Needs more investigation, but I guess as one of my next steps I'd try to generate some output when the XML document gets freed, to see whether any document gets completely freed even though it's still in use.
Not resolved, as indicated by https://github.com/albanm/node-libxslt/issues/28#issuecomment-154877818 and subsequent comments.
I'd guess that the remaining issue is because the stylesheet refers to the xml document from which it was constructed, but on the v8 level the stylesheet object doesn't preserve a reference to the xml document object. Note the comment in the ~Stylesheet destructor:
// We can't free the stylesheet as the xml doc inside was probably
// already deleted by garbage collector and this results in segfaults
Well, that problem isn't restricted to the destructor alone; with enough memory pressure, the xml document will be gone by the time the stylesheet is used. I tried adding a Nan::Persistent reference from the stylesheet to the wrapper for the xml document. Doesn't work yet, still investigating why.
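Roughly what such a reference is meant to achieve, sketched in plain JavaScript rather than with Nan::Persistent: the stylesheet wrapper holds on to the document it was built from, so the document stays reachable for as long as the stylesheet itself is. RetainingStylesheet and the compile argument are made up for illustration and do not exist in node-libxslt.

```javascript
// Hypothetical illustration; none of these names exist in node-libxslt.
class RetainingStylesheet {
  constructor(sourceDoc, compile) {
    // Strong reference: while this wrapper is reachable, so is the
    // document, and the garbage collector will not reclaim it.
    this.sourceDoc = sourceDoc;
    // `compile` stands in for whatever turns a parsed document
    // into a compiled stylesheet.
    this.compiled = compile(sourceDoc);
  }
}
```

This only ties the document's lifetime to the wrapper's; the wrapper itself still has to stay reachable while a transformation is running.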
@tornillo, would you please give #31 a try? On your small example it works fine.
@gagern With #31, the high-load test passed without segfaults on my project. Wow! Thank you!:)
@tornillo A pleasure. Looking forward to the next release here.
Hi!
I have a project with a complex structure of xsl and xml data. The production environment is not high-load (less than 10 rps). A segfault occurs once every 300-1000 requests, but every time when I try to run synthetic performance tests.
To reproduce this error, I've created an isolated example, in which the error occurs on different versions of Node (0.12, 4, 5), but more often on 4-5. (libxslt is compiled from master to work with Node 4-5)
Error logs are different from run to run.
Log1:
Log2:
@gagern maybe you can help with this strange error?:)