Closed packetmonkey closed 13 years ago
Would it be helpful if I posted the backtraces from the other ruby implementations?
Greetings!
Thanks very much for reporting this issue. Here's what we need to make it actionable:
nokogiri -v
(which will tell us what version of libxml you're running, among other things)If we can't reproduce it, we can't fix it! Thanks.
Here is the nokogiri -v information. Unfortunately the data set I'm currently using is a 7.2 gig xml export of a production database that I can't readily share (not to mention it's huge). I'll try to narrow down what i'm doing in the script to trigger the action and possibly try to hunt down the data in the xml that's causing it from that side.
---
warnings: []
nokogiri: 1.5.0.beta.4
libxml:
loaded: 2.7.3
binding: extension
compiled: 2.7.3
ruby:
platform: x86_64-apple-darwin10.6.0
version: 1.8.7
engine: rbx
Thanks. If reproducing with less data proves difficult, it would probably be useful to run your code under valgrind, and send us the output. I like to do:
valgrind --partial-loads-ok=yes --undef-value-errors=no ruby your-script.rb 2>&1 | tee valgrind.log
Also, feel free to contact me off-the-record if you need help. mike.dalessio@gmail.com
Here is the valgrind output. I used matz 1.9.2 because it had less noise than rubinius.
==11129== Memcheck, a memory error detector
==11129== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==11129== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info
==11129== Command: ruby measuring_cup_load.rb /Users/evansharp/Desktop/Storefront_20110307_DB.xml
==11129==
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/bin/ruby:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
==11129== Warning: ignored attempt to set SIGUSR2 handler in sigaction();
==11129== the SIGUSR2 signal is used internally by Valgrind
UNKNOWN __pthread_sigmask is unsupported. This warning will not be repeated.
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/x86_64-darwin10.6.0/enc/encdb.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/x86_64-darwin10.6.0/enc/trans/transdb.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/x86_64-darwin10.6.0/etc.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
--11129-- /Users/evansharp/.rvm/gems/ruby-1.9.2-p136/gems/nokogiri-1.4.4/lib/nokogiri/nokogiri.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/x86_64-darwin10.6.0/stringio.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/x86_64-darwin10.6.0/racc/cparse.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/x86_64-darwin10.6.0/strscan.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
**Notice: C extension not loaded. This is required for optimum MongoDB Ruby driver performance.
You can install the extension as follows:
gem install bson_ext
If you continue to receive this message after installing, make sure that the
bson_ext gem is in your load path and that the bson_ext and mongo gems are of the same version.
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/x86_64-darwin10.6.0/socket.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/x86_64-darwin10.6.0/digest/md5.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
--11129-- /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/x86_64-darwin10.6.0/digest.bundle:
--11129-- dSYM directory is missing; consider using --dsymutil=yes
==11129== Invalid read of size 1
==11129== at 0x10012869A: st_lookup (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x10018DF60: vm_exec_core (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100190AE2: vm_exec (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x10019DF8F: rb_yield (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x10002ADC1: rb_ary_each (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x10019CC92: vm_call_method (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100188E43: vm_exec_core (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100190AE2: vm_exec (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100191E5C: rb_vm_invoke_proc (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x10019CC92: vm_call_method (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100188E43: vm_exec_core (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100190AE2: vm_exec (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== Address 0x10 is not stack'd, malloc'd or (recently) free'd
==11129==
==11129==
==11129== Process terminating with default action of signal 11 (SIGSEGV)
==11129== General Protection Fault
==11129== at 0x1003E4299: dyld_stub_binder (in /usr/lib/libSystem.B.dylib)
==11129== by 0x10023300F: ??? (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x13804E323: ???
==11129== by 0x10018DF60: vm_exec_core (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100190AE2: vm_exec (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x10019DF8F: rb_yield (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x10002ADC1: rb_ary_each (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x10019CC92: vm_call_method (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100188E43: vm_exec_core (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100190AE2: vm_exec (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x100191E5C: rb_vm_invoke_proc (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
==11129== by 0x10019CC92: vm_call_method (in /Users/evansharp/.rvm/rubies/ruby-1.9.2-p136/lib/libruby.1.9.1.dylib)
--11129:0:schedule VG_(sema_down): read returned -4
==11129==
==11129== HEAP SUMMARY:
==11129== in use at exit: 9,947,158 bytes in 53,831 blocks
==11129== total heap usage: 241,683 allocs, 187,852 frees, 33,966,073 bytes allocated
==11129==
==11129== LEAK SUMMARY:
==11129== definitely lost: 20 bytes in 2 blocks
==11129== indirectly lost: 0 bytes in 0 blocks
==11129== possibly lost: 0 bytes in 0 blocks
==11129== still reachable: 9,947,138 bytes in 53,829 blocks
==11129== suppressed: 0 bytes in 0 blocks
==11129== Rerun with --leak-check=full to see details of leaked memory
==11129==
==11129== For counts of detected and suppressed errors, rerun with: -v
==11129== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
After a bit of work I have managed to narrow down where in the XML file slightly. The first round of tests showed the segfault occurring at the 56909th user record which is a little over 57k lines into this 20794220 line xml file. However a later test on an almost functionally identical script had the crash occurring after user 56121.
Changes like the ones I show below also seem to change where the script crashes. In this case, allowing the script to parse the full XML file (although in all fairness I don't do any work with what is parsed later in the file in this test script).
reader = Nokogiri::XML::Reader File.new('../Storefront_20110307_DB.xml', 'r')
reader.each do |node|
import_identity(node) and next if node.name == 'PFWeb::Identities__Row'
end
Changes to
reader = Nokogiri::XML::Reader File.new('../Storefront_20110307_DB.xml', 'r')
reader.each_with_index do |node,i|
puts "Line: #{i+1}"
import_identity(node) and next if node.name == 'PFWeb::Identities__Row'
end
Where import_identity is the code I use to parse and import that node, which hasn't changed during this revision. Commenting out the `puts "Line: #{i+1}" line causes the script to once again crash after user 56121.
Interesting isn't it? :)
I am however at a bit of a loss on how to continue to narrow down the problem as it doesn't appear to be an error in the actual XML parser so much as whatever internal memory management is done inside the compiled extension. How can I continue to help get this resolved?
Can you please provide valgrind output from using 1.8.7 on a Linux machine? Preferably using a debug build of ruby, but that's not absolutely necessary.
Again, useful valgrind options to reduce the "noisiness" of the output are:
valgrind --partial-loads-ok=yes --undef-value-errors=no ruby your-script.rb
This was run using ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux] on Ubuntu 10.04.1
==6859== Memcheck, a memory error detector
==6859== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==6859== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==6859== Command: ruby nokoseg.rb
==6859==
==6859== Invalid read of size 8
==6859== at 0x4EC8E23: st_lookup (st.c:250)
==6859== by 0x4E5FEDE: search_method (eval.c:486)
==6859== by 0x4E5FF66: rb_get_method_body (eval.c:507)
==6859== by 0x4E6B789: rb_call (eval.c:6150)
==6859== by 0x4E68690: rb_eval (eval.c:3506)
==6859== by 0x4E69DE8: rb_yield_0 (eval.c:5095)
==6859== by 0x4E483F0: rb_ary_each (array.c:1261)
==6859== by 0x4E6B663: rb_call0 (eval.c:5928)
==6859== by 0x4E6B852: rb_call (eval.c:6176)
==6859== by 0x4E68690: rb_eval (eval.c:3506)
==6859== by 0x4E68FD5: rb_eval (eval.c:3236)
==6859== by 0x4E6B512: rb_call0 (eval.c:6079)
==6859== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==6859==
/var/lib/gems/1.8/gems/nokogiri-1.4.4/lib/nokogiri/xml/reader.rb:99: [BUG] Segmentation fault
ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
==6859==
==6859== HEAP SUMMARY:
==6859== in use at exit: 5,964,486 bytes in 28,813 blocks
==6859== total heap usage: 132,707 allocs, 103,894 frees, 20,165,464 bytes allocated
==6859==
==6859== LEAK SUMMARY:
==6859== definitely lost: 60 bytes in 1 blocks
==6859== indirectly lost: 240 bytes in 10 blocks
==6859== possibly lost: 4,232 bytes in 52 blocks
==6859== still reachable: 5,959,954 bytes in 28,750 blocks
==6859== suppressed: 0 bytes in 0 blocks
==6859== Rerun with --leak-check=full to see details of leaked memory
==6859==
==6859== For counts of detected and suppressed errors, rerun with: -v
==6859== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
On a whim I also toyed a bit with the ruby garbage collector and memory allocation settings. I don't think I was able to have much of an effect on the problem but then agan I was just throwing numbers at various settings to see what effect it had.
An additional note, parsing the same document using a Nokogiri::XML::SAX::Document with Nokogiri::XML::SAX::Parser does not seem to cause a segfault. This leads me to think the problem isn't with the handling of the XML itself, but internal memory management inside the Reader class.
If you want us to be able to fix this issue, we need to be able to reproduce it. Can you help us reproduce it?
Anything I can do to help. It looks like the issue happens when the system runs out of free memory, so it could be an edge case when ruby can't allocate more memory. To test this I am going to create a VM with lower ram and try using a smaller XML file. I am fairly certain it has nothing to do with actually parsing the XML using nokogiri as mentioned in my previous comment.
Yes! You can help us by providing your code and helping us reproduce this issue. :)
If Reader is leaking memory, then I'd like to try to fix that. Or, if you're doing something in your code that uses the Reader node that's causing this, then I can probably tell you a workaround or fix the underlying issue.
Your code in previous comments calls a method import_identity
. Can you show us that code? And can you share your XML, or a representative sample, or an obfuscated sample with the same structure?
If your code or data are sensitive, then there are other options, including contacting me off-list.
I'd love to help you. Please help me help you. :)
I have managed to reduce the content and script to something reproducible. The segmentation fault does seem to trigger on machines with lower ram faster. I would run this script in a vm with a low amount of ram (256 megs), and it would crash. I would then run the exact same VM with more ram (512 megs or 1 gig) and it would still crash but at a much later time. Below are the steps to generate the XML file and the script which will induce the crash.
execute the following script as
$ ruby xmlgen.rb > test.xml
This script will generate a 322 meg XML file which mimics the schema used by the original database dump which first revealed the issue.
puts "<?xml version=\"1.0\"?>"
puts "<PFWeb:Database xmlns:PFWeb=\"http://www.pageflex.com/schemas/2004/Storefront/Database/20071115\" build-number=\"6.1.2.919\">"
puts " <PFWeb:Identities__Table>"
500000.times do |i|
puts " <PFWeb:Identities__Row PFWeb:IdentityID__ID=\"Identities_#{i}\" PFWeb:b_IsGroup=\"False\" PFWeb:b_GroupDisplaysAsList=\"False\" PFWeb:b_HasAccessToUserSite=\"True\" PFWeb:b_HasAccessToAdminSite=\"False\" PFWeb:b_IsActive=\"True\" PFWeb:b_IsArchived=\"True\" PFWeb:b_IsAnonymous=\"False\" PFWeb:b_PendingSelfRegistration=\"False\" PFWeb:LogonName__STR=\"dummy@example \" PFWeb:PasswordHashed__STR=\"111111111 \" PFWeb:DirectoryName__STR=\"dummy@example\" PFWeb:WhenCreated__ISO8601=\"2009-03-12T13:51:23\" PFWeb:WhenLastLoggedIn__ISO8601=\"2009-03-16T16:56:18\" />"
end
puts " </PFWeb:Identities__Table>"
puts "</PFWeb:Database>"
Once you have the test.xml file generated, execute this script to parse and output some data from that XML file. This managed to even cause a segfault on my Macbook Pro with 4 gigs of ram after roughly 8 minutes, so I would assume it will cause a crash on just about any machine you run it on, however as I said it will crash faster inside a VM with very limited RAM.
require 'rubygems'
require 'nokogiri'
require 'ap'
reader = Nokogiri::XML::Reader File.new('test.xml')
reader.each do |node|
tmp = Hash.new
tmp['name'] = node.name
tmp['id'] = node.attributes['IdentityID_ID']
tmp['username'] = node.attributes['LogonName__STR']
ap tmp
end
Please let me know if this managed to crash on your system so I know you can reproduce it as well. If not I will continue to try and provide less fragile test case but this version appears to be consistant (for me).
Hope this helps!
I've reproduced what appears to be your crash! Thanks for your help. I'll let you know what I dig up.
This appears to be related to the issue being worked around in https://github.com/tenderlove/nokogiri/issues/95
There is nothing quite as fun as rediscovering a 1.5 year old bug :) Is there anything else I can do to help you narrow down what is going on?
Ha! Well, this isn't a regression, it's just related. The core issue is that libxml2 frees the underlying node struct out from under the ruby object (which retains a pointer to the C struct). There's nothing we can do to avoid the dangling pointer, so we're left with trying to work around dereferencing it.
I've got an idea I should be able to throw up on a branch tonight.
On Wed, Apr 20, 2011 at 12:31 PM, packetmonkey < reply@reply.github.com>wrote:
There is nothing quite as fun as rediscovering a 1.5 year old bug :) Is there anything else I can do to help you narrow down what is going on?
Reply to this email directly or view it on GitHub: https://github.com/tenderlove/nokogiri/issues/439#comment_1033730
The previous commit (on branch 439-fix-reader-node-marking) partly fixes your issue, in that the GC phase will no longer try to follow the dangling node pointer.
However, there's a bigger issue here around string references. Your example is saving a reference to node.name
, which is a ruby string object wrapped around the libxml2 c-string pointer for the node. As the reader proceeds through the document, this c-string gets freed and the ruby string object is no longer valid, and will explode when it is GCed.
Need to talk to @tenderlove about a strategy here. Option 1 is for Nokogiri to strdup every string that gets returned from a reader node (and think very hard about related objects, such as attributes). Option 2 is to tell people (like yourself :)) to dup anything you're keeping a reference to, lest you segfault.
Option 1 is really what we should do, but it's going to be a bit of work.
Temporary workaround: dup the strings you're keeping around from the reader pass. I'm able to get your example to run cleanly by duping only node.name
.
I will give that a shot and confirm it works for me.
@packetmonkey - did you confirm that calling #dup on your Reader-returned strings addressed this issue?
Whoa, sorry. I thought I had updated this ticket, my bad.
Yes duping the strings out of reader allowed my scripts to finish executing, even on the original parser with the 7 GB export as well as my contrived scripts that seemed to reproduce the issue.
I am running into this same issue, and I can reproduce the segfault even if I dup the strings returned from the reader. @packetmonkey, I trimmed down your example a bit and turned it into the following unit test that I dropped into the test directory. Notice that I have added String#dup calls on all three of the strings coming out of the reader.
# -*- coding: utf-8 -*-
require "helper"
class TestReaderSegfault < Nokogiri::TestCase
def test_crash
xml = "<?xml version=\"1.0\"?>\n"
xml += "<Database>\n"
xml += " <Identities>\n"
100.times do |i|
xml += " <Row IdentityID=\"Identities_#{i}\" IsGroup=\"False\" GroupDisplaysAsList=\"False\" HasAccessToUserSite=\"True\" HasAccessToAdminSite=\"False\" IsActive=\"True\" IsArchived=\"True\" IsAnonymous=\"False\" PendingSelfRegistration=\"False\" LogonName=\"dummy@example\" PasswordHashed=\"111111111\" DirectoryName=\"dummy@example\" WhenCreated=\"2009-03-12T13:51:23\" WhenLastLoggedIn=\"2009-03-16T16:56:18\" />\n"
end
xml += " </Identities>\n"
xml += "</Database>\n"
100.times do |ii|
reader = Nokogiri::XML::Reader.from_memory(xml)
while reader.read
if reader.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
tmp = Hash.new
tmp['name'] = reader.name.dup
tmp['id'] = reader.attributes['IdentityID'] ? reader.attributes['IdentityID'].dup : nil
tmp['username'] = reader.attributes['LogonName'] ? reader.attributes['LogonName'].dup : nil
end
end
end
end
end
This consistently segfaults for me in master, the 439-fix-reader-node-marking branch, and the v1.4.4 tag. Here is an example segfault from when I was running in the 439-fix-reader-node-marking branch.
/home/nkriege/src/nokogiri/lib/nokogiri/xml/reader.rb:99: [BUG] Segmentation fault
ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-linux]
-- control frame ----------
c:0021 p:0014 s:0080 b:0077 l:000068 d:000076 BLOCK /home/nkriege/src/nokogiri/lib/nokogiri/xml/reader.rb:99
c:0020 p:---- s:0074 b:0074 l:000073 d:000073 FINISH
c:0019 p:---- s:0072 b:0072 l:000071 d:000071 CFUNC :each
c:0018 p:0023 s:0069 b:0069 l:000068 d:000068 METHOD /home/nkriege/src/nokogiri/lib/nokogiri/xml/reader.rb:99
c:0017 p:0018 s:0065 b:0064 l:000063 d:000063 METHOD /home/nkriege/src/nokogiri/lib/nokogiri/xml/reader.rb:90
c:0016 p:0139 s:0061 b:0059 l:000048 d:000058 BLOCK /home/nkriege/src/nokogiri/test/test_reader_segfault.rb:22
c:0015 p:---- s:0054 b:0054 l:000053 d:000053 FINISH
c:0014 p:---- s:0052 b:0052 l:000051 d:000051 CFUNC :times
c:0013 p:0069 s:0049 b:0049 l:000048 d:000048 METHOD /home/nkriege/src/nokogiri/test/test_reader_segfault.rb:15
c:0012 p:0063 s:0045 b:0045 l:000044 d:000044 METHOD /opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:695
c:0011 p:0091 s:0039 b:0039 l:000020 d:000038 BLOCK /opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:656
c:0010 p:---- s:0034 b:0034 l:000033 d:000033 FINISH
c:0009 p:---- s:0032 b:0032 l:000031 d:000031 CFUNC :each
c:0008 p:0026 s:0029 b:0029 l:000020 d:000028 BLOCK /opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:650
c:0007 p:---- s:0026 b:0026 l:000025 d:000025 FINISH
c:0006 p:---- s:0024 b:0024 l:000023 d:000023 CFUNC :each
c:0005 p:0082 s:0021 b:0021 l:000020 d:000020 METHOD /opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:649
c:0004 p:0188 s:0016 b:0016 l:000015 d:000015 METHOD /opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:609
c:0003 p:0057 s:0007 b:0007 l:000f88 d:000860 BLOCK /opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:508
c:0002 p:---- s:0004 b:0004 l:000003 d:000003 FINISH
c:0001 p:0000 s:0002 b:0002 l:0010f8 d:0010f8 TOP
---------------------------
-- Ruby level backtrace information ----------------------------------------
/opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:508:in `block in autorun'
/opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:609:in `run'
/opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:649:in `run_test_suites'
/opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:649:in `each'
/opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:650:in `block in run_test_suites'
/opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:650:in `each'
/opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:656:in `block (2 levels) in run_test_suites'
/opt/ruby-1.9.2/lib/ruby/1.9.1/minitest/unit.rb:695:in `run'
/home/nkriege/src/nokogiri/test/test_reader_segfault.rb:15:in `test_crash'
/home/nkriege/src/nokogiri/test/test_reader_segfault.rb:15:in `times'
/home/nkriege/src/nokogiri/test/test_reader_segfault.rb:22:in `block in test_crash'
/home/nkriege/src/nokogiri/lib/nokogiri/xml/reader.rb:90:in `attributes'
/home/nkriege/src/nokogiri/lib/nokogiri/xml/reader.rb:99:in `attribute_nodes'
/home/nkriege/src/nokogiri/lib/nokogiri/xml/reader.rb:99:in `each'
/home/nkriege/src/nokogiri/lib/nokogiri/xml/reader.rb:99:in `block in attribute_nodes'
-- C level backtrace information -------------------------------------------
/opt/ruby-1.9.2/bin/ruby(rb_vm_bugreport+0x9e) [0x52381e]
/opt/ruby-1.9.2/bin/ruby() [0x564f58]
/opt/ruby-1.9.2/bin/ruby(rb_bug+0xb1) [0x5650f1]
/opt/ruby-1.9.2/bin/ruby() [0x4b2878]
/lib/libpthread.so.0(+0xf8f0) [0x7f4d187b68f0]
/opt/ruby-1.9.2/bin/ruby(st_lookup+0xe) [0x4ba11e]
/opt/ruby-1.9.2/bin/ruby(rb_method_entry+0x6f) [0x50f4cf]
/opt/ruby-1.9.2/bin/ruby() [0x518521]
/opt/ruby-1.9.2/bin/ruby() [0x519149]
/opt/ruby-1.9.2/bin/ruby(rb_yield+0x66) [0x521a06]
/opt/ruby-1.9.2/bin/ruby(rb_ary_each+0x45) [0x533225]
/opt/ruby-1.9.2/bin/ruby() [0x512cff]
/opt/ruby-1.9.2/bin/ruby() [0x514436]
/opt/ruby-1.9.2/bin/ruby() [0x519149]
/opt/ruby-1.9.2/bin/ruby(rb_yield+0x66) [0x521a06]
/opt/ruby-1.9.2/bin/ruby() [0x44aad1]
/opt/ruby-1.9.2/bin/ruby() [0x512cff]
/opt/ruby-1.9.2/bin/ruby() [0x514436]
/opt/ruby-1.9.2/bin/ruby() [0x519149]
/opt/ruby-1.9.2/bin/ruby(rb_yield+0x66) [0x521a06]
/opt/ruby-1.9.2/bin/ruby(rb_ary_each+0x45) [0x533225]
/opt/ruby-1.9.2/bin/ruby() [0x512cff]
/opt/ruby-1.9.2/bin/ruby() [0x514436]
/opt/ruby-1.9.2/bin/ruby() [0x519149]
/opt/ruby-1.9.2/bin/ruby(rb_yield+0x66) [0x521a06]
/opt/ruby-1.9.2/bin/ruby(rb_ary_each+0x45) [0x533225]
/opt/ruby-1.9.2/bin/ruby() [0x512cff]
/opt/ruby-1.9.2/bin/ruby() [0x514436]
/opt/ruby-1.9.2/bin/ruby() [0x519149]
/opt/ruby-1.9.2/bin/ruby(rb_vm_invoke_proc+0x9f) [0x51c03f]
/opt/ruby-1.9.2/bin/ruby(rb_exec_end_proc+0x238) [0x41c188]
/opt/ruby-1.9.2/bin/ruby() [0x41c254]
/opt/ruby-1.9.2/bin/ruby(ruby_cleanup+0x12d) [0x41c3cd]
/opt/ruby-1.9.2/bin/ruby(ruby_run_node+0x3d) [0x41c6dd]
/opt/ruby-1.9.2/bin/ruby(main+0x49) [0x419c19]
/lib/libc.so.6(__libc_start_main+0xfd) [0x7f4d17b7ac4d]
/opt/ruby-1.9.2/bin/ruby() [0x419b09]
I am running ruby 1.9.2p180 on 64bit Ubuntu Lucid. Here is my version info as output by the test helper.
{
"warnings"=>[],
"nokogiri"=>"1.4.4.2",
"ruby"=>{"version"=>"1.9.2", "platform"=>"x86_64-linux", "description"=>"ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-linux]", "engine"=>"ruby"},
"libxml"=>{"binding"=>"extension", "compiled"=>"2.7.6", "loaded"=>"2.7.6"}
}
Are you duping your strings in some other way? Does my version of the test segfault for you?
Hey all. We actually know what the problems are. One is fixed on master and will be in 1.5.0.
The other problem with Reader is that it's returning Ruby Strings that are having the underlying C string swept out from under them by libxml2. The change to fix this is a little invasive, and has performance implications, so we're holding off on fixing until after 1.5.0 drops (this weekend hopefully?).
Reopening per my above comment -- commit e95a344 fixes one aspect of the Reader problem; but there is more work that needs to be done here.
When parsing a large ( 7.2 gig ) XMl file using Nokogiri::XML::Reader, I can reliably cause a segfault. Below is the rubinius crash report. I can generate a crash using matz ruby 1.8.7-p330, 1.9.2-p136 and rubinius 1.2.4dev. Below is the crash report generated by rubinius. I also can generate this error in both the latest stable version as well as version 1.5.0.beta.4 (which is what I used to generate this crash report).