garu / Clone

recursively copy Perl datatypes
7 stars 10 forks source link

Clone-0.40 crashes DBIx::Class #10

Closed h3kker closed 5 years ago

h3kker commented 5 years ago

Hi! I was trying to update my environment to the latest versions, but I failed to install DBIx::Class after updating Clone to 0.40. Some tests suddenly crash with segmentation faults:

Tested with perl-5.28.0 and perl-5.24.3 After downgrading to Clone-0.39 the tests run fine. Thanks, Heinz

Kolunchik commented 5 years ago

Confirm, perl 5.24.3, amd64

atoomic commented 5 years ago

going to check what's wrong there

atoomic commented 5 years ago

need a debug perl but can reproduce the issue while running perl -Ilib t/19retrieve_on_insert.t under gdb with perl 5.28.0 and DBIx::Class v0.082841

# gdb output
ok 1 - Default insert successful
ok 2 - Without retrieve_on_insert, check rank

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7a07d15 in Perl_newSVhek () from /usr/local/cpanel/3rdparty/perl/528/lib/perl5/5.28.0/x86_64-linux-64int/CORE/libperl.so

using a debug perl, the SEGV is happening from HEK_FLAGS(hek)

0x00007ffff799c54e in Perl_newSVhek (hek=0xe295f8) at sv.c:9368
  >│9368            const int flags = HEK_FLAGS(hek);
(gdb) p *hek
$2 = {
  hek_hash = 2074141892,
  hek_len = 67108868,
  hek_key =     "r"
}
(gdb) bt
#0  0x00007ffff799c54e in Perl_newSVhek (hek=0xcc0748) at sv.c:9368
#1  0x00007ffff7967827 in Perl_hv_pushkv (hv=0x1e8c430, flags=3) at hv.c:1029
#2  0x00007ffff7978695 in S_padhv_rv2hv_common (hv=0x1e8c430, gimme=3 '\003', is_keys=false, has_targ=true) at pp_hot.c:
1783
#3  0x00007ffff797960e in Perl_pp_rv2av () at pp_hot.c:2004
#4  0x00007ffff7972179 in Perl_runops_standard () at run.c:41
#5  0x00007ffff7895e68 in S_run_body (oldscope=1) at perl.c:2694
#6  0x00007ffff7895945 in perl_run (my_perl=0x603010) at perl.c:2617
#7  0x0000000000400e7a in main (argc=3, argv=0x7fffffffdf28, env=0x7fffffffdf48) at perlmain.c:122

the hek_len being set to 67108868 is wrong and this is why we got a SEGV there... would need to know where it's coming from...

atoomic commented 5 years ago

no surprise it's coming from 4435df7edb07a6ac2084af5f4eda24a128d968b0

atoomic commented 5 years ago

the hek_key is for the 'rand' HE which is COWed and clone three times...

HV 0x1e88198: keys=4 ; max=7
bucket #0: 0x1e918c0 [ 1 element(s) ]
  name
bucket #4: 0x1e91968 [ 1 element(s) ]
  artistid
bucket #6: 0x1e918a8 [ 1 element(s) ]
  charfield
bucket #7: 0x1e918d8 [ 1 element(s) ]
  rank
atoomic commented 5 years ago

it appears that the COWRefcnt on the COWed PV is 0... clone is called from Hash::Merge

clone called...  at /usr/local/cpanel/3rdparty/perl/528/lib/perl5/cpanel_lib/Hash/Merge.pm line 223, <ARGV> line 1.
    Hash::Merge::merge() called at lib/DBIx/Class/ResultSet.pm line 3972
    DBIx::Class::ResultSet::_merge_attr() called at lib/DBIx/Class/ResultSet.pm line 647
    DBIx::Class::ResultSet::_normalize_selection() called at lib/DBIx/Class/ResultSet.pm line 336
    DBIx::Class::ResultSet::new() called at lib/DBIx/Class/Storage/DBI.pm line 2032
    DBIx::Class::Storage::DBI::insert() called at lib/DBIx/Class/Row.pm line 408
    DBIx::Class::Row::insert() called at lib/DBIx/Class/ResultSet.pm line 2894
    DBIx::Class::ResultSet::create() called at t/19retrieve_on_insert.t line 27
    Test::Exception::lives_ok() called at t/19retrieve_on_insert.t line 27
atoomic commented 5 years ago

Strangely we can get some COWPV with a COWREFCNT=0... not sure why and if it's real valid, the current fix will simply leave alone such PV...

we can consider fixing the clone or also the source when we see such a thing but not sure it's a good idea for now... putting that patch here to do not lost it if needed later

diff --git a/Clone.xs b/Clone.xs
index 0e840f1..78721fa 100755
--- a/Clone.xs
+++ b/Clone.xs
@@ -198,6 +198,13 @@ sv_clone (SV * ref, HV* hseen, int depth)

         } else {
           clone = newSVsv (ref);
+
+          /* just fixed our clone... */
+          if ( SvIsCOW(clone) && !SvOOK(clone) && CowREFCNT(clone) == 0 ) {
+            SvIsCOW_off(clone); /* drop the COW flag on the clone if CowREFCNT == 0 */
+              //SvIsCOW_off(ref);
+          }
+
         }
 #else
         clone = newSVsv (ref);
atoomic commented 5 years ago

after some extra checks the suggested patch is probably not good enough, I would suggest to revert 4435df7 for now

atoomic commented 5 years ago

d9fad1a seems to be a better alternate for now, the addition was to bypass the COW clone when LEN=0

eserte commented 5 years ago

Also a possible victim of newer Clone versions: MBRADSHAW/Mail-Milter-Authentication-2.20181024.tar.gz Some tests in the test suite fail with core dumps:

#0  0x00000000005263a0 in Perl_bytes_to_utf8 ()
#1  0x00000000004d075b in Perl_newSVhek ()
#2  0x00000000004b639d in Perl_hv_iterkeysv ()
#3  0x00000000422094b8 in sv_clone () from /usr/perl5.24.0p/lib/site_perl/5.24.0/amd64-freebsd/auto/Clone/Clone.so
#4  0x0000000042209541 in sv_clone () from /usr/perl5.24.0p/lib/site_perl/5.24.0/amd64-freebsd/auto/Clone/Clone.so
#5  0x00000000422094d4 in sv_clone () from /usr/perl5.24.0p/lib/site_perl/5.24.0/amd64-freebsd/auto/Clone/Clone.so
#6  0x0000000042209541 in sv_clone () from /usr/perl5.24.0p/lib/site_perl/5.24.0/amd64-freebsd/auto/Clone/Clone.so
#7  0x0000000042209118 in XS_Clone_clone () from /usr/perl5.24.0p/lib/site_perl/5.24.0/amd64-freebsd/auto/Clone/Clone.so
#8  0x00000000004c211c in Perl_pp_entersub ()
#9  0x00000000004ba063 in Perl_runops_standard ()
#10 0x0000000000442338 in perl_run ()
#11 0x0000000000422c66 in main ()
eserte commented 5 years ago

And another possible failure: https://rt.cpan.org/Ticket/Display.html?id=127441

garu commented 5 years ago

Thanks @h3kker, @Kolunchik and @eserte for identifying and confirming the issue, and thanks @atoomic for the thorough debugging and fixing. After merging the fix I was able to install DBI, DBIx::Class without issues on 5.28.0.

I was not able to install MBRADSHAW/Mail-Milter-Authentication-2.20181024.tar.gz nor QBit::Application::Model::DBManager but checking the error log I saw nothing to do with Clone.

Please fetch Clone 0.41 on a repository near you!

@eserte is there a way to mark 0.40 as unfit for smoking?

atoomic commented 5 years ago

@eserte cc: @garu I can confirm the SEGV of QBit-Application-Model-DBManager-0.021 using Clone 0.40 and can also confirm that the bug is fixed with d9fad1a6322e655c6163e111c76e66f1275d3385

When debugging without the fix I can notice that this is the same issue while trying to do a clone on a SvPV tagged COWPV but having a len=0...

BEFORE Flags 0x10004403 CUR 2 LEN 10 CowREFCNT 1 PV 0x247f6e0
BEFORE Flags 0x10004403 CUR 9 LEN 11 CowREFCNT 1 PV 0x1de6d60
BEFORE Flags 0x10004403 CUR 11 LEN 13 CowREFCNT 1 PV 0x1e24b10
BEFORE Flags 0x10004403 CUR 6 LEN 10 CowREFCNT 1 PV 0x1ec5fc0
BEFORE Flags 0x10004403 CUR 6 LEN 10 CowREFCNT 1 PV 0x1e83af0
BEFORE Flags 0x10004403 CUR 9 LEN 11 CowREFCNT 1 PV 0x1ebffc0
BEFORE Flags 0x10004403 CUR 11 LEN 13 CowREFCNT 1 PV 0x1661a90
BEFORE Flags 0x10004403 CUR 2 LEN 0 CowREFCNT 0 PV 0x1da08f0
[1]    25565 segmentation fault  perl t/qbit_application_model_dbmanager_fields.t

In short this is going to be fixed with the next release 0.41, but this is broken with 0.40

thanks

atoomic commented 5 years ago

@garu could we have a new release with that fix? otherwise I think it will bite more than one person thanks

thomasbalsloev commented 5 years ago

We are also seeing this bug with version 0.40 on our servers, so a new release would be greatly appreciated :)

garu commented 5 years ago

@atoomic @P9GIT Clone 0.41 was out when I wrote my comment above! Please check if it works for you :)

Kolunchik commented 5 years ago

0.41 works good for me, thanks!

thomasbalsloev commented 5 years ago

@garu

0.41 works for us too. Thanks!

eserte commented 5 years ago

@eserte is there a way to mark 0.40 as unfit for smoking?

IMHO it's better that affected 3rd party modules increase the minimum Clone version in their prerequisite lists (or alternatively make sure that the minimum version is set to 0.41 if 0.40 is currently installed). This way not only smokers will be "fixed", but also end user installations.