Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

dl_unload_all_files() revisited #1718

Open p5pRT opened 24 years ago

p5pRT commented 24 years ago

Migrated from rt.perl.org#2968 (status was 'open')

Searchable as RT2968$

p5pRT commented 24 years ago

From @AlanBurlison

My current understanding is that this patch broke DBD::Oracle on Linux - I'm not quite sure why. I can see a potential problem if a module uses call_atexit(), and that then uses a bit of XS that has already been dlclosed, but for DBD::Oracle this doesn't seem to be the case.

I (respectfully!) disagree with Sarathy's assertion that dlclosing() stuff when the interpreter exits is an 'unnecessary overhead' - it is an *essential* part of the cleanup, much in the same way as freeing up everything that has been malloced is. It is alright to not free and not dlclose if you know that the process is going to exit anyway, but in the case of an embedded perl interpreter this is not necessarily the case.

I'd like to get the dlclose stuff into a state where it is on by default, as not having it there generates a particularly abstruse type of bug - it took the best part of a year to figure out why mod_perl dumped core when built with APXS.

I'd like some informed suggestions of the best way of doing this without causing collateral damage - anyone have any ideas?

Alan Burlison

p5pRT commented 24 years ago

From @timbunce

On Sun, Apr 02, 2000 at 11:05:41AM +0100, Alan Burlison wrote:

> My current understanding is that this patch broke DBD::Oracle on Linux - I'm not quite sure why.

I'll happily help as far as I can if you send me details.

(Oracle does very weird things with its libraries, and then does different weird things in the following release).

> I can see a potential problem if a module uses call_atexit(), and that then uses a bit of XS that has already been dlclosed, but for DBD::Oracle this doesn't seem to be the case.

> I (respectfully!) disagree with Sarathy's assertion that dlclosing() stuff when the interpreter exits is an 'unnecessary overhead' - it is an *essential* part of the cleanup, much in the same way as freeing up everything that has been malloced is. It is alright to not free and not dlclose if you know that the process is going to exit anyway, but in the case of an embedded perl interpreter this is not necessarily the case.

I think Sarathy meant it's an 'unnecessary overhead' when _not_ embedding.

> I'd like to get the dlclose stuff into a state where it is on by default,

When embedding, yes.

> as not having it there generates a particularly abstruse type of bug - it took the best part of a year to figure out why mod_perl dumped core when built with APXS.

> I'd like some informed suggestions of the best way of doing this without causing collateral damage - anyone have any ideas?

Do dl_unload_all_files() if Perl_destruct_level > 0.

Tim.
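
A minimal sketch of that gating, written as a standalone C model rather than as the actual perl source. Every name here (model_load, model_unload_all, model_destruct) is hypothetical, and the destruct_level argument merely stands in for the interpreter's destruct-level setting; the real change would live in the interpreter teardown path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are perl API. */
#define MODEL_MAX_LIBS 8

void *loaded_libs[MODEL_MAX_LIBS];  /* handles recorded at load time */
size_t n_loaded = 0;
size_t n_unloaded = 0;              /* counts simulated dlclose() calls */

/* Record a handle as loaded. Returns 0, or -1 if the table is full. */
int model_load(void *handle)
{
    if (n_loaded == MODEL_MAX_LIBS)
        return -1;
    loaded_libs[n_loaded++] = handle;
    return 0;
}

/* Simulated dl_unload_all_files(): pop and "close" every handle. */
void model_unload_all(void)
{
    while (n_loaded > 0) {
        loaded_libs[--n_loaded] = NULL;
        n_unloaded++;
    }
    assert(n_loaded == 0);
}

/* Interpreter teardown. Unload only when destruct_level > 0, that is,
 * when the host process is expected to keep running (embedding). */
void model_destruct(int destruct_level)
{
    if (destruct_level > 0)
        model_unload_all();
    /* Otherwise the process is about to exit and the OS reclaims it all. */
}
```

With a level of 0 the handles are simply left for process exit; with a level above 0 every recorded handle is released before the interpreter goes away.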

p5pRT commented 12 years ago

From @Hugmeir

I can't find the context behind this report. There... seems to be a solution for the poster's issue at the end, but does anyone know if this is still an issue?

Does anyone know what the issue _was_? : /

p5pRT commented 11 years ago

From @tonycoz

From reading the thread, apparently we (were|are) not releasing loaded shared libraries when we clean up the interpreter.

This isn't a problem when perl is running standalone, but is a problem when perl is embedded, since we're continuing to use resources after the interpreter has been deleted.

As to the proposed solution, I wonder if freeing the libraries on interpreter destruction is safe - it should be sort-of-safe where the underlying library management tools reference count shared libraries (true for dlopen and MSWin32), but I don't think it's safe otherwise.

Also, there doesn't seem to be an XS entry point (like boot_Foo for module Foo) to do clean up for a loaded XS module, which means memory may leak anyway (and other worse things, like dangling signal handlers.)

Tony

p5pRT commented 11 years ago

From @nwc10

TL;DR: It's a mess.

On Fri, Aug 09, 2013 at 12:35:41AM -0700, Tony Cook via RT wrote:

> On Fri May 04 08:52:57 2012, Hugmeir wrote:

> > On Sun Apr 02 22:51:44 2000, RT_System wrote:

> > > On Sun, Apr 02, 2000 at 11:05:41AM +0100, Alan Burlison wrote:

[whether at exit the interpreter should attempt to dl_unload() all files it dl_load()ed]

> > > > I'd like some informed suggestions of the best way of doing this without causing collateral damage - anyone have any ideas?

> > > Do dl_unload_all_files() if Perl_destruct_level > 0.

> > > Tim.

> > I can't find the context behind this report. There... seems to be a solution for the poster's issue at the end, but does anyone know if this is still an issue?

As an aside, I can't figure out how this has actually come to be recorded as a ticket in RT, given that it seems to be a 2 message thread sent just to the mailing list, back in 2000:

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-04/msg00091.html

(Back before the ticketing system was on RT. So what's the old bug number?)

I've looked at this and the related thread "perl_atexit() considered useless"

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-09/msg00846.html

Relevant commits from the period are:

commit abb9e9dca5a5f1213886f2e81a42c9a565def727
Author: Gurusamy Sarathy <gsar@cpan.org>
Date:   Wed Mar 1 00:46:44 2000 +0000

    unload extension shared objects when exiting, implemented
    only for dl_dlopen.xs (from Alan Burlison)

    p4raw-id: //depot/perl@5381

commit 23d2500b2b45b1beddc8de6ccd7c60068286d061
Author: Gurusamy Sarathy <gsar@cpan.org>
Date:   Wed Mar 22 18:41:50 2000 +0000

    make unloading of extension shared objects (change#5381) a build
    option (use "Configure -Accflags=-DDL_UNLOAD_ALL_AT_EXIT" to enable)

    p4raw-link: @5381 on //depot/perl: abb9e9dca5a5f1213886f2e81a42c9a565def727

    p4raw-id: //depot/perl@5885

Both are just before v5.6.0 was released. Nothing has been changed in the code since. (The documentation failed to note the second commit until yesterday)

It's a mess.

I can't actually work out what the problem was that Alan was reporting in the second thread ("perl_atexit() considered useless").

  I've been looking at memory errors in perl, and found that if
  DL_UNLOAD_ALL_AT_EXIT is defined, it causes heap problems on exit. The
  cause is that the AV holding the references to the dlopen() handles has
  been reclaimed before the perl atexit processing takes place to
  dlclose() the handles. I could fix this by keeping a list of the dlopen
  handles locally in DynaLoader.xs, but I think there is a more general
  problem here - what use is perl_atexit() if it is called after globals
  have been destroyed?

perl_atexit() is called after object destruction, but *before* all the unblessed SVs get destroyed, and the AV in question is @DynaLoader::dl_librefs, which is not blessed. So what he describes shouldn't happen.

I've also built blead from that era with gcc 4.8.1 and ASAN. Yes, this works. I was a bit surprised how easy it was.

The build and most of the tests pass. There are a couple of bugs tickled by regression tests which ASAN spots, but mostly 5.6.0 was pretty clean. (No doubt in large part because Alan did a lot of work with purify around that time to identify and fix problems.) But ASAN can't find any sort of heap error. So I'm confused.

I think I can see how the DBD::Oracle problem might be happening. It's likely the same sort of thing as this bit of code in DynaLoader.t:

  SKIP: {
      skip "unloading unsupported on $^O", 2
          if ($old_darwin || $^O eq 'VMS');
      my $module = pop @loaded_modules;
      skip "File::Glob sets PL_opfreehook", 2 if $module eq 'File::Glob';
      my $r = eval { DynaLoader::dl_unload_file($libref) };

in that, if DBD::Oracle is using something (probably tie magic) that is a pointer into the shared object held by an SV (or similar) that survives until late in global destruction, then the point at which perl_atexit() runs is before then, and so it creates dangling pointers.
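
The ordering problem can be sketched with a small standalone C model. It is only an illustration under assumptions: the four phases follow the sequence described above (objects destroyed, perl_atexit() callbacks, remaining SVs freed, then a hypothetical late hook), and model_lib, model_sv and model_global_destruction are invented names, not perl internals. Rather than actually jumping through a stale pointer, the model counts how many frees would have needed code that is already unmapped.

```c
#include <assert.h>

/* A shared object that is either still mapped or already unloaded. */
struct model_lib { int mapped; };

/* A value whose cleanup needs code living in code_owner, in the way tie
 * magic might point at functions inside an XS shared object. */
struct model_sv { struct model_lib *code_owner; int freed; };

/* Free one value. Returns 1 if the code it needed was already unmapped,
 * which would be a crash in real life, else 0. */
int model_sv_free(struct model_sv *sv)
{
    assert(!sv->freed);
    sv->freed = 1;
    return sv->code_owner->mapped ? 0 : 1;
}

/* Run the modelled destruction sequence and return the number of frees
 * that hit unmapped code. */
int model_global_destruction(int unload_at_atexit)
{
    struct model_lib lib = { 1 };
    struct model_sv sv = { &lib, 0 };
    int dangling = 0;

    /* Phase 1: blessed objects are destroyed (none in this model). */

    /* Phase 2: perl_atexit() callbacks run. */
    if (unload_at_atexit)
        lib.mapped = 0;

    /* Phase 3: the remaining, unblessed SVs are freed. */
    dangling += model_sv_free(&sv);

    /* Phase 4: hypothetical late hook, after all SVs are gone. */
    if (!unload_at_atexit)
        lib.mapped = 0;

    assert(lib.mapped == 0);
    return dangling;
}
```

Calling model_global_destruction(1) reports one dangling free, while model_global_destruction(0) reports none, which is the whole argument for unloading later than perl_atexit().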

DL_UNLOAD_ALL_AT_EXIT is not the default. I'm not sure if anyone is building with it. Certainly, no-one is building with it and ithreads, because there is an explosion of fail if you do. The problem is that DynaLoader is using perl_atexit() to unmap shared libraries, and that's per *interpreter* exit. So as soon as an ithread is spawned, its exit triggers the unmapping of all shared objects, and the next time the parent thread tries to use one, kaboom!

I suspect that no-one is building with it, because since this (innocent) commit,

commit 667763bdbf37a30596512ca0a08a720d86c7e2a8
Author: Father Chrysostomos <sprout@cpan.org>
Date:   Thu Oct 4 21:56:00 2012 -0700

    Make PerlIO::encoding more resilient to buffer changes

ext/PerlIO-encoding/t/encoding.t will start SEGVing, because something still in use during global destruction is unmapped.

The issues are sort-of threefold

1) As ithreads are effectively forks as far as the interpreter goes, but are just threads as far as the OS knows, just calling perl_atexit() is the wrong point to call any sort of unloading, because it will happen in each ithread.
   We (at least) need to only do it in the top level interpreter.
   However, we'd still leak if child interpreters dl_open() more things, so I guess that this means that ideally it ought to be a shared structure tracking this stuff.
2) perl_atexit() fires too early during global destruction to be the right place to run this. As it's C-only code that needs running, it ought to be really late in global destruction, after all SVs have been freed.
   But the point at which perl_atexit() runs is (differently) useful so it can't be changed. It's sort of feeling like we need two (or more?) hook points in the global destruction sequence.
3) As to dl_unload_file() at any other time, and as demonstrated by File::Glob, we have a mess. Right now, it's never safe to unload arbitrary modules
   a) because we don't have a way to tell them to free up things
   b) possibly because they know that they can't (eg OP free hooks)
   c) definitely because the way we currently do hooks, eg OP free hooks, we encourage modules to store the old hook value, and restore it later. Which means that any other module might have a pointer to File::Glob's innards, which File::Glob has no idea about

Point (3) doesn't matter for last-stage global destruction unloading, which I think was the specific problem that Alan was describing, and I suspect is the least impossible to fix.

I think to properly solve the "at interpreter exit" cleanup we'd really need to change dl_load_file() to internally track what it's been called for (tracking shared across threads), and (I think) how often, with ownership being upped on thread clone. At which point, interpreter exit is safe to call dl_unload_all_files(). But only really late.
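
As a rough illustration of that tracking, here is a standalone C model. It assumes one table shared by all interpreters plus a per-interpreter count of how often each handle was loaded; all names (reg_entry, reg_interp, reg_load, reg_clone, reg_interp_exit) are hypothetical, the "unload" is only simulated, and a real version would need a lock around the shared table.

```c
#include <assert.h>
#include <stddef.h>

#define REG_MAX 8

/* One slot per loaded shared object, shared by every interpreter. */
struct reg_entry {
    void *handle;   /* what dl_load_file() returned; NULL marks a free slot */
    int owners;     /* total loads still outstanding across interpreters */
};

/* Per-interpreter bookkeeping: how often this interpreter holds each slot. */
struct reg_interp { int held[REG_MAX]; };

struct reg_entry registry[REG_MAX];
int registry_unloads = 0;   /* stands in for real unload calls */

/* Record that interpreter in loaded handle. Returns the slot, or -1. */
int reg_load(struct reg_interp *in, void *handle)
{
    int free_slot = -1;
    assert(handle != NULL);
    for (int i = 0; i < REG_MAX; i++) {
        if (registry[i].handle == handle) {
            registry[i].owners++;
            in->held[i]++;
            return i;
        }
        if (registry[i].handle == NULL && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    registry[free_slot].handle = handle;
    registry[free_slot].owners = 1;
    in->held[free_slot] = 1;
    return free_slot;
}

/* Thread clone: the child inherits, and so also owns, the parent's loads. */
void reg_clone(const struct reg_interp *parent, struct reg_interp *child)
{
    for (int i = 0; i < REG_MAX; i++) {
        child->held[i] = parent->held[i];
        registry[i].owners += parent->held[i];
    }
}

/* Interpreter exit, to be run really late. Gives up this interpreter's
 * share and unloads whatever nobody owns any more. Returns the number of
 * objects unloaded. */
int reg_interp_exit(struct reg_interp *in)
{
    int unloaded = 0;
    for (int i = 0; i < REG_MAX; i++) {
        if (in->held[i] == 0)
            continue;
        registry[i].owners -= in->held[i];
        in->held[i] = 0;
        assert(registry[i].owners >= 0);
        if (registry[i].owners == 0) {
            registry[i].handle = NULL;   /* real code would unload here */
            registry_unloads++;
            unloaded++;
        }
    }
    return unloaded;
}
```

In this model a child thread exiting releases only what it loaded itself, and a library loaded by the parent stays mapped until the parent exits too, which avoids the kaboom described earlier.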

The current code is *in XS*, and kicks back to more XS code. I think that it could be replaced by pure Perl code (as-is) with no loss of functionality. But no actual bug fixes either.

  #ifdef DL_UNLOAD_ALL_AT_EXIT
  /* Close all dlopen'd files */
  static void
  dl_unload_all_files(pTHX_ void *unused)
  {
      CV *sub;
      AV *dl_librefs;
      SV *dl_libref;

      if ((sub = get_cvs("DynaLoader::dl_unload_file", 0)) != NULL) {
          dl_librefs = get_av("DynaLoader::dl_librefs", 0);
          while ((dl_libref = av_pop(dl_librefs)) != &PL_sv_undef) {
              dSP;
              ENTER;
              SAVETMPS;
              PUSHMARK(SP);
              XPUSHs(sv_2mortal(dl_libref));
              PUTBACK;
              call_sv((SV*)sub, G_DISCARD | G_NODEBUG);
              FREETMPS;
              LEAVE;
          }
      }
  }
  #endif

Part of the fun of changing things is that it's provable that other code out there already manipulates @DynaLoader::dl_librefs directly:

  http://grep.cpan.me/?q=dl_librefs

but I'm not sure if all code doing so

1) only adds *all* handles it got by calling dl_load_file()
2) takes no action based on @dl_librefs - it thinks that it's just doing bookkeeping for DynaLoader

ie - I've not looked at the code on CPAN closely enough to work out whether it would work to change dl_load_file() to internally track what it was called with, and then have dl_unload_all_files() use that and ignore @DynaLoader::dl_librefs.

We possibly also need *two* hooks into shared objects. One for "this interpreter is closing down", which would probably be XS code, and called just before object cleanup. And a second, C code, for "the last interpreter is now about to unload you". Although, botherit, we might also want an XS level hook for "we're about to start global destruction and you will be freed at the end".

Problem is that, like PerlIO, we want to be able to do things both when there still is an interpreter (to be able to use it), and after it's gone (because it was relying on things we provide).

We'd still be trusting shared objects not to leave signal handlers live. (and not to use atexit() routines, and I guess pthread_atfork() handlers, as it looks like neither can be de-registered)

To solve the more general problem of dl_unload_file() - ie to be able to unload shared objects *before* global destruction, I think that we'd also need to revisit how we're doing hooks (*all* of them), and provide some mechanism to register handlers for op_free, op_check, opcode overrides, so that *only* the interpreter is remembering pointers to extension code. We'd then need to "fix" everything on CPAN to be conformant.
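
One possible shape for such a mechanism, sketched as a standalone C model and not as a proposal for the real API. The assumption is that each module registers its handler with an owner token instead of saving and restoring the previous hook value, so that the interpreter's table is the only place holding pointers into extension code. All names (model_hook_register, model_hook_unregister_owner, and the two example modules) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define HOOK_MAX 8

/* Hypothetical hook signature; the void * stands in for an OP *. */
typedef void (*model_hook_fn)(void *op);

struct model_hook {
    model_hook_fn fn;    /* NULL marks a free slot */
    const void *owner;   /* token identifying the registering module */
};

/* The only place that remembers pointers into extension code. */
struct model_hook op_free_hooks[HOOK_MAX];

/* Register fn on behalf of owner. Returns 0, or -1 if the table is full. */
int model_hook_register(const void *owner, model_hook_fn fn)
{
    assert(fn != NULL);
    for (size_t i = 0; i < HOOK_MAX; i++) {
        if (op_free_hooks[i].fn == NULL) {
            op_free_hooks[i].fn = fn;
            op_free_hooks[i].owner = owner;
            return 0;
        }
    }
    return -1;
}

/* Drop every hook owned by a module that is about to be unloaded.
 * Returns the number of hooks removed. */
int model_hook_unregister_owner(const void *owner)
{
    int removed = 0;
    for (size_t i = 0; i < HOOK_MAX; i++) {
        if (op_free_hooks[i].fn != NULL && op_free_hooks[i].owner == owner) {
            op_free_hooks[i].fn = NULL;
            op_free_hooks[i].owner = NULL;
            removed++;
        }
    }
    return removed;
}

/* Interpreter side: call each registered hook for one op. */
void model_run_op_free_hooks(void *op)
{
    for (size_t i = 0; i < HOOK_MAX; i++) {
        if (op_free_hooks[i].fn != NULL)
            op_free_hooks[i].fn(op);
    }
}

/* Two example "modules", each with a hook that only counts its calls. */
int glob_calls = 0, other_calls = 0;
char glob_module, other_module;   /* their addresses serve as owner tokens */
void glob_hook(void *op)  { (void)op; glob_calls++; }
void other_hook(void *op) { (void)op; other_calls++; }
```

Because no module keeps a saved copy of another module's hook, removing one owner's entries cannot leave a stale pointer inside a different extension.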

And I think we'd also need those hooks described above, so that XS modules can be told "you need to clean up" and undo whatever is necessary to unlink themselves from the interpreter.

It's hard work to fix the first problem. The "fix everything on CPAN" means that it's impossible to ever completely fix the second. However, having a better hook system would be useful in its own right, even if we never get to the point of doing enough to fix the general dl_unload_file() issue.

Nicholas Clark

p5pRT commented 11 years ago

From @bulk88

On Fri Aug 09 00:35:40 2013, tonyc wrote:

> Also, there doesn't seem to be an XS entry point (like boot_Foo for module Foo) to do clean up for a loaded XS module, which means memory may leak anyway (and other worse things, like dangling signal handlers.)

END block? package var with a DESTROY? I think I once did it with an XS module by having an OS mutex protect a C reference count. To run the unloader, whatever Perl level thing triggered it has to obtain the lock, and the count must be 0; objects can't initialize from other perl threads because the mutex is being held. When the first thread clears all the static resources and releases the lock, the 2nd thread will reinitialize the DLL again. Also, -1 could be placed in the reference count to indicate to never reinitialize the DLL again in the process, and to return an error/undef at the Perl language level if an object new was tried. I am not referring to the Windows DLL loader, just Perl XS concepts.
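
Roughly, that scheme could look like the standalone C sketch below. It is a reconstruction from the description only, not the original module: xs_object_new, xs_object_destroy and xs_shutdown_forever are hypothetical names, and the "static resources" are reduced to a single flag.

```c
#include <assert.h>
#include <pthread.h>

/* The mutex guards both the count and the (re)initialization state. */
pthread_mutex_t xs_lock = PTHREAD_MUTEX_INITIALIZER;
int xs_refs = 0;          /* live objects; -1 means never initialize again */
int xs_initialized = 0;   /* are the static resources currently set up? */

/* Called from the object constructor. Returns 0 on success, or -1 if the
 * library has been shut down for good (the Perl level would see undef). */
int xs_object_new(void)
{
    int rc = 0;
    pthread_mutex_lock(&xs_lock);
    if (xs_refs < 0) {
        rc = -1;
    } else {
        if (xs_refs == 0 && !xs_initialized)
            xs_initialized = 1;   /* first or repeated initialization */
        xs_refs++;
    }
    pthread_mutex_unlock(&xs_lock);
    return rc;
}

/* Called from DESTROY. The last object out clears the static resources. */
void xs_object_destroy(void)
{
    pthread_mutex_lock(&xs_lock);
    assert(xs_refs > 0);
    if (--xs_refs == 0)
        xs_initialized = 0;
    pthread_mutex_unlock(&xs_lock);
}

/* Permanent shutdown. Only allowed while no objects are alive.
 * Returns 0 on success, -1 if objects still exist. */
int xs_shutdown_forever(void)
{
    int rc = -1;
    pthread_mutex_lock(&xs_lock);
    if (xs_refs == 0) {
        xs_initialized = 0;
        xs_refs = -1;
        rc = 0;
    }
    pthread_mutex_unlock(&xs_lock);
    return rc;
}
```

Holding the lock across the whole check-and-update is what keeps a second thread from constructing an object while the first is tearing the static resources down.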

Windows has DllMain which is always called at DLL unload time, but I don't think that exists on any other OS.

related link http://www.nntp.perl.org/group/perl.perl5.porters/2000/02/msg8051.html

-- bulk88 ~ bulk88 at hotmail.com

p5pRT commented 11 years ago

From @nwc10

On Thu, Aug 22, 2013 at 07:09:36AM -0700, bulk88 via RT wrote:

> On Fri Aug 09 00:35:40 2013, tonyc wrote:

> > Also, there doesn't seem to be an XS entry point (like boot_Foo for module Foo) to do clean up for a loaded XS module, which means memory may leak anyway (and other worse things, like dangling signal handlers.)

> END block? package var with a DESTROY? I think I once did it with an XS

As Ilya Z noted in one of the threads, Alan's initial suggestion of END is way too early. END is run before objects are cleaned up. A package var with a DESTROY would trigger *while* objects were being cleared up, which is still too early, as any other objects relying on C or XS code in that library might not yet be cleared up. (This is the same bug as Reini figured out in Storable)

Ilya Z noted that it really needs to be done after everything Perl-like is cleared up:

  Alan Burlison writes:
  > > Couldn't it be done in an END block inside DynaLoader?

  Too early. The only safe place (if any) should be
  after-the-end-of-global-destruction (assuming the list is kept in a C
  structure, thus is indestructible).

  Ilya

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-01/msg00843.html

which sort of gets asked again by Sarathy, to which Alan replies "pass":

  >> There was some doubt in my mind whether call_atexit() would do the job
  >> properly. What happens if the dll has allocated SVs? The call_atexit()
  >> callbacks are called after objects are destroyed but *before* the SV
  >> arenas are deallocated. Do you foresee any problems from that?
  >
  >Pass. Does that mean that @dl_librefs will have been reclaimed?

  Not unless @dl_librefs is an object.

  > If so,
  >the patch seems to work remarkably well considering it is popping stuff
  >of a free'd AV. I don't think the dll allocating SVs has anything to do
  >with it - dlclose() doesn't do any sort of cleanup [...]

  OK, I think that answers my question.

  >I wrote the patch based on what I thought was the consensus on the
  >correct way to do this - was I mislead?

  Doesn't look much like it. I just want to be sure we aren't overlooking
  any sharp corners.

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-02/msg01594.html

I think that we are overlooking potential sharp corners, and that Ilya Z remains correct - if we are going to unload dynamic libraries, we have to do it very late, after we've cleaned up all perl-related data structures, just before we free the interpreter structure.

> http://www.nntp.perl.org/group/perl.perl5.porters/2000/02/msg8051.html

Thanks for the link

Nicholas Clark