copumpkin / nixpkgs

Nix Packages collection

A pretty deep/annoying problem with multiple versions of the same thing #51

Open copumpkin opened 9 years ago

copumpkin commented 9 years ago

This one is going to take ingenuity. I've spotted evidence of it in various places so far, but the most recent is in libu2f-host. If you echo foo | u2f-host -osendrecv, you'll get an obscure segfault in objc_msgSend coming from a call to some CF functions from IOKit, which libu2f-host uses to talk to the HID device (it's a library for talking to a yubikey token).

The issue is that the final library links both against our own CF library (which maintains global state of some sort that I haven't looked into yet) and, via the impure IOKit, against the systemwide CF. Somehow the global state of the two CFs interferes, and the resulting program crashes. If I relink the generated executable (and its library dependencies) to only the systemwide CF, everything works fine.

I can think of three approaches to this problem:

Any thoughts/advice, @shlevy, @gridaphobe, @joelteon, @jwiegley?

copumpkin commented 9 years ago

I think the CF problem (the only place I've observed this so far) can be summarized this way (from a cursory glance):

  1. CF's runtime maintains an internal private table of classes registered with it
  2. Each class knows where it comes from
  3. An instance created under one CF can't migrate to another CF (doesn't seem unreasonable)
  4. We occasionally migrate instances
  5. Boom!
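The five steps above can be illustrated with a toy model. This is a sketch only, not CoreFoundation code: `ToyRuntime`, `register_class`, `create`, and `retain` are hypothetical names, and the real failure is a segfault in objc_msgSend rather than a tidy exception.

```python
class ToyRuntime:
    """Toy stand-in for one loaded copy of a CF-like runtime (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self._classes = {}  # step 1: private table of registered classes

    def register_class(self, class_name):
        # Step 2: the class record remembers which runtime it belongs to.
        record = {"name": class_name, "owner": self}
        self._classes[class_name] = record
        return record

    def create(self, class_name):
        # An instance points at the class record of the runtime that made it.
        return {"isa": self._classes[class_name]}

    def retain(self, obj):
        # Step 3: a runtime only accepts instances of its own class records.
        record = obj["isa"]
        if self._classes.get(record["name"]) is not record:
            raise RuntimeError(
                f"{self.name}: instance belongs to {record['owner'].name}"
            )
        return obj


nix_cf = ToyRuntime("nix CF")
system_cf = ToyRuntime("system CF")
for runtime in (nix_cf, system_cf):
    runtime.register_class("CFString")

s = nix_cf.create("CFString")   # created under one CF
nix_cf.retain(s)                # fine: same runtime
try:
    system_cf.retain(s)         # step 4: the instance migrates
except RuntimeError as err:
    print("Boom:", err)         # step 5
```

Both runtimes register a class with the same name, yet the records are distinct objects, which is why sharing a symbol name does not make the instances interchangeable.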

My inclination is to do some hackery in our CFRuntime.c to detect whether another CF is loaded and, if so, delegate all of our CF's functionality to that one (it will probably be ugly). If that sounds reasonable to others, I can give it a go.
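A rough sketch of the delegation idea, again as a toy model rather than anything resembling the real CFRuntime.c. The `_loaded` list is a hypothetical stand-in for whatever mechanism would detect a second CF in the process, and the load order shown is an assumption.

```python
_loaded = []  # hypothetical: which CF copies are already in the process


class ToyRuntime:
    """Toy CF-like runtime that forwards to an earlier-loaded copy, if any."""

    def __init__(self, name):
        self.name = name
        self._classes = {}
        # Detect an already-loaded copy; if there is one, it becomes primary.
        self._primary = _loaded[0] if _loaded else self
        _loaded.append(self)

    def register_class(self, class_name):
        if self._primary is not self:
            return self._primary.register_class(class_name)
        # Reuse the existing record so both copies agree on one class table.
        return self._classes.setdefault(class_name, {"name": class_name})

    def create(self, class_name):
        if self._primary is not self:
            return self._primary.create(class_name)
        return {"isa": self._classes[class_name]}

    def owns(self, obj):
        if self._primary is not self:
            return self._primary.owns(obj)
        return self._classes.get(obj["isa"]["name"]) is obj["isa"]


system_cf = ToyRuntime("system CF")  # assumed to load first
nix_cf = ToyRuntime("nix CF")        # detects system_cf and delegates to it
system_cf.register_class("CFString")
nix_cf.register_class("CFString")

obj = nix_cf.create("CFString")      # really created by the system copy
print(system_cf.owns(obj))
```

With every call forwarded, only one class table is ever populated, so an instance created through either copy is valid in both.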

jwiegley commented 9 years ago

This calls to mind a segmentation fault which I've seen all too many times with Python. In that case, the Python library that you link against must be the same as the dynamic library that gets loaded at runtime, or else there's an initialization collision on startup. I think it has to do with the number of facets that are assumed to have been allocated during initialization of the C++ standard library.

The reason I mention this is that you may want to give some thought to finding a more general solution to the problem, since I have a sneaking suspicion you're going to run into it in other areas as well.

wmertens commented 9 years ago

You should rename this issue to "fun with singletons" :)

Would another option be to prefix all methods of the CF runtime so you build a parallel pure-darwin universe? This will mean fixing up include files etc as well of course.


copumpkin commented 9 years ago

@jwiegley are you talking about the one with MACOSX_DEPLOYMENT_TARGET? That's a different problem that I haven't tracked down yet. It crashes dyld before it even loads libraries like CoreFoundation, and I can reproduce it with no nix in the picture at all. I decided to put that on hold until I could compile a debug version of dyld.

copumpkin commented 9 years ago

@wmertens that would work if they were actually overlapping symbols, but the issue is that libhidapi instantiates a CF object using our compiled CF, then passes that object into IOKit, which then uses methods from the system CF on the object and crashes. I think my approach will work though, if I can pull off the appropriate shenanigans.

shlevy commented 9 years ago

What exactly do you mean by your "broader solution" idea? Failing early in this case? Or actually working around it in nix?

copumpkin commented 9 years ago

I mean some sort of knowledge internal to nix. Something like propagatedBuildInputs, except propagatedIncompatibilityResolution. Then libhidapi would notice that some of its dependencies depend on one CF with propagatedIncompatibilityResolution, and that other dependencies are on a different CF (there would be some notion of a unique identifier that the two would need to share). When it notices this, it would go and override its inputs from one to the other.
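To make the propagatedIncompatibilityResolution idea concrete, here is a minimal sketch in Python rather than Nix. Nothing here is an existing nixpkgs feature: the dependency graph, the package names, and the `unique_id` mapping are all made up for illustration.

```python
def closure(pkg, deps):
    """Return pkg plus everything it depends on, transitively."""
    seen, stack = set(), [pkg]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(deps.get(cur, []))
    return seen


def conflicts(pkg, deps, unique_id):
    """Find unique identifiers provided by more than one package in the closure."""
    providers = {}
    for p in closure(pkg, deps):
        key = unique_id.get(p)
        if key is not None:
            providers.setdefault(key, set()).add(p)
    return {k: sorted(v) for k, v in providers.items() if len(v) > 1}


def resolve(deps, replacement):
    """Override inputs: swap each losing provider for the chosen one."""
    return {p: [replacement.get(d, d) for d in ds] for p, ds in deps.items()}


# Hypothetical graph loosely modeled on the thread's scenario.
deps = {
    "libu2f-host": ["libhidapi"],
    "libhidapi": ["nix-CF", "IOKit"],
    "IOKit": ["system-CF"],
}
# Both CFs share one identifier, marking them as mutually exclusive.
unique_id = {"nix-CF": "CoreFoundation", "system-CF": "CoreFoundation"}

found = conflicts("libhidapi", deps, unique_id)
print(found)

fixed = resolve(deps, {"nix-CF": "system-CF"})
print(conflicts("libhidapi", fixed, unique_id))
```

The hard part the sketch skips is choosing which provider wins and rebuilding everything downstream of the override, which is where the stdenv bootstrap concern comes in.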

Seems pretty gnarly, especially if stuff goes all the way back to the stdenv bootstrap. I'm leaning towards just patching our CF instead, since I don't think this thing will arise much.

shlevy commented 9 years ago

Ah, yeah. That's an interesting idea, having some way to specify that one dependency is incompatible with another on the nix level?

jwiegley commented 9 years ago

No, the issue I'm talking about happened to me a long time ago during Ledger development, but it used to happen with such frequency that I added it to our FAQ.

copumpkin commented 9 years ago

Interesting! Have more details? Which FAQ is that?

jwiegley commented 9 years ago

It's mentioned in https://github.com/ledger/ledger/blob/next/INSTALL.md