ocaml / flexdll

a dlopen-like API for Windows
Other
100 stars 30 forks source link

Fix some bugs, add some features #54

Closed db4 closed 6 years ago

db4 commented 6 years ago

Fixes

Features

alainfrisch commented 6 years ago

as discussed in email conversation with Alain Frisch, Nov 25 2013

For future reference, and for interested readers, here is your original (excellent!) report:


Hi Alain,

I stumbled across an access violation bug caused by the way FlexDLL
performs relocations. When constructing dynamic relocation tables it
packs several consecutive relocation addresses into one "zone". When
performing actual relocations it allows write access via
VirtualProtect(zone_begin, zone_size, PAGE_EXECUTE_WRITECOPY) (BTW,
why not PAGE_EXECUTE_READWRITE?), then patches addresses and then
restore the old access rights via VirtualProtect(zone_begin,
zone_size, old_access). Unfortunately this is not entirely correct. I
see the situation where one zone covers 2 memory pages with different
access rights: PAGE_READONLY and PAGE_EXECUTE_READ. On restore the
second page will be set PAGE_READONLY and any attempt to execute code
from it will lead to an access violation error (as CPU supports NX and
DEP is enabled in the OS).

Relocation table that causes troubles:

Dynamic relocation table found at 634B13A0
 Non-writable relocation in zone 63483D57 -> 63483E83
 Non-writable relocation in zone 63483BB8 -> 63483CD5
 Non-writable relocation in zone 63480BE5 -> 634811B8
 634811B8 (kind:0100) (now:00C363E9)  caml_local_roots
 634811B4 (kind:0100) (now:00005E58)  caml_local_roots
 63480BE9 (kind:0100) (now:00000000)  caml_atom_table
 63480BE5 (kind:0100) (now:00000000)  caml_atom_table
 63483CD5 (kind:0102) (now:013D2900)  caml_copy_string
 63483CB8 (kind:0101) (now:0032D3F8)  w32_unicode_pack_to_string
 63483CA5 (kind:0102) (now:013FC224)  caml_atom_table
 63483C8C (kind:0101) (now:9DF62F20)  caml_initialize
 63483C4E (kind:0101) (now:9DF62BFE)  caml_alloc_shr
 63483BE9 (kind:0101) (now:0032D526)  w32_uuid_ml2c
 63483BB8 (kind:0101) (now:0032D4CB)  w32_uuid_c2ml
 63483E83 (kind:0102) (now:013FB4D0)  caml_local_roots
 63483E68 (kind:0102) (now:013FB4D0)  caml_local_roots
 63483D81 (kind:0102) (now:013FB4D0)  caml_local_roots
 63483D75 (kind:0102) (now:013FB4D0)  caml_local_roots
 63483D57 (kind:0102) (now:013FB4D0)  caml_local_roots

63480000 page was PAGE_READONLY
63481000 page was PAGE_EXECUTE_READ

Another problem is that one page can be covered by several zones. As
you restore access rights in the same order you modify them, they will
be incorrect in the end. I was able to fix this just replacing

  /* Restore permissions. Should do it also in case of failure... */
  for (wr = tbl->nonwr; wr->last != 0; wr++)
    allow_write(wr->first,wr->last + 4,wr->old,&wr->old);

with

  /* Restore permissions in the reverse order. Should do it also in
case of failure... */
  wr = tbl->nonwr;
  while ((wr+1)->last != 0)
    wr++;
  for (; wr >= tbl->nonwr; wr--)
    allow_write(wr->first,wr->last + 4,wr->old,&wr->old);

but I still have no idea how to fix the first problem. Now I just use

    allow_write(wr->first,wr->last + 4,(wr->old ==
PAGE_READONLY)?PAGE_EXECUTE_READ:wr->old,&wr->old);

as a workaround in hope that someone will fix it properly.

and my proposal fix strategy at that time:

1. First version:  one changes the right accesses for each relocation and immediately restore it after the patch is done.  This even simplifies the work of flexlink.exe, since it no longer needs to compute these "zones".

2. Second version: *if* it turns out to be too expensive to change the rights on memory pages for each relocation, one can implement the simplest strategy of checking if the next relocation falls in the same page as the previous one, and in this case, avoid the two useless calls to VirtualProtect.  One should then arrange so that flexlink.exe emits relocations in a sorted order (it might already be the case). 

I think one should evaluate the performance impact of changing access rights on each relocation. There can be many relocations, and I'm concerned that this could introduce a significant slowdown when loading a DLL. Have you tried to assess the performance impact?

dra27 commented 6 years ago

The putenv thing is surprising me, because putenv does call SetEnvironmentVariable (see the CRT sources). Do you have a repro case? I'm surprised, given where flexdll hooks the startup, that both CRTs are able to load and cache the environment in time for this to be a problem, but I guess I'm just not seeing it?

db4 commented 6 years ago

The putenv thing is surprising me, because putenv does call SetEnvironmentVariable (see the CRT sources).

Yes, you are right. I can't reproduce the problem with modern Visual Studio so I will exclude the questionable commit.

db4 commented 6 years ago

@dra, I think I've recalled when getenv/putenv failed for me in a mixed CRT environment. I had main non-OCaml exe linked with shared CRT, main OCaml dll linked with static CRT and OCaml plugin dll linked with shared CRT. Let's call then main_shared, ocaml_static and plugin_shared. Now main_shared loads ocaml_static that in turn loads plugin_shared via dlopen API. The problem is that when plugin_shared is loaded, the internal shared CRT environment table is already initialized (in main_shared), and putenv/SetEnvironmentVariable in ocaml_static has no effect on it.

So SetEnvironmentVariable/GetEnvironmentVariable still looks better that putenv/getenv. Anyway flexdll is Windows-specific and there is no reason to prefer CRT functions over Win32 API ones.

dra27 commented 6 years ago

Ah, that does make sense! There is a reason to prefer MSCRT functions over API calls in general because, as here, it can confuse other consumers of MSCRT. IIRC having mixed static and shared CRT is not recommended at all, though?

That said, it's all academic for flexdll - the important thing is communicating that function pointer, which at some point we'll do using something better than an environment variable anyway!

db4 commented 6 years ago

IIRC having mixed static and shared CRT is not recommended at all, though?

Well, you can model the same situation using shared CRTs of different Visual Studio versions. It's not uncommon when different DLLs in one application are linked with different CRTs. Not very likely for OCaml plugins, but flexdll is advertised as a general-purpose tool, right?

That said, it's all academic for flexdll - the importing thing is communicating that function pointer, which at some point we'll do using something better than an environment variable anyway!

Several years ago I hit the problem and fixed it the most simple way. If we introduce more reliable method of passing a pointer - of course, it would be better. But I doubt it will happen in the foreseeable future :)

alainfrisch commented 6 years ago

Any thought on:

I think one should evaluate the performance impact of changing access rights on each relocation. There can be many relocations, and I'm concerned that this could introduce a significant slowdown when loading a DLL. Have you tried to assess the performance impact?