ocaml / flexdll

a dlopen-like API for Windows
Other
98 stars 30 forks source link

Rewrite Visual Studio/Windows SDK detection #13

Closed db4 closed 8 years ago

db4 commented 8 years ago

I've rewritten VS/WinSDK detection to be much more simple and compatible. It should now support all Visual Studio/Windows SDK flavors which was not the case before (e.g. it failed with Visual Studio 2013)

db4 commented 8 years ago

Fix access violation bug using the simpler approach as discussed in the following conversation:

On 11/25/2013 09:32 AM, Dmitry Bely wrote:

Hi Alain,

I stumbled across an access violation bug caused by the way FlexDLL performs relocations. When constructing dynamic relocation tables it packs several consecutive relocation addresses into one "zone". When performing actual relocations it allows write access via VirtualProtect(zone_begin, zone_size, PAGE_EXECUTE_WRITECOPY) (BTW, why not PAGE_EXECUTE_READWRITE?), then patches addresses and then restore the old access rights via VirtualProtect(zone_begin, zone_size, old_access). Unfortunately this is not entirely correct. I see the situation where one zone covers 2 memory pages with different access rights: PAGE_READONLY and PAGE_EXECUTE_READ. On restore the second page will be set PAGE_READONLY and any attempt to execute code from it will lead to an access violation error (as CPU supports NX and DEP is enabled in the OS).

Relocation table that causes troubles:

Dynamic relocation table found at 634B13A0 Non-writable relocation in zone 63483D57 -> 63483E83 Non-writable relocation in zone 63483BB8 -> 63483CD5 Non-writable relocation in zone 63480BE5 -> 634811B8 634811B8 (kind:0100) (now:00C363E9) caml_local_roots 634811B4 (kind:0100) (now:00005E58) caml_local_roots 63480BE9 (kind:0100) (now:00000000) caml_atom_table 63480BE5 (kind:0100) (now:00000000) caml_atom_table 63483CD5 (kind:0102) (now:013D2900) caml_copy_string 63483CB8 (kind:0101) (now:0032D3F8) w32_unicode_pack_to_string 63483CA5 (kind:0102) (now:013FC224) caml_atom_table 63483C8C (kind:0101) (now:9DF62F20) caml_initialize 63483C4E (kind:0101) (now:9DF62BFE) caml_alloc_shr 63483BE9 (kind:0101) (now:0032D526) w32_uuid_ml2c 63483BB8 (kind:0101) (now:0032D4CB) w32_uuid_c2ml 63483E83 (kind:0102) (now:013FB4D0) caml_local_roots 63483E68 (kind:0102) (now:013FB4D0) caml_local_roots 63483D81 (kind:0102) (now:013FB4D0) caml_local_roots 63483D75 (kind:0102) (now:013FB4D0) caml_local_roots 63483D57 (kind:0102) (now:013FB4D0) caml_local_roots

63480000 page was PAGE_READONLY 63481000 page was PAGE_EXECUTE_READ

Another problem is that one page can be covered by several zones. As you restore access rights in the same order you modify them, they will be incorrect in the end. I was able to fix this just replacing

   /* Restore permissions. Should do it also in case of failure... */
   for (wr = tbl->nonwr; wr->last != 0; wr++)
     allow_write(wr->first,wr->last + 4,wr->old,&wr->old);

with

   /* Restore permissions in the reverse order. Should do it also in case of failure... */
   wr = tbl->nonwr;
   while ((wr+1)->last != 0)
     wr++;
   for (; wr >= tbl->nonwr; wr--)
     allow_write(wr->first,wr->last + 4,wr->old,&wr->old);

but I still have no idea how to fix the first problem. Now I just use

     allow_write(wr->first,wr->last + 4,(wr->old == PAGE_READONLY)?PAGE_EXECUTE_READ:wr->old,&wr->old);

as a workaround in hope that someone will fix it properly.

  • Dmitry Bely

On Tue, Nov 26, 2013 at 1:10 PM, Alain Frisch alain@frisch.fr wrote: Hi Dmitry,

Thanks for reporting the problem and its precise analysis.

I propose the following approach:

  1. First version: one changes the right accesses for each relocation and immediately restore it after the patch is done. This even simplifies the work of flexlink.exe, since it no longer needs to compute these "zones".
  2. Second version: if it turns out to be too expensive to change the rights on memory pages for each relocation, one can implement the simplest strategy of checking if the next relocation falls in the same page as the previous one, and in this case, avoid the two useless calls to VirtualProtect. One should then arrange so that flexlink.exe emits relocations in a sorted order (it might already be the case).

What do you think? I won't have time to work on a patch in the next days. So feel free to give it a try if you wish!

Alain