ocaml / flexdll

a dlopen-like API for Windows

contrib - visual studio 2015 integrated example #70

Open zeromus opened 6 years ago

zeromus commented 6 years ago

dlltest10.zip

Here I've made a visual studio 2015 integrated example of using flexdll. Open the sln, set startup project to exe, and set debug working directory to $(ProjectDir).run

You will now get working builds for debug/release & x86/x64. A working example prints this:

in main
99
now: 0

The project includes the latest master flexlink.exe build as of now (needed for x64 debugging reloc fixes); as well, I've had to rebuild the flexdll_*.obj using vs2015 toolchain (scripts for doing this are provided)

The project also contains my program linkwrap which I think was required to get the nice VS/msbuild building semantics I wanted. Chiefly, it rips apart the response file provided to msvc's link.exe from VS and analyzes it to determine wordsize and lib/obj dependencies for passing to flexdll's flexlink.exe
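To illustrate the kind of analysis described above, here is a minimal sketch of pulling the word size and the obj/lib inputs out of a linker response file. It is not linkwrap's actual code: the function names `tokenize` and `parse_link_response` are hypothetical, the text is assumed to be already decoded, and only the `/MACHINE:` option and the `.obj`/`.lib` extensions are interpreted.

```python
def tokenize(text):
    """Split response-file text on whitespace, treating "..." as one token."""
    tokens, cur, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes      # drop the quote, keep its contents
        elif ch.isspace() and not in_quotes:
            if cur:
                tokens.append("".join(cur))
                cur = []
        else:
            cur.append(ch)
    if cur:
        tokens.append("".join(cur))
    return tokens


def parse_link_response(text):
    """Classify tokens into machine type, .obj inputs, .lib inputs, and the rest."""
    machine, objs, libs, other = None, [], [], []
    for tok in tokenize(text):
        if tok[0] in "/-":                 # linker options start with / or -
            name, _, value = tok[1:].partition(":")
            if name.upper() == "MACHINE":
                machine = value.upper()    # e.g. X86 or X64
            else:
                other.append(tok)
        elif tok.lower().endswith(".obj"):
            objs.append(tok)
        elif tok.lower().endswith(".lib"):
            libs.append(tok)
        else:
            other.append(tok)
    return {"machine": machine, "objs": objs, "libs": libs, "other": other}


if __name__ == "__main__":
    # Made-up response file contents, for illustration only.
    sample = r'''/OUT:"C:\proj\out dir\test.dll" /MACHINE:X64 /DLL
"C:\proj\obj\dll main.obj" kernel32.lib'''
    print(parse_link_response(sample))
```

A wrapper along these lines would then build a flexlink.exe command line from the `machine`, `objs`, and `libs` fields; how linkwrap actually maps them is not shown here.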

The vcxprojs have been modified simply to incorporate flexdll as an improved linker, by setting these things in the project:

<PropertyGroup Label="Globals">
  <LinkToolPath>$(ProjectDir)flexdll</LinkToolPath>
  <LinkToolExe>linkwrap.exe</LinkToolExe>
</PropertyGroup>

Note: you cannot use link-time code generation with flexdll. It's on by default in vcxprojs; I have turned it off. This should come as no great surprise as it probably incorporates a great deal of inscrutable proprietary junk (the problem will manifest as some kind of a premature EOF error while reading a .obj input)

The test confirms the main functionality I was interested in:

  1. a dll can leave an unresolved external to a global variable in the main program
  2. the dll can export functions without dllexport
  3. the global variable is correctly shared between the modules
  4. I can debug from the main and into the dll and back out

TODO - I haven't tested this with additional library directories. I will probably need to update it for that. I just need to get a WIP committed, so that's what you're seeing here.

alainfrisch commented 6 years ago

That's great!

What about hosting the full example as a new project on e.g. GitHub? I will then add a link to it from the flexdll documentation.

zeromus commented 6 years ago

Maybe... I don't know... I have more research to do first. I can't get my project to work right now. Some relocations are editing the PE header, seemingly because they are not offset by the position of the section (so the very small addresses within the section are used as absolute addresses within the module image instead, which puts them on top of the PE header).
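The arithmetic behind that symptom can be sketched as follows. The numbers are made up for illustration (a section at RVA 0x1000, headers of size 0x400, an offset of 0x12 within the section), and the helper names are hypothetical; the only facts relied on are that a PE image starts with its headers and that a section-relative offset has to be added to the section's RVA to get an address within the image.

```python
def relocation_rva(section_rva, offset_in_section):
    """RVA of a relocation target: section start plus offset within the section."""
    return section_rva + offset_in_section


def lands_in_headers(rva, size_of_headers):
    """True if the RVA falls inside the header area at the start of the image."""
    return rva < size_of_headers


# Illustrative values only, not taken from the failing build.
SECTION_RVA = 0x1000
SIZE_OF_HEADERS = 0x400
OFFSET = 0x12

correct = relocation_rva(SECTION_RVA, OFFSET)   # section position included
buggy = OFFSET                                  # section position forgotten

print(hex(correct), lands_in_headers(correct, SIZE_OF_HEADERS))
print(hex(buggy), lands_in_headers(buggy, SIZE_OF_HEADERS))
```

With the section position left out, the small offset is interpreted relative to the start of the image, which is where the headers live.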

My enthusiasm will rapidly deflate if I can't finally use this for what I need

You're not in any chat room where we can collaborate, are you?

I've been unable to decipher how these pointers are offset from the raw values taken from the object relocation entries, so I can't figure out why, but something's causing sections with a certain structure to malfunction. I think the busted sections are all for certain comdats which are inline functions from the exe module, or certain types declared in the exe module (I have one type with three basically identical methods in it that is causing three of these strange relocations)

I've tried /OPT:NOREF, disabling comdat folding, disabling incremental linking... you name it. Anything that could do something wacky to confuse flexdll. No luck. To me it looks like a hash collision kind of thing. On the function body? I just checked two of those three functions mentioned in the prior paragraph and they have the same "RAW DATA" from dumpbin. Couldn't find a hash like that in flexlink. Still, I'm suspicious MS's LINK is merging them somehow. I just checked the "RAW DATA" of an offending section in several obj files and it was all identical.

alainfrisch commented 6 years ago

Are you able to produce a standalone example, preferably using a simple build script?

zeromus commented 6 years ago

Unfortunately not so far. I'm going to have to transplant huge chunks of code. This could take a couple of days, and may never work :/ Unfortunately this project is a nice-to-have for me and I've spent nearly all the time I can justify on it for now. But I would totally let you TeamViewer into my PC and debug it with me.

All I have is this cool screen shot.

[screenshots: "stompy relocations 2", "stompy relocations"]

zeromus commented 6 years ago

Oh, but I do have one question. Why are functions beginning with ?? eliminated? It means all the operators and constructors don't make it...... I had to disable that elimination logic. I don't know what the consequences are, but I expected them to be more numerous than the problem I'm having here (which is affecting about 8 out of 1000s of functions)

alainfrisch commented 6 years ago

This was introduced very early (commit f63184b864b33b8e9a4c90f022301580df712f97) when porting to x64, but unfortunately I don't remember the details. (If you remove this logic, however, make sure to do something with the synthesized ??flexrefptr%i symbols, which rely on this dropping convention.)

zeromus commented 6 years ago

I guess I should have mentioned I have the same problems (unsure if identical data) on both x64 and x86. Could still be due to my having ignored flexrefptr because I didn't understand it and it sounded like it was for cygwin and not msvc. I don't see refptr anywhere in my build directories. As a sanity check, I found it in some msys libs I had lying around (not being used in the project) so I know how to find them. Don't think that's the problem.

zeromus commented 6 years ago

SO.. uhh.. I tried using the internal linker and it couldn't find kernel32.lib. I guess I can set up some include directories. I see create_dll only gets used by the internal linker. I can't fully figure out where flexdll's custom relocation table is written out in its final form when not using the internal linker. In reloc.ml I see the "kind" go out via int_to_buf, then the name, and then finally the target address for the relocation (which is what's pointing into my PE header). But I printed rel.addr there and nothing was relative to the image base.. and based on my analysis, that's just the position copied from the object file's section-relative address. So where do the image base address and the position of the section get incorporated? I guess those get relocated by the Windows DLL loader? It seems like you would need a relocation table for your relocations (but maybe your relocations list is in the right configuration for that). Huh, why haven't I thought to check the contents of that section in the dll until now..

Yeah, see, here's your table inside the dll. They're all rel32 relocations (0x00000001), then the string table pointer (0x30128462), then the offending weird address (0x30000012). So I have no idea where the base address and section location are getting incorporated. Probably deep in the magic guts of your lazy buffer writer.

[image]

If you could make a little hack for your code that detected any time a very tiny address (less than 100 is what I'm using) gets emitted (but as early as possible), I might be able to use that to chase down the cause of it inside my codebase.
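Short of a hack inside flexlink itself, the same check can be run after the fact on the table dumped from the dll. The sketch below is only that: it assumes each entry is three little-endian 32-bit words (kind, name pointer, target address), as the dump above suggests for the x86 build, and it assumes an image base of 0x30000000, which is inferred from the addresses quoted above rather than confirmed. The function name `find_tiny_targets` is hypothetical.

```python
import struct

ENTRY_SIZE = 12  # assumed: three 32-bit words per entry (kind, name ptr, target)


def find_tiny_targets(table, image_base, threshold=100):
    """Return (index, kind, name_ptr, target) for targets within `threshold`
    bytes of the image base, i.e. addresses that point into the PE header."""
    hits = []
    usable = len(table) - len(table) % ENTRY_SIZE   # ignore a trailing partial entry
    for off in range(0, usable, ENTRY_SIZE):
        kind, name_ptr, target = struct.unpack_from("<III", table, off)
        if 0 <= target - image_base < threshold:
            hits.append((off // ENTRY_SIZE, kind, name_ptr, target))
    return hits


if __name__ == "__main__":
    # First entry uses the values quoted above; the second is made up.
    table = struct.pack("<III", 1, 0x30128462, 0x30000012)
    table += struct.pack("<III", 1, 0x30128470, 0x30001040)
    for index, kind, name_ptr, target in find_tiny_targets(table, 0x30000000):
        print(index, kind, hex(name_ptr), hex(target))
```

The name pointer of each hit can then be looked up in the dll's string table to find which symbol the bad relocation belongs to.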

Alas it's 6:30 am and I think I've got to drop this for the indefinite future unless you have some help for me

zeromus commented 6 years ago

Had a wild idea. If I can contact you privately, I can supply a test case consisting of the obj files and scripts to produce a .dll which has the offending weird address circled above in it. Then you can debug flexlink to find out why it's getting there, without needing the sources or to run the built product.