naksyn / PythonMemoryModule

pure-python implementation of MemoryModule technique to load dll and unmanaged exe entirely from memory
Apache License 2.0
292 stars, 45 forks

[Feature Request] Add _handle attribute to returned MemoryModule object #5

Closed rkbennett closed 9 months ago

rkbennett commented 9 months ago

I've come across some cases where you need the handle of the module that has been loaded into memory. Exposing a _handle attribute would be fairly simple: I believe you can just add the line self._handle = hmod above line 845 (thunkrefaddr = funcrefaddr = codebase + entry_struct.FirstThunk), and that should be enough to expose the handle as a property of the MemoryModule object.

naksyn commented 9 months ago

Hi, I think that setting self._handle = hmod in the build_import_table function will likely overwrite the property every time an imported dll is loaded during the import table building process (e.g. kernel32.dll, netapi32.dll etc.). Do you need an array of all the handles of the imported dlls? In that case, saving the handles to a list and exposing that as a property might be more suitable.

rkbennett commented 9 months ago

Hmm, that's true. A list would be hard, though, because you wouldn't know which handle went with which dll. Maybe a dict?

rkbennett commented 9 months ago

Although, looking at the code, that function is part of the MemoryModule class, and if you follow the example in your readme, you use an instance of that class to load the module, so I actually don't think it would overwrite. I can test tomorrow though.

rkbennett commented 9 months ago

I just tested it and it behaves in a very interesting way. The handle does change. [image] The first _handle attribute is from _psutil_windows and, interestingly enough, it is the same for both.

naksyn commented 9 months ago

The ._handle attribute usually comes from a ctypes.PyDLL instance. If you load a dll using ctypes.cdll or ctypes.windll you'll have that attribute. PythonMM does not expose a ._handle, so I'm not sure about the results you're getting.
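As a minimal illustration of where a ._handle attribute normally comes from, the sketch below reads it from ctypes.pythonapi, which is a ctypes.PyDLL instance available on every platform. It only shows the ctypes behaviour described above and does not involve PythonMemoryModule at all.

```python
import ctypes

# ctypes.pythonapi is a ready-made PyDLL instance wrapping the running
# interpreter's own library, so it works without loading anything new.
lib = ctypes.pythonapi

# _handle is the integer the OS loader returned for this library
# (the HMODULE on Windows, the dlopen() handle elsewhere).
handle = lib._handle

print(type(lib).__name__, hex(handle))
```

Objects returned by ctypes.cdll.LoadLibrary or ctypes.windll.LoadLibrary carry the same attribute, because they are instances of ctypes.CDLL subclasses too.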

[image]

[image]

In the PythonMM code, within build_import_table, there is a LoadLibrary call (dlopen) whose return value is assigned to hmod. It returns an integer, which is the handle of the loaded module. That is why I said that hmod will be overwritten every time dlopen is called.

[image]

[image]

As you can see, hmod is indeed overwritten at every iteration. If you want to save the handles of the loaded dlls along with their names, then a dict would be enough, as you said.
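A rough sketch of the dict idea follows. It is not PythonMM's actual build_import_table: load_imports and fake_load_library are hypothetical names, and the fake loader stands in for the real LoadLibrary call so the example runs anywhere. The point is only that hmod is reassigned on every iteration, while a dict keyed by dll name keeps each handle.

```python
import itertools
from typing import Callable, Dict, Iterable


def load_imports(dll_names: Iterable[str],
                 load_library: Callable[[str], int]) -> Dict[str, int]:
    """Load each imported dll and remember its handle by name."""
    handles: Dict[str, int] = {}
    for name in dll_names:
        # hmod holds only the most recent handle; it is overwritten
        # on the next iteration of the loop.
        hmod = load_library(name)
        # Keying by lower-cased name keeps every handle retrievable.
        handles[name.lower()] = hmod
    return handles


# Hypothetical stand-in for LoadLibrary: returns a new fake handle per call.
_counter = itertools.count(1)


def fake_load_library(name: str) -> int:
    return 0x10000 * next(_counter)


handles = load_imports(["KERNEL32.dll", "netapi32.dll"], fake_load_library)
print({name: hex(h) for name, h in handles.items()})
```

Lower-casing the key is a design choice for this sketch, since import tables often spell the same dll with different casing.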

naksyn commented 9 months ago

Just to make sure I understood your request correctly: you can still access the ._codebaseaddr attribute, which is the result of the VirtualAlloc operation used to load the PE and is the starting address of the loaded PE. This address is also used as the "handle" of the mapped PE for other operations (free). Is that what you need? If so, it's already available via the ._codebaseaddr attribute.

[image]
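To illustrate how an integer base address such as ._codebaseaddr can be used like a handle, here is a small ctypes sketch. The buffer is a hypothetical stand-in for a mapped PE image (it only holds the first bytes of a DOS header), and base_addr plays the role of the address returned by the allocation. No PythonMM API is called.

```python
import ctypes

# Hypothetical stand-in for a mapped PE: 'M', 'Z', 0x90, 0x00.
image = (ctypes.c_ubyte * 4)(0x4D, 0x5A, 0x90, 0x00)

# The integer start address of the buffer, analogous to a base address
# that a loader would hand back after allocating and mapping the image.
base_addr = ctypes.addressof(image)

# An integer address can be cast to a typed pointer and dereferenced.
ptr = ctypes.cast(base_addr, ctypes.POINTER(ctypes.c_ubyte))
signature = bytes(ptr[0:2])

print(hex(base_addr), signature)
```

With cffi, the analogous step would be an ffi.cast from the integer to a pointer type, which is presumably the kind of cast mentioned in the request.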

rkbennett commented 9 months ago

I need to access the handle for doing an ffi.cast

naksyn commented 9 months ago

Can you try using ._codebaseaddr?

rkbennett commented 9 months ago

Yeah, I can give that a try. It might be a few days before I can get to it, though.

rkbennett commented 9 months ago

Okay, so I looked at this a bit more and I was mistaken about what I actually need to be returned (technically). I do require a handle to the module, but it needs to be a pointer to the MEMORYMODULE object. I thought that the pythonmemorymodule.contents attribute would have what I need, but the memory location appears to keep changing whenever I check its value, so I'm not sure what's happening there.
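One possible explanation, offered as an assumption rather than a diagnosis of this case: ctypes builds a new Python wrapper object each time .contents is read from a pointer, so the wrapper can look different on every access even though the underlying structure has not moved. The MEMORYMODULE class below is a hypothetical two-field stand-in, not the structure defined in PythonMM.

```python
import ctypes


class MEMORYMODULE(ctypes.Structure):
    # Hypothetical fields, for illustration only.
    _fields_ = [("codeBase", ctypes.c_void_p),
                ("initialized", ctypes.c_int)]


module = MEMORYMODULE(0x10000000, 1)
ptr = ctypes.pointer(module)

# Each access to .contents creates a fresh wrapper object...
first = ptr.contents
second = ptr.contents
same_wrapper = first is second

# ...but both wrappers refer to the same underlying memory.
same_address = (ctypes.addressof(first)
                == ctypes.addressof(second)
                == ctypes.addressof(module))

# The stable value to pass around as a "handle" is the pointer's address.
handle_value = ctypes.cast(ptr, ctypes.c_void_p).value

print(same_wrapper, same_address, hex(handle_value))
```

Under that assumption, comparing ctypes.addressof values (or passing the pointer itself) is more reliable than comparing the objects returned by .contents.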

rkbennett commented 9 months ago

After doing some more testing with passing a pointer to the MEMORYMODULE object, I believe that has what I need. I'll go ahead and close this. Thanks for your help.

naksyn commented 9 months ago

Glad you solved this