rkbennett / pythonmemimporter

Apache License 2.0

Crash when calling PyInit #1

Closed RyanHope closed 1 month ago

RyanHope commented 3 months ago

I made a little test app based on your _memloader class. It crashes on the module() call. I never see "here2" printed. Any ideas?

import os
import argparse
import ctypes
import pythonmemorymodule

class _FuncPtr(ctypes._CFuncPtr):
    _flags_ = ctypes._FUNCFLAG_CDECL | ctypes._FUNCFLAG_PYTHONAPI
    _restype_ = ctypes.py_object

def main(args):
    with open(args.pyd, "rb") as f:
        data = f.read()

    hmem = pythonmemorymodule.MemoryModule(data=data, debug=False)
    initf = hmem.get_proc_addr(f"PyInit_{os.path.basename(args.pyd).split('.')[0]}")
    module = ctypes.cast(initf, _FuncPtr)
    if isinstance(module, _FuncPtr):
        print("here1", module)
        module = module()
        print("here2", module.__name__)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Load a python module from memory")
    parser.add_argument("pyd", help="Path to the python module to load")
    args = parser.parse_args()

    main(args)
RyanHope commented 3 months ago

The pyd I was testing with previously was compiled with Nuitka, and that failed. I made a simple pyd from a pyx file that gets farther. Now I get this error:

here1 <_FuncPtr object at 0x000001956A4CC6C0>
Traceback (most recent call last):
  File "test3.py", line 29, in <module>
    main(args)
  File "test3.py", line 22, in main
    print("here2", module.__name__)
AttributeError: 'moduledef' object has no attribute '__name__'. Did you mean: '__ne__'?
rkbennett commented 3 months ago

Have you tried importing my module instead and calling it that way?

RyanHope commented 3 months ago

I did try using your example first and had issues with it, which is why I went looking deeper. I will try again though; it was late when I was messing with this.

rkbennett commented 3 months ago

Sounds good. Also, which version of Python are you using?

RyanHope commented 3 months ago

Python 3.8 32bit

Here is my test lib: https://github.com/MCU-Innovations/helloworld

import sys
import importlib
import importlib.util
from pythonmemimporter import _memimporter
_memimporter = _memimporter()

class memory_importer(object):
    def find_module(self, module, path=None):
        if module == "helloworld":
            return self
    def load_module(self, name):
        pass

def _get_module_content(file):
    return open(f'D:\\temp\\{file}.cp38-win32.pyd', "rb").read()

sys.meta_path.insert(0, memory_importer())

fullname = "helloworld"
fpath = "/some/fake/path/here"
spec = importlib.util.find_spec(fullname, fpath)
initname = f"PyInit_{fullname}"
mod = _memimporter.import_module(fullname, fpath, initname, _get_module_content, spec)

sys.modules[mod.__name__] = mod

exec(f"{mod.__name__} = sys.modules['{mod.__name__}']")

print(helloworld.hello_world())
python .\test5.py
DEBUG: Reserved 40960 bytes for dll at address: 0x10000000
DEBUG: Copying sections to reserved memory block.
DEBUG: Copied section no. .text to address: 0x10001000
DEBUG: Copied section no. .rdata to address: 0x10005000
DEBUG: Copied section no. .data to address: 0x10007000
DEBUG: Copied section no. .rsrc to address: 0x10008000
DEBUG: Copied section no. .reloc to address: 0x10009000
DEBUG: Checking for base relocations.
DEBUG: Building import table.
DEBUG: codebase:0x10000000
DEBUG: Found importdesc at address: 0x10005e60
DEBUG: Found imported DLL, python38.dll. Loading..
DEBUG: Found import by name entry PyModule_NewObject , at address 0x1000506c
DEBUG: Resolved import PyModule_NewObject at address 0x6578af30
DEBUG: Found import by name entry PyCode_NewWithPosOnlyArgs , at address 0x10005070
DEBUG: Resolved import PyCode_NewWithPosOnlyArgs at address 0x65755020
DEBUG: Found import by name entry _Py_Dealloc , at address 0x10005074
DEBUG: Resolved import _Py_Dealloc at address 0x65790830
DEBUG: Found import by name entry PyTuple_GetItem , at address 0x10005078
DEBUG: Resolved import PyTuple_GetItem at address 0x6579ed00
DEBUG: Found import by name entry PyImport_GetModuleDict , at address 0x1000507c
DEBUG: Resolved import PyImport_GetModuleDict at address 0x65834580
DEBUG: Found import by name entry PyModule_GetDict , at address 0x10005080
DEBUG: Resolved import PyModule_GetDict at address 0x6578b8b0
DEBUG: Found import by name entry PyObject_Free , at address 0x10005084
DEBUG: Resolved import PyObject_Free at address 0x65791a80
DEBUG: Found import by name entry PyErr_ExceptionMatches , at address 0x10005088
DEBUG: Resolved import PyErr_ExceptionMatches at address 0x65823e80
DEBUG: Found import by name entry PyObject_GC_Del , at address 0x1000508c
DEBUG: Resolved import PyObject_GC_Del at address 0x656c6040
DEBUG: Found import by name entry PyObject_ClearWeakRefs , at address 0x10005090
DEBUG: Resolved import PyObject_ClearWeakRefs at address 0x657df990
DEBUG: Found import by name entry PyObject_Not , at address 0x10005094
DEBUG: Resolved import PyObject_Not at address 0x6578efb0
DEBUG: Found import by name entry PyUnicode_AsUTF8 , at address 0x10005098
DEBUG: Resolved import PyUnicode_AsUTF8 at address 0x657c1d10
DEBUG: Found import by name entry PyUnicode_FromFormat , at address 0x1000509c
DEBUG: Resolved import PyUnicode_FromFormat at address 0x657c02a0
DEBUG: Found import by name entry PyList_New , at address 0x100050a0
DEBUG: Resolved import PyList_New at address 0x65776e60
DEBUG: Found import by name entry PyImport_AddModule , at address 0x100050a4
DEBUG: Resolved import PyImport_AddModule at address 0x658359c0
DEBUG: Found import by name entry PyType_Ready , at address 0x100050a8
DEBUG: Resolved import PyType_Ready at address 0x657a9890
DEBUG: Found import by name entry PyObject_GetAttrString , at address 0x100050ac
DEBUG: Resolved import PyObject_GetAttrString at address 0x6578e070
DEBUG: Found import by name entry PyErr_Clear , at address 0x100050b0
DEBUG: Resolved import PyErr_Clear at address 0x65824170
DEBUG: Found import by name entry PyUnicode_Decode , at address 0x100050b4
DEBUG: Resolved import PyUnicode_Decode at address 0x657c0880
DEBUG: Found import by name entry _PyObject_GenericGetAttrWithDict , at address 0x100050b8
DEBUG: Resolved import _PyObject_GenericGetAttrWithDict at address 0x6578ea20
DEBUG: Found import by name entry _Py_FalseStruct , at address 0x100050bc
DEBUG: Resolved import _Py_FalseStruct at address 0x659ed160
DEBUG: Found import by name entry PyDict_New , at address 0x100050c0
DEBUG: Resolved import PyDict_New at address 0x6575cd40
DEBUG: Found import by name entry _PyDict_GetItem_KnownHash , at address 0x100050c4
DEBUG: Resolved import _PyDict_GetItem_KnownHash at address 0x6575df20
DEBUG: Found import by name entry PyMem_Free , at address 0x100050c8
DEBUG: Resolved import PyMem_Free at address 0x65791890
DEBUG: Found import by name entry PyErr_NoMemory , at address 0x100050cc
DEBUG: Resolved import PyErr_NoMemory at address 0x65824690
DEBUG: Found import by name entry PyDict_GetItemString , at address 0x100050d0
DEBUG: Resolved import PyDict_GetItemString at address 0x65761b00
DEBUG: Found import by name entry PyModuleDef_Init , at address 0x100050d4
DEBUG: Resolved import PyModuleDef_Init at address 0x6578ad40
DEBUG: Found import by name entry PyObject_GC_Track , at address 0x100050d8
DEBUG: Resolved import PyObject_GC_Track at address 0x656c5c80
DEBUG: Found import by name entry PyBytes_FromStringAndSize , at address 0x100050dc
DEBUG: Resolved import PyBytes_FromStringAndSize at address 0x657464e0
DEBUG: Found import by name entry PyExc_TypeError , at address 0x100050e0
DEBUG: Resolved import PyExc_TypeError at address 0x659efd9c
DEBUG: Found import by name entry PyMem_Realloc , at address 0x100050e4
DEBUG: Resolved import PyMem_Realloc at address 0x65791860
DEBUG: Found import by name entry PyExc_NameError , at address 0x100050e8
DEBUG: Resolved import PyExc_NameError at address 0x659f0c64
DEBUG: Found import by name entry PyTuple_Pack , at address 0x100050ec
DEBUG: Resolved import PyTuple_Pack at address 0x6579ef60
DEBUG: Found import by name entry PyMem_Malloc , at address 0x100050f0
DEBUG: Resolved import PyMem_Malloc at address 0x657917f0
DEBUG: Found import by name entry PyExc_ImportError , at address 0x100050f4
DEBUG: Resolved import PyExc_ImportError at address 0x659f35e8
DEBUG: Found import by name entry _Py_TrueStruct , at address 0x100050f8
DEBUG: Resolved import _Py_TrueStruct at address 0x659ed150
DEBUG: Found import by name entry PyExc_SystemError , at address 0x100050fc
DEBUG: Resolved import PyExc_SystemError at address 0x659efdc8
DEBUG: Found import by name entry _PyObject_GC_New , at address 0x10005100
DEBUG: Resolved import _PyObject_GC_New at address 0x656c5f00
DEBUG: Found import by name entry PyUnicode_FromString , at address 0x10005104
DEBUG: Resolved import PyUnicode_FromString at address 0x657bec00
DEBUG: Found import by name entry PyObject_Call , at address 0x10005108
DEBUG: Resolved import PyObject_Call at address 0x65750a20
DEBUG: Found import by name entry PyUnicode_FromStringAndSize , at address 0x1000510c
DEBUG: Resolved import PyUnicode_FromStringAndSize at address 0x657beba0
DEBUG: Found import by name entry _PyObject_GetDictPtr , at address 0x10005110
DEBUG: Resolved import _PyObject_GetDictPtr at address 0x6578e7f0
DEBUG: Found import by name entry PyErr_Format , at address 0x10005114
DEBUG: Resolved import PyErr_Format at address 0x65824f80
DEBUG: Found import by name entry PyDict_Next , at address 0x10005118
DEBUG: Resolved import PyDict_Next at address 0x6575e720
DEBUG: Found import by name entry _Py_CheckRecursiveCall , at address 0x1000511c
DEBUG: Resolved import _Py_CheckRecursiveCall at address 0x65808b70
DEBUG: Found import by name entry PyErr_SetString , at address 0x10005120
DEBUG: Resolved import PyErr_SetString at address 0x65823d80
DEBUG: Found import by name entry PyTuple_GetSlice , at address 0x10005124
DEBUG: Resolved import PyTuple_GetSlice at address 0x6579f5a0
DEBUG: Found import by name entry PyExc_AttributeError , at address 0x10005128
DEBUG: Resolved import PyExc_AttributeError at address 0x659efdc0
DEBUG: Found import by name entry PyDict_Size , at address 0x1000512c
DEBUG: Resolved import PyDict_Size at address 0x65760a60
DEBUG: Found import by name entry PyDict_SetItemString , at address 0x10005130
DEBUG: Resolved import PyDict_SetItemString at address 0x65761bd0
DEBUG: Found import by name entry PyTuple_New , at address 0x10005134
DEBUG: Resolved import PyTuple_New at address 0x6579ebb0
DEBUG: Found import by name entry _Py_NoneStruct , at address 0x10005138
DEBUG: Resolved import _Py_NoneStruct at address 0x659f5cf8
DEBUG: Found import by name entry PyObject_GetAttr , at address 0x1000513c
DEBUG: Resolved import PyObject_GetAttr at address 0x6578e420
DEBUG: Found import by name entry Py_GetVersion , at address 0x10005140
DEBUG: Resolved import Py_GetVersion at address 0x658305f0
DEBUG: Found import by name entry PyInterpreterState_GetID , at address 0x10005144
DEBUG: Resolved import PyInterpreterState_GetID at address 0x65851190
DEBUG: Found import by name entry PyObject_Hash , at address 0x10005148
DEBUG: Resolved import PyObject_Hash at address 0x6578e000
DEBUG: Found import by name entry PyObject_GC_UnTrack , at address 0x1000514c
DEBUG: Resolved import PyObject_GC_UnTrack at address 0x656c5ce0
DEBUG: Found import by name entry PyObject_SetAttrString , at address 0x10005150
DEBUG: Resolved import PyObject_SetAttrString at address 0x6578e150
DEBUG: Found import by name entry PyMethod_New , at address 0x10005154
DEBUG: Resolved import PyMethod_New at address 0x65753960
DEBUG: Found import by name entry PyExc_RuntimeError , at address 0x10005158
DEBUG: Resolved import PyExc_RuntimeError at address 0x659efda0
DEBUG: Found import by name entry _PyThreadState_UncheckedGet , at address 0x1000515c
DEBUG: Resolved import _PyThreadState_UncheckedGet at address 0x65852080
DEBUG: Found import by name entry PyTraceBack_Here , at address 0x10005160
DEBUG: Resolved import PyTraceBack_Here at address 0x6587a4c0
DEBUG: Found import by name entry PyObject_GenericGetAttr , at address 0x10005164
DEBUG: Resolved import PyObject_GenericGetAttr at address 0x6578ec70
DEBUG: Found import by name entry PyErr_Occurred , at address 0x10005168
DEBUG: Resolved import PyErr_Occurred at address 0x65823da0
DEBUG: Found import by name entry PyImport_ImportModuleLevelObject , at address 0x1000516c
DEBUG: Resolved import PyImport_ImportModuleLevelObject at address 0x658373d0
DEBUG: Found import by name entry PyFrame_New , at address 0x10005170
DEBUG: Resolved import PyFrame_New at address 0x65770620
DEBUG: Found import by name entry PyExc_RuntimeWarning , at address 0x10005174
DEBUG: Resolved import PyExc_RuntimeWarning at address 0x659f35ec
DEBUG: Found import by name entry PyErr_WarnEx , at address 0x10005178
DEBUG: Resolved import PyErr_WarnEx at address 0x657f00d0
DEBUG: Found import by name entry PyErr_GivenExceptionMatches , at address 0x1000517c
DEBUG: Resolved import PyErr_GivenExceptionMatches at address 0x65823db0
DEBUG: Found import by name entry PyCode_NewEmpty , at address 0x10005180
DEBUG: Resolved import PyCode_NewEmpty at address 0x65755620
DEBUG: Found import by name entry _Py_CheckRecursionLimit , at address 0x10005184
DEBUG: Resolved import _Py_CheckRecursionLimit at address 0x659f837c
DEBUG: Found import by name entry PyThreadState_Get , at address 0x10005188
DEBUG: Resolved import PyThreadState_Get at address 0x65852090
DEBUG: Found import by name entry PyOS_snprintf , at address 0x1000518c
DEBUG: Resolved import PyOS_snprintf at address 0x65845140
DEBUG: Found import by name entry PyUnicode_InternFromString , at address 0x10005190
DEBUG: Resolved import PyUnicode_InternFromString at address 0x657dc420
DEBUG: Found import by name entry PyObject_SetAttr , at address 0x10005194
DEBUG: Resolved import PyObject_SetAttr at address 0x6578e6a0
DEBUG: Found import by name entry PyDict_SetItem , at address 0x10005198
DEBUG: Resolved import PyDict_SetItem at address 0x6575e150
DEBUG: Found import by name entry PyBaseObject_Type , at address 0x1000519c
DEBUG: Resolved import PyBaseObject_Type at address 0x659f7b88
DEBUG: Found importdesc at address: 0x10005e60
DEBUG: Found imported DLL, KERNEL32.dll. Loading..
DEBUG: Found import by name entry UnhandledExceptionFilter , at address 0x10005000
DEBUG: Resolved import UnhandledExceptionFilter at address 0x76824f00
DEBUG: Found import by name entry SetUnhandledExceptionFilter , at address 0x10005004
DEBUG: Resolved import SetUnhandledExceptionFilter at address 0x768119c0
DEBUG: Found import by name entry GetCurrentProcess , at address 0x10005008
DEBUG: Resolved import GetCurrentProcess at address 0x76813120
DEBUG: Found import by name entry TerminateProcess , at address 0x1000500c
DEBUG: Resolved import TerminateProcess at address 0x76809910
DEBUG: Found import by name entry IsProcessorFeaturePresent , at address 0x10005010
DEBUG: Resolved import IsProcessorFeaturePresent at address 0x76810e10
DEBUG: Found import by name entry QueryPerformanceCounter , at address 0x10005014
DEBUG: Resolved import QueryPerformanceCounter at address 0x7680e1e0
DEBUG: Found import by name entry GetCurrentProcessId , at address 0x10005018
DEBUG: Resolved import GetCurrentProcessId at address 0x76813130
DEBUG: Found import by name entry GetCurrentThreadId , at address 0x1000501c
DEBUG: Resolved import GetCurrentThreadId at address 0x7680e1b0
DEBUG: Found import by name entry GetSystemTimeAsFileTime , at address 0x10005020
DEBUG: Resolved import GetSystemTimeAsFileTime at address 0x7680f630
DEBUG: Found import by name entry DisableThreadLibraryCalls , at address 0x10005024
DEBUG: Resolved import DisableThreadLibraryCalls at address 0x76811bd0
DEBUG: Found import by name entry InitializeSListHead , at address 0x10005028
DEBUG: Resolved import InitializeSListHead at address 0x774f8c10
DEBUG: Found import by name entry IsDebuggerPresent , at address 0x1000502c
DEBUG: Resolved import IsDebuggerPresent at address 0x76812370
DEBUG: Found importdesc at address: 0x10005e60
DEBUG: Found imported DLL, VCRUNTIME140.dll. Loading..
DEBUG: Found import by name entry __std_type_info_destroy_list , at address 0x10005034
DEBUG: Resolved import __std_type_info_destroy_list at address 0x6c7173f0
DEBUG: Found import by name entry _except_handler4_common , at address 0x10005038
DEBUG: Resolved import _except_handler4_common at address 0x6c713ef0
DEBUG: Found import by name entry memset , at address 0x1000503c
DEBUG: Resolved import memset at address 0x6c7137a0
DEBUG: Found import by name entry strrchr , at address 0x10005040
DEBUG: Resolved import strrchr at address 0x6c713a30
DEBUG: Found importdesc at address: 0x10005e60
DEBUG: Found imported DLL, api-ms-win-crt-runtime-l1-1-0.dll. Loading..
DEBUG: Found import by name entry _initterm , at address 0x10005048
DEBUG: Resolved import _initterm at address 0x7652ac10
DEBUG: Found import by name entry _initterm_e , at address 0x1000504c
DEBUG: Resolved import _initterm_e at address 0x7652ac60
DEBUG: Found import by name entry _seh_filter_dll , at address 0x10005050
DEBUG: Resolved import _seh_filter_dll at address 0x7659cc70
DEBUG: Found import by name entry _configure_narrow_argv , at address 0x10005054
DEBUG: Resolved import _configure_narrow_argv at address 0x765409d0
DEBUG: Found import by name entry _initialize_narrow_environment , at address 0x10005058
DEBUG: Resolved import _initialize_narrow_environment at address 0x76546a30
DEBUG: Found import by name entry _initialize_onexit_table , at address 0x1000505c
DEBUG: Resolved import _initialize_onexit_table at address 0x7652a920
DEBUG: Found import by name entry _execute_onexit_table , at address 0x10005060
DEBUG: Resolved import _execute_onexit_table at address 0x7652a950
DEBUG: Found import by name entry _cexit , at address 0x10005064
DEBUG: Resolved import _cexit at address 0x765a0d80
DEBUG: Finalizing sections.
DEBUG: Found 5 total sections.
DEBUG: Section n. 0
DEBUG: size=13312
DEBUG: execute 1
DEBUG: read 1
DEBUG: write 0
DEBUG: Protection flag:32
DEBUG: physaddr:0x10001000
DEBUG: Section n. 1
DEBUG: size=6656
DEBUG: execute 0
DEBUG: read 1
DEBUG: write 0
DEBUG: Protection flag:2
DEBUG: physaddr:0x10005000
DEBUG: Section n. 2
DEBUG: size=1024
DEBUG: execute 0
DEBUG: read 1
DEBUG: write 1
DEBUG: Protection flag:4
DEBUG: physaddr:0x10007000
DEBUG: Section n. 3
DEBUG: size=512
DEBUG: execute 0
DEBUG: read 1
DEBUG: write 0
DEBUG: Protection flag:2
DEBUG: physaddr:0x10008000
DEBUG: Section n. 4
DEBUG: size=1536
DEBUG: execute 0
DEBUG: read 1
DEBUG: write 0
DEBUG: physaddr:0x10009000
DEBUG: Executing TLS.
DEBUG: no TLS address found
DEBUG: Starting new thread to execute PE
DEBUG: Checking for entry point.
Traceback (most recent call last):
  File ".\test5.py", line 25, in <module>
DEBUG: Calling dll entrypoint 0x100039cc with DLL_PROCESS_ATTACH
    sys.modules[mod.__name__] = mod
AttributeError: 'moduledef' object has no attribute '__name__'
RyanHope commented 3 months ago

I compiled the helloworld lib for Python 3.11 64-bit; it fails in the same way.

rkbennett commented 3 months ago

Try swapping the mod.__name__ to fullname

RyanHope commented 3 months ago

I replaced every mod.__name__ with fullname:

...
DEBUG: execute 0
DEBUG: read 1
DEBUG: write 0
DEBUG: physaddr:0x10009000
DEBUG: Executing TLS.
DEBUG: no TLS address found
DEBUG: Starting new thread to execute PE
DEBUG: Checking for entry point.
DEBUG: Calling dll entrypoint 0x100039cc with DLL_PROCESS_ATTACH
Traceback (most recent call last):
  File ".\test5.py", line 29, in <module>
    print(helloworld.hello_world())
AttributeError: 'moduledef' object has no attribute 'hello_world'
rkbennett commented 3 months ago

Could you try it on 3.10 and see if it has the same error?

RyanHope commented 3 months ago

Also get "AttributeError: 'moduledef' object has no attribute 'hello_world'" on 3.10.

rkbennett commented 3 months ago

I'll have to do some research on this, my thought is that what cython is generating doesn't behave the same way as a true python c extension, because it shouldn't return a module def when you execute the init function.

rkbennett commented 3 months ago

Would you be willing to try it with od_import? Just to make sure the import hook is good.

RyanHope commented 3 months ago

> I'll have to do some research on this, my thought is that what cython is generating doesn't behave the same way as a true python c extension, because it shouldn't return a module def when you execute the init function.

This is probably related to why Nuitka-built pyds also don't work. This project would be very useful to me if I could get it working, so I'm very willing to do whatever testing you need.

rkbennett commented 2 months ago

What version of python did you use with cython to generate your pyd?

rkbennett commented 2 months ago

So after a bit of research, I believe I've found the solution. I think pyds compiled with cython (and likely nuitka) use multi-phase initialization. I've updated the example in the readme to handle that if you want to try again.
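
For anyone following along, here is a rough sketch of what handling multi-phase initialization can look like from ctypes. This is not the code from the readme: `module_from_init_result` and the `PYTHON_API_VERSION` constant are names chosen for illustration, and the sketch assumes the running interpreter exports `PyModule_FromDefAndSpec2` and `PyModule_ExecDef`, the C API functions behind PEP 489. If the init function already returned a real module (single-phase initialization), it is passed through unchanged.

```python
import ctypes
import types

# Assumption: mirrors PYTHON_API_VERSION from CPython's modsupport.h.
# A mismatch only triggers a warning inside the C API.
PYTHON_API_VERSION = 1013


def module_from_init_result(result, spec):
    """Turn the return value of a PyInit_* call into a usable module.

    `result` is whatever the init function returned via ctypes.py_object.
    `spec` is a module spec; the C API reads its `name` attribute.
    """
    # Single-phase initialization: the init function already built the module.
    if isinstance(result, types.ModuleType):
        return result

    # Multi-phase initialization: `result` is a 'moduledef' object, i.e. the
    # PyModuleDef itself, which still has to be instantiated and executed.
    api = ctypes.pythonapi

    from_def = api.PyModule_FromDefAndSpec2
    from_def.restype = ctypes.py_object
    from_def.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.c_int]

    exec_def = api.PyModule_ExecDef
    exec_def.restype = ctypes.c_int
    exec_def.argtypes = [ctypes.py_object, ctypes.py_object]

    # Create the module object from the definition and the spec.
    module = from_def(result, spec, PYTHON_API_VERSION)

    # Run the module's Py_mod_exec slots; this is what populates its namespace.
    if exec_def(module, result) != 0:
        raise ImportError("PyModule_ExecDef failed")
    return module
```

In the first test script in this thread, that would mean replacing the bare `module = module()` with something like `mod = module_from_init_result(module(), spec)`, where `spec` has at least a `name` attribute.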

rkbennett commented 2 months ago

@RyanHope I've updated the module itself now and fixed a bug that caused heap corruption when exiting Python if a module with multi-phase initialization had been loaded. Can you verify this fix works for you, so I can close this issue? I've tested locally and it worked.
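
A quick way to check the result after loading, as a sketch only: `check_loaded_module` is a hypothetical helper, not part of pythonmemimporter. It reports the two symptoms seen earlier in this thread, namely getting a 'moduledef' object back instead of a module, and the expected function being missing.

```python
import types


def check_loaded_module(mod, expected_attrs=()):
    """Return a list of problems found with a freshly loaded module."""
    problems = []
    # A multi-phase module that was never finalized shows up as 'moduledef'.
    if not isinstance(mod, types.ModuleType):
        problems.append(f"expected a module, got {type(mod).__name__!r}")
    # Confirm the names the caller relies on are actually present.
    for attr in expected_attrs:
        if not hasattr(mod, attr):
            problems.append(f"missing attribute {attr!r}")
    return problems
```

For the helloworld test lib, `check_loaded_module(mod, ["hello_world"])` should come back empty once the fix is in place.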