python / cpython

2.6.1 breaks many applications that embed Python on Windows #48816

Closed 7e6c45ab-cd55-40b9-9429-49bb488ca339 closed 15 years ago

7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago
BPO 4566
Nosy @loewis, @mhammond, @tiran
Files
  • testpy.c
  • httpd.exe.manifest
  • bug4566.patch: updated patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:
    ```python
    assignee = 'https://github.com/mhammond'
    closed_at =
    created_at =
    labels = ['type-bug', 'OS-windows']
    title = '2.6.1 breaks many applications that embed Python on Windows'
    updated_at =
    user = 'https://bugs.python.org/craigh'
    ```

    bugs.python.org fields:
    ```python
    activity =
    actor = 'shooshx'
    assignee = 'mhammond'
    closed = True
    closed_date =
    closer = 'mhammond'
    components = ['Windows']
    creation =
    creator = 'craigh'
    dependencies = []
    files = ['12249', '12250', '12549']
    hgrepos = []
    issue_num = 4566
    keywords = ['patch', 'needs review']
    message_count = 29.0
    messages = ['77133', '77136', '77142', '77181', '77183', '77211', '77217', '77232', '78732', '78751', '78752', '78755', '78757', '78762', '78765', '78778', '78779', '78811', '78819', '78839', '78890', '79458', '79491', '80116', '80462', '80675', '80680', '88016', '161272']
    nosy_count = 11.0
    nosy_names = ['loewis', 'mhammond', 'ggenellina', 'christian.heimes', 'schmir', 'craigh', 'johnshue', 'rantanen', 'SandyBarbour', 'chrisyco', 'shooshx']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue4566'
    versions = ['Python 2.6']
    ```

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    Applications on Windows that don't link to the MSVCR90.DLL via a manifest are broken with Python 2.6.1. This includes apps that link with the C library statically and apps that link with other versions of it. These applications worked fine with Python 2.6.

    To test this, I created a simple application that did nothing but call Py_Main (compiled with VS2008). When that app links with the C library statically, attempting to import the socket module gives this error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python26\lib\socket.py", line 46, in <module>
        import _socket
    ImportError: DLL load failed: The specified module could not be found.

    When that app links with the C library dynamically, it works properly.

    In particular, this issue breaks mod_python. From the Apache error log:

    [Sat Dec 06 00:49:21 2008] [error] make_obcallback: could not import
    mod_python.apache.\n
    Traceback (most recent call last):
      File "C:\Python26\lib\site-packages\mod_python\apache.py", line 29, in
    <module>
        import cgi
      File "C:\Python26\Lib\cgi.py", line 40, in <module>
        import urllib
      File "C:\Python26\Lib\urllib.py", line 26, in <module>
        import socket
      File "C:\Python26\lib\socket.py", line 46, in <module>
        import _socket
    ImportError: DLL load failed: A dynamic link library (DLL)
    initialization routine failed.

    I'm guessing this is a side-effect of the fix for bpo-4120. Since modules like _socket.pyd don't have manifests that tell Windows where to find the side-by-side assembly to load, and the application loading _socket.pyd doesn't have it in its manifest either, Windows has no clue where to find MSVCR90.dll.

    Something I discovered (by accident) is that, if there's an external manifest in the same folder as the hosting application (i.e., testpy.exe.manifest) that points to the C runtime, it fixes the problem, even if the host app links statically to the C runtime. I don't recall offhand whether these external manifests take precedence over the internal ones, but in this case it seems to provide enough information to Windows that it can resolve the dependency.
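For illustration only, an external manifest of this kind (saved next to the executable, e.g. as testpy.exe.manifest) generally has the shape sketched below. This is not the attached file. The `version` and `publicKeyToken` values are the ones I would expect for the 32-bit VS2008 RTM CRT and should be treated as assumptions; check them against the manifest that ships with the runtime you actually have installed.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <dependency>
    <dependentAssembly>
      <!-- Assumed identity of the VS2008 RTM x86 C runtime assembly -->
      <assemblyIdentity type="win32" name="Microsoft.VC90.CRT"
                        version="9.0.21022.8" processorArchitecture="x86"
                        publicKeyToken="1fc8b3b9a1e18e3b" />
    </dependentAssembly>
  </dependency>
</assembly>
```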

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    I've attached the test program I was using. The commands I used to compile it are:

    cl /MT /c /IC:\Python26\include testpy.c
    link /LIBPATH:C:\Python26\libs testpy.obj

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    I've attached a manifest file that references the C runtime; adding this file to the Apache bin folder seems to work around this issue.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 15 years ago

    I would claim that this is not a bug. There is simply no other way to deal with this entire SxS mess.

    tiran commented 15 years ago

    A mess it is :/

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    I understand the rationale behind bpo-4120, but it seems like it only helps a narrow set of applications, namely "applications that link dynamically with the same version of MSVCR90 as Python and that bundle the MSVCR90 DLL and that can't install the VS2008 redist". In 2.6.1 those apps don't have to install the VS2008 redist, but every other app needs to either bundle the runtime DLLs (as a private assembly) or use the manifest workaround I described above, even if the VS2008 redist is installed on the system.

    The 2.6.0 behavior - requiring the VS2008 redist to be installed - is hardly perfect (to put it mildly), but in my opinion it's more obvious and straightforward, and more consistent with the behavior of other Windows software.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 15 years ago

    > I understand the rationale behind bpo-4120, but it seems like it only helps a narrow set of applications, namely "applications that link dynamically with the same version of MSVCR90 as Python and that bundle the MSVCR90 DLL and that can't install the VS2008 redist".

    This is not a narrow set, though. It includes all the applications that use py2exe and friends to create stand-alone applications. This case absolutely must be supported.

    > The 2.6.0 behavior - requiring the VS2008 redist to be installed - is hardly perfect (to put it mildly), but in my opinion it's more obvious and straightforward, and more consistent with the behavior of other Windows software.

    Python has a long tradition of supporting "xcopy deployment". I don't want Microsoft to dictate that we stop supporting that. I find the need to have end-users install the CRT redist particularly unacceptable.

    I don't quite understand this issue yet. python26.dll is linked with a manifest. Isn't that good enough?

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    > I don't quite understand this issue yet. python26.dll is linked with a manifest. Isn't that good enough?

    Apparently not, at least in my testing. It seems that if a DLL is loaded, Windows will try to resolve its dependencies by looking at that DLL's own manifest; if that fails, Windows will try to resolve them by looking at the EXE's manifest. It won't check other DLLs. From the loader's perspective, it seems like there's no "tree" of DLLs, it's just "EXE loads DLL, EXE loads DLL", even if the loading code is actually in one of the DLLs.

    I guess I'm more concerned about applications like Apache that only use Python through an external module or plugin; there's no reason the Apache developers would be expected to make this kind of change with the manifests and everything, much less any commercial, closed-source app (say, an IDE or editor that has plugins for several scripting languages). Also, the manifest trick I described as a workaround was only as simple as it was in this case because httpd.exe didn't have any manifest at all; if it already had an internal manifest, working-around this would have been much more gruesome.

    I understand, though, the value of "xcopy deployment", and I realize that having some means of doing that is better than none at all.

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    Here's an option, though unfortunately not a trivial one: use a private build of the C runtime. The Windows version of Firefox does this (mozcrt19.dll). The private CRT build doesn't use SxS in any way, so it gets around this issue, as well as other issues like not allowing "for me" installs on Vista (bpo-4018).

    For me to build the CRT from the source included with Visual Studio 2008 took some tweaking - apparently having the CRT source build properly out of the box wasn't a priority for MS (the Mozilla CRT seems to be built from the VS2005 source; perhaps that version is more cooperative). It does yield a CRT that links like an ordinary DLL, but the fact that it can't be built without modifications is a major drawback to this solution.

    Also, I'd assume that the CRT source isn't included in the Express version (and possibly other ones), so that's a downside, too.

    As far as Python itself, the project configurations would have to be changed to define _CRT_NOFORCE_MANIFEST, suppress the default linking to the stock CRT (/NODEFAULTLIB:msvcrt.lib), and link to the import library for the private CRT. I might try to create an experimental solution config for this.

    mhammond commented 15 years ago

    I've no time to dig deeper now as I suspect testing will require removal of the vc9 assembly from the GAC and testing with a local one, but some comments:

    test.c's error is "can't find the DLL" - this will be as we attempt to load Python's DLL - but this isn't the same as the original error, which is "DLL init routine failed". To repro the initial error, I suspect you will want to put the full assembly next to test.exe - that will allow python.dll to load - then test.c should call PyRun_SimpleString("import socket\n") - it is at *that* point the error we care about is likely to be thrown.

    That specific error code means the DLL init routine in the CRT started executing, but explicitly threw an error. It's possible to use the debugger to see exactly when this is thrown, and it relates to the assembly configuration (the details escape me). I suspect it will be necessary to locate this CRT init code to determine exactly why it is throwing the error, and possibly determine how to satisfy it (the DLL *is* loaded, so it should be capable of working).

    mhammond commented 15 years ago

    I meant to mention: FWIW, *some* py2exe apps work fine with the old scheme - specifically, IIUC, any app will work fine so long as the .pyd files are next to the executable, which is next to the assembly. I understand this is a significant restriction, but it's worth mentioning for completeness.

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    > test.c's error is "can't find the DLL" - this will be as we attempt to load Python's DLL - but this isn't the same as the original error, which is "DLL init routine failed". To repro the initial error, I suspect you will want to put the full assembly next to test.exe - that will allow python.dll to load - then test.c should call PyRun_SimpleString("import socket\n") - it is at *that* point the error we care about is likely to be thrown.

    The test program isn't having a problem loading python26.dll - it gets to an interpreter prompt and it can execute simple Python statements. It doesn't throw ImportError until the user types in "import socket". Further, I can see in Process Explorer that python26.dll is loaded in the running testpy.exe process. I apologize if my initial description wasn't clear on that point.

    I do see the error codes are different (between testpy.c and mod_python), but both are triggered by trying to load _socket.pyd.

    Nonetheless: if I copy the CRT assembly into the same folder as testpy.exe, I get a popup dialog when I try to import socket, saying "Runtime Error!" and "R6034; An application has made an attempt to load the C library incorrectly.". When I click OK in that dialog, I get this error in the console running testpy.exe:

    ImportError: DLL load failed: A dynamic link library (DLL) initialization routine failed.

    (If the assembly is not in the folder, there is no popup dialog at all.) So it does indeed change the error code received.

    Also, I see now that I made a mistake in reporting the error code from the Apache log. The actual behavior is this:

    Putting the CRT assembly in the Apache bin folder or Apache modules folder still gives the first error (module could not be found).

    I apologize for the confusion; when I first experienced this problem I tried to fix it by experimenting with putting the manifest in various folders and I wasn't paying close enough attention to what error was given when.

    To summarize:

      • testpy.exe with CRT assembly in testpy.exe folder: init routine failed and popup.
      • testpy.exe with CRT assembly in _socket.pyd folder: init routine failed and popup.
      • testpy.exe otherwise: module could not be found.
      • Apache with CRT assembly in _socket.pyd folder: init routine failed and popup.
      • Apache otherwise: module could not be found.

    tiran commented 15 years ago

    Mark Hammond <mhammond@users.sourceforge.net> added the comment:

    > I've no time to dig deeper now as I suspect testing will require removal of the vc9 assembly from the GAC and testing with a local one, but some comments:

    Isn't the GAC just for .NET assemblies while the SxS cache is for non-.NET assemblies? Microsoft's naming scheme is confusing as usual. ;)

    Christian

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    I took a look at this with the debugger, as Mark recommended. The CRT's DllMain is called _CRTDLL_INIT, which in turn calls __CRTDLL_INIT. __CRTDLL_INIT calls another function, _check_manifest.

    _check_manifest calls an SxS function called FindActCtxSectionString. It's looking for a string called "msvcr90.dll" in the ACTIVATION_CONTEXT_SECTION_DLL_REDIRECTION section of the process's activation context (activation contexts are data structures used by SxS). That call fails (_check_manifest doesn't call GetLastError or anything, it just returns FALSE right there). There's a comment there that says:

    /* no activation context used to load CRT DLL, means no manifest present in the process */
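A minimal sketch of the kind of lookup described above, for anyone who wants to probe the current activation context themselves. It is not the CRT's code: the function name `sketch_has_dll_redirect` is hypothetical, and the flag value `0` and the `NULL` extension GUID are my assumptions about a plain lookup. Outside Windows it just reports that the check is unavailable.

```c
#include <assert.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper. Returns 1 if the current activation context has a
   DLL redirection entry for dll_name, 0 if it does not, and -1 where the
   check is not available (any non-Windows platform). */
int sketch_has_dll_redirect(const char *dll_name)
{
    assert(dll_name != NULL);
#ifdef _WIN32
    {
        ACTCTX_SECTION_KEYED_DATA data;

        memset(&data, 0, sizeof(data));
        data.cbSize = sizeof(data);   /* the API requires the size to be set */
        return FindActCtxSectionStringA(
                   0, NULL, ACTIVATION_CONTEXT_SECTION_DLL_REDIRECTION,
                   dll_name, &data) ? 1 : 0;
    }
#else
    return -1;   /* activation contexts only exist on Windows */
#endif
}
```

Calling `sketch_has_dll_redirect("msvcr90.dll")` from the host process before and after loading an extension module would show whether the context in effect at that point knows about the CRT.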

    What's bizarre is that python26.dll successfully loaded msvcr90.dll (the global one from WinSxS), so it must have passed _check_manifest. It seems like the activation context consists of the DLL's manifest (_socket.pyd's in this case) and the exe's manifest, but no other ones, regardless of what other libraries have been loaded. The documentation doesn't seem to explain the interaction between manifests in different modules. It's also annoying that this restriction (the _check_manifest call) is completely artificial.

    I don't know if any of this information is useful (I'm only superficially familiar with activation contexts and the functions to manipulate them). The code is all in crtlib.c (in the CRT source) if someone else wants to take a look.

    mhammond commented 15 years ago

    I've hacked together something that fixes the problem. I'm working on making it a real patch, but the basis is:

    This makes the "import socket" test case work for me. I'm not sure what the impact will be should the .pyd file reference other assemblies. Also, the calls above all need to move to function pointer calls as the functions only exist on Vista and later (hence the slight delay in making a real patch)
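As a rough illustration of the general technique under discussion (not the attached patch), the sketch below captures an activation context once, then makes it current around a later DLL load, looking the SxS functions up at run time so that a missing API degrades to a no-op. All `sketch_*` names are hypothetical, and calling the capture step while the embedding DLL's own context is current (for example from its DllMain) is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>

typedef ULONG_PTR sketch_cookie;
typedef BOOL (WINAPI *get_ctx_fn)(HANDLE *);
typedef BOOL (WINAPI *activate_fn)(HANDLE, ULONG_PTR *);
typedef BOOL (WINAPI *deactivate_fn)(DWORD, ULONG_PTR);

static HANDLE saved_ctx = NULL;            /* context captured at startup */
static activate_fn p_activate = NULL;      /* stays NULL if the API is missing */
static deactivate_fn p_deactivate = NULL;

/* Call once while the desired activation context is current. */
void sketch_capture_context(void)
{
    HMODULE k32 = GetModuleHandleA("kernel32.dll");
    get_ctx_fn p_get;

    if (k32 == NULL)
        return;
    p_get = (get_ctx_fn)GetProcAddress(k32, "GetCurrentActCtx");
    p_activate = (activate_fn)GetProcAddress(k32, "ActivateActCtx");
    p_deactivate = (deactivate_fn)GetProcAddress(k32, "DeactivateActCtx");
    if (p_get == NULL || !p_get(&saved_ctx))
        saved_ctx = NULL;
}

/* Returns a cookie for sketch_deactivate, or 0 if nothing was activated. */
sketch_cookie sketch_activate(void)
{
    ULONG_PTR cookie = 0;

    if (saved_ctx != NULL && p_activate != NULL
            && !p_activate(saved_ctx, &cookie))
        cookie = 0;   /* do not trust the value after a failed call */
    return cookie;
}

void sketch_deactivate(sketch_cookie cookie)
{
    if (cookie != 0 && p_deactivate != NULL)
        p_deactivate(0, cookie);
}

#else  /* not Windows: there is nothing to activate */

typedef uintptr_t sketch_cookie;

void sketch_capture_context(void) { }

sketch_cookie sketch_activate(void) { return 0; }

void sketch_deactivate(sketch_cookie cookie)
{
    assert(cookie == 0);   /* sketch_activate never hands out a real cookie here */
}

#endif
```

A loader would then bracket its `LoadLibraryEx` call with `sketch_activate()` and `sketch_deactivate()`, so the extension module's CRT dependency is resolved against the saved context rather than the host executable's.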

    mhammond commented 15 years ago

    Attaching a patch which works for me against python 2.6. Only ever tested on Vista (ie, where the function pointers etc all load)

    mhammond commented 15 years ago

    Attaching a new patch with some typos in the comments corrected. While I'm here, I also meant to mention (again!):

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    I haven't been able to try this patch myself yet, but I see a potential problem: the "cookie" variable is declared as a DWORD, while ActivateActCtx expects a ULONG_PTR. DWORD and ULONG_PTR are only the same thing in 32-bit Windows.
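To make the width concern concrete, here is a small portable sketch. The typedefs `dword_t` and `ulong_ptr_t` are stand-ins I am assuming for the Windows `DWORD` (always 32 bits) and `ULONG_PTR` (as wide as a pointer) types, and the function name is hypothetical; none of this is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins (assumptions) for the Windows types, so this compiles anywhere:
   DWORD is always 32 bits, ULONG_PTR is as wide as a pointer. */
typedef uint32_t  dword_t;
typedef uintptr_t ulong_ptr_t;

/* Returns 1 if a cookie survives being stored in a 32-bit variable,
   0 if the round trip loses its upper bits. */
int cookie_survives_dword(ulong_ptr_t cookie)
{
    dword_t narrowed;

    /* The sketch assumes pointers are at least 32 bits wide. */
    assert(sizeof(ulong_ptr_t) >= sizeof(dword_t));

    narrowed = (dword_t)cookie;   /* what a DWORD variable would hold */
    return (ulong_ptr_t)narrowed == cookie;
}
```

On a 32-bit build every cookie survives, which is why the mismatch goes unnoticed there; on a 64-bit build any cookie with bits set above the low 32 would come back truncated.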

    Also, where are you seeing that these SxS functions are "Vista or later"? My XP kernel32.dll has all of them. MSDN says they're XP or Vista.

    7e6c45ab-cd55-40b9-9429-49bb488ca339 commented 15 years ago

    The patch works fine on my system (32-bit XP). Also I verified in Process Explorer that there's only one instance of msvcr90.dll loaded.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 15 years ago

    > Here's an option, though unfortunately not a trivial one: use a private build of the C runtime.

    I'm not sure whether the license allows us to do so.

    mhammond commented 15 years ago

    Uploading a corrected patch; some old docs I saw said DWORD, and I obviously neglected to fix every spot I used that, and I've changed 'Vista' to 'XP' in the patch comments.

    bd064da6-b63b-4835-b09f-98ffa16aa771 commented 15 years ago

    We encountered a problem where a Windows installer created with 2.5.2 could not be installed with 2.6.1, because executing the post-install script fails. More details about the reason can be found at http://code.google.com/p/robotframework/issues/detail?id=196. Is the problem related to this issue or should I create a new issue about that? I can also provide a simple project that could be used to test this out.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 15 years ago

    > Is the problem related to this issue or should I create a new issue about that?

    You should assume that it is not related.

    > I can also provide a simple project that could be used to test this out.

    That would be necessary.

    a911d6ea-f979-42bf-9190-9549b9ba0dc7 commented 15 years ago

    Over 3 years ago I wrote a plugin for the Xplane Flight Simulator. This uses an SDK that Ben Supnik and I created 6 years ago. Our plugins are DLLs that our plugin manager DLL loads at run time.

    The plugin embeds python and allows python scripts to be run from within Xplane. This has worked fine with python 2.3.x, 2.4.x, 2.5.x and 2.6.0. I see the same problem as described here when using 2.6.1 with the ctypes, socket libraries etc. This also affects PyOpenGL as it uses ctypes.

    I applied the patch that Mark posted and rebuilt python26.dll. I tested this on Windows XP and Windows 7 Beta. With the patched python26.dll my scripts now load properly. The script that uses PyOpenGL also works fine now.

    I thought that this feedback may be useful.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 15 years ago

    Mark, the patch is fine, please apply. Great work!

    mhammond commented 15 years ago

    ack - I mis-clicked and accidentally removed message78811 and can't see how to reinstate it. The message isn't critical, but I'm sorry about that!

    mhammond commented 15 years ago

    Checked in to the trunk as r69038 and svnmerge'd:

    ded0c262-38bc-4cf4-afdc-7eeb2135d11e commented 15 years ago

    After some Googling, I found a possible solution:

    http://lists.wxwidgets.org/pipermail/wxpython-users/2008-November/081981.html

    71caead8-e0d0-4e76-afe7-f02fa56c30ec commented 12 years ago

    For some reason this patch is disabled with #ifdefs when Py_ENABLE_SHARED is not defined.

    I'm compiling python as a static lib and have unresolved externals for both these functions: _Py_ActivateActCtx, _Py_DeactivateActCtx