enthought / comtypes

A pure Python, lightweight COM client and server framework, based on the ctypes Python FFI package.

Does Microsoft Word need to be installed to work with Word documents? #495

Closed vonyang closed 1 year ago

vonyang commented 1 year ago

I have this question because it says that it's not necessary to have an MS application installed, yet I get an error saying that I need the application. So I'd like to know more about it. Thanks.

junkmd commented 1 year ago

I don't know the specifics of what you were trying to do, but comtypes can only utilize the functionality of COM libraries available in the execution environment.

In most environments, scrrun.dll and stdole2.tlb exist, so you can use the functionality of these COM libraries with comtypes. To use the COM library features of MS applications like Excel or Word, those applications must be installed.

Out of curiosity, where did you come across the description that "it's not necessary to have installed a MS application"? Perhaps it was meant to convey that "even without MS applications being installed, you can still use the functionality of other COM libraries".

vonyang commented 1 year ago

So, can I convert a Word document with the .docx extension to PDF without having MS Word? If I can, could you explain how, or how to check whether I have those files (scrrun.dll and stdole2.tlb)? I got that description because other libraries, like docx2pdf (which says so) or pywin32 (which doesn't say so, but you still need it installed), require the MS application to be installed.

junkmd commented 1 year ago

No, you can't use the Word COM library unless the MS Word app is installed on your machine.

You can determine whether comtypes can access COM objects to manipulate a specific application by passing the VersionIndependentProgID (like 'Excel.Application') to comtypes.client.CreateObject. If a COM object for that application is available in your environment, a POINTER will be returned.

The following is a demonstration in my environment where Word is installed, but Visio is not.

>>> import comtypes.client
>>> comtypes.client.CreateObject('Word.Application') 
<POINTER(_Application) ptr=0x18df5bd17c8 at 18df5a28ad0>
>>> comtypes.client.CreateObject('Visio.Application') 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\comtypes\client\__init__.py", line 241, in CreateObject
    clsid = GUID.from_progid(progid)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\comtypes\GUID.py", line 86, in from_progid
    _CLSIDFromProgID(text_type(progid), byref(inst))
  File "_ctypes/callproc.c", line 1000, in GetResult
OSError: [WinError -2147221005] Invalid class string

I mentioned scrrun.dll and stdole2.tlb simply as examples of COM libraries that can be used without installing any special application. It doesn't mean that you can manipulate Word using scrrun.dll or stdole2.tlb without installing Word itself.
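
If you only want a yes/no answer without starting the application, the check could be sketched roughly like this. It relies on `GUID.from_progid`, the same call that fails in the traceback above. The function name `is_progid_registered` and its `lookup` parameter are hypothetical, made up for this illustration (the parameter is only there so the logic can be exercised without Windows). Note that a registered ProgID suggests the application is installed, but it does not guarantee that creating the object will succeed.

```python
def is_progid_registered(progid, lookup=None):
    """Return True if `progid` resolves to a CLSID, False otherwise.

    `lookup` is a hypothetical hook for testing. By default the function
    uses comtypes' GUID.from_progid, the call seen in the traceback.
    """
    if lookup is None:
        # Imported lazily because comtypes can only be imported on Windows.
        from comtypes import GUID
        lookup = GUID.from_progid
    try:
        lookup(progid)
    except OSError:
        # e.g. [WinError -2147221005]: the ProgID is not registered.
        return False
    return True
```

On a Windows machine, `is_progid_registered('Word.Application')` would then tell you whether the Word COM library is registered there.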

vonyang commented 1 year ago

Thanks for the explanation. So basically you need MS Word to convert a .docx document into a .pdf document without losing the styles, images, etc. I tried other methods, like textract, pypandoc, unoconv, mammoth and html2pdf, but all of them only convert the text, without any styling.
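
For anyone landing here later, a conversion through Word's COM interface might look roughly like the sketch below. It is not code from this thread: it assumes MS Word is installed, and `docx_to_pdf`, `WD_FORMAT_PDF` and the `create_object` parameter are hypothetical names chosen for the illustration (the parameter exists only so the flow can be tested without Word). The value 17 is Word's `wdFormatPDF` save format.

```python
import os

WD_FORMAT_PDF = 17  # value of Word's wdFormatPDF constant


def docx_to_pdf(docx_path, pdf_path, create_object=None):
    """Convert a .docx file to PDF by driving MS Word through COM.

    `create_object` is a hypothetical hook for testing; by default
    comtypes.client.CreateObject is used, which requires Windows and Word.
    """
    if create_object is None:
        # Imported lazily because comtypes can only be imported on Windows.
        import comtypes.client
        create_object = comtypes.client.CreateObject

    word = create_object("Word.Application")
    try:
        # Word resolves relative paths against its own working directory,
        # so pass absolute paths.
        doc = word.Documents.Open(os.path.abspath(docx_path))
        try:
            doc.SaveAs(os.path.abspath(pdf_path), FileFormat=WD_FORMAT_PDF)
        finally:
            doc.Close()
    finally:
        # Always shut Word down, even if opening or saving failed.
        word.Quit()
```

The nested `try`/`finally` blocks are there so that a failed conversion does not leave a hidden Word process running.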