microsoft / Simplify-Docx

Simplify DOCX files to JSON
MIT License
219 stars, 46 forks

AttributeError: 'NoneType' object has no attribute 'abstractNumId' #29

Open weiwei0519 opened 1 year ago

weiwei0519 commented 1 year ago

When I parse a docx file, I get the exception below. How can I fix it? Thanks.

  File "C:/WeiWei/sourcecode/Python/gpt-doc-qa/loader/parse_docx.py", line 216, in main
    doc_org_dict = simplify(doc)
  File "C:\Users\nsnp577\Anaconda3\envs\gpt-doc-qa\lib\site-packages\simplify_docx\__init__.py", line 33, in simplify
    out = document(doc.element).to_json(doc, _options)
  File "C:\Users\nsnp577\Anaconda3\envs\gpt-doc-qa\lib\site-packages\simplify_docx\elements\base.py", line 106, in to_json
    "VALUE": [ elt.to_json(doc, options) for elt in self],
  File "C:\Users\nsnp577\Anaconda3\envs\gpt-doc-qa\lib\site-packages\simplify_docx\elements\base.py", line 106, in <listcomp>
    "VALUE": [ elt.to_json(doc, options) for elt in self],
  File "C:\Users\nsnp577\Anaconda3\envs\gpt-doc-qa\lib\site-packages\simplify_docx\elements\body.py", line 25, in to_json
    JSON = elt.to_json(doc, options, iter_me)
  File "C:\Users\nsnp577\Anaconda3\envs\gpt-doc-qa\lib\site-packages\simplify_docx\elements\paragraph.py", line 167, in to_json
    _indent = get_paragraph_ind(self.fragment, doc)
  File "C:\Users\nsnp577\Anaconda3\envs\gpt-doc-qa\lib\site-packages\simplify_docx\utils\paragrapy_style.py", line 54, in get_paragraph_ind
    num_style = get_num_style(p, doc)
  File "C:\Users\nsnp577\Anaconda3\envs\gpt-doc-qa\lib\site-packages\simplify_docx\utils\paragrapy_style.py", line 28, in get_num_style

asmaier commented 8 months ago

I see the same error when trying to parse a Word document (I'm using python-docx 1.1.0 and simplify-docx 0.1.2).


simplify_docx/utils/paragrapy_style.py:28, in get_num_style(p, doc)
     25 # the map between numbering id and the numbering style
     26 num = np.element.find("w:num[@w:numId='%s']" % p.pPr.numPr.numId.val,
     27                        np.element.nsmap)
---> 28 _path = "w:abstractNum[@w:abstractNumId='%s']" % num.abstractNumId.val
     29 # the numbering styles themselves
     30 abstractNumbering = np.element.find(_path, np.element.nsmap)

AttributeError: 'NoneType' object has no attribute 'abstractNumId'
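
Reading the traceback, `num` is `None` at line 28, so the `find` call on lines 26-27 found no `w:num` element whose `w:numId` matches the paragraph's `numId`. One plausible cause is a paragraph that references a numbering id missing from the numbering part (for example `numId` 0, which WordprocessingML uses to switch numbering off). Below is a minimal sketch of a guarded lookup using only the standard library, not the library's actual code or an official fix. The function name `find_abstract_num_id` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical numbering part that defines only numId 1.
NUMBERING_XML = f"""
<w:numbering xmlns:w="{W_NS}">
  <w:abstractNum w:abstractNumId="0"/>
  <w:num w:numId="1"><w:abstractNumId w:val="0"/></w:num>
</w:numbering>
"""


def find_abstract_num_id(numbering_root, num_id):
    """Return the abstractNumId value for num_id, or None if nothing matches."""
    for num in numbering_root.findall("w:num", NS):
        if num.get(f"{{{W_NS}}}numId") == str(num_id):
            ref = num.find("w:abstractNumId", NS)
            # Guard against a w:num without a w:abstractNumId child.
            return None if ref is None else ref.get(f"{{{W_NS}}}val")
    # No w:num with this id: report "no numbering" instead of raising.
    return None


root = ET.fromstring(NUMBERING_XML.strip())
print(find_abstract_num_id(root, 1))  # numId defined in the sample XML
print(find_abstract_num_id(root, 7))  # numId missing from the sample XML
```

A caller following this pattern would treat a `None` result as an unnumbered paragraph rather than dereferencing it, which is the step that fails in the traceback above.
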

shenxiaochenn commented 7 months ago

In fact, I have the same issue!