elapouya / python-docx-template

Use a docx as a jinja2 template
GNU Lesser General Public License v2.1

Render silently fails when encountering "<NA>" from string[pyarrow] dtype #431

Open arkanoid87 opened 2 years ago

arkanoid87 commented 2 years ago

Describe the bug

I was getting truncated output when rendering a large table, and I found out it was caused by missing values in a column with dtype string[pyarrow].

The problem doesn't happen with other string representations.

As a workaround, I detect missing values with pd.isna and replace them before rendering:

if pd.isna(record['key']):
    record['key'] = 'placeholder'
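
The same check can be applied to every field of every record before the context is passed to render. The following is only a minimal sketch of that idea: the helper name fill_missing and its placeholder argument are made up for illustration and are not part of docxtpl or pandas.

```python
import pandas as pd


def fill_missing(records, placeholder="placeholder"):
    """Return copies of the record dicts with missing scalars replaced.

    Hypothetical helper: not part of docxtpl or pandas.
    """
    cleaned = []
    for record in records:
        cleaned.append({
            # Only test scalars; pd.isna on a list or array returns an array.
            key: placeholder
            if pd.api.types.is_scalar(value) and pd.isna(value)
            else value
            for key, value in record.items()
        })
    return cleaned


# Records shaped like the output of df.to_dict(orient="records").
records = [{"missing": "1"}, {"missing": pd.NA}, {"missing": "3"}]
cleaned = fill_missing(records, placeholder="nan")
```

The original records are left untouched, so the cleaned copy can be used only for rendering.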

To Reproduce

from docxtpl import DocxTemplate
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "missing": pd.Series(["1", np.nan, "3"], dtype="string[pyarrow]")
})

###
# Content of experimental_template.docx:
# {%p for item in contents %}
# {{ item['missing'] }}
# {%p endfor %}
###
template_path = "experimental_template.docx"
doc = DocxTemplate(template_path)
context = { 'contents': df.to_dict(orient='records')}
doc.render(context)
doc.save("experimental_result.docx")

Expected behavior

1
nan
3

Actual output

1

elapouya commented 2 years ago

I think this is more of a Jinja2 problem; you should report the bug to them.

elapouya commented 2 years ago

More precisely, Jinja2 will render the missing value in your string[pyarrow] column not as 'nan' but probably as something that is not an XML-compatible string (i.e. <object NaN> or something like that). So the workaround would be to convert your string[pyarrow] values into plain strings.

arkanoid87 commented 2 years ago

Exactly. It outputs <NA> as the string representation of nan.

elapouya commented 2 years ago

Did you try rendering with autoescape=True?

On Sat, Apr 23, 2022 at 10:00, arkanoid87 @.***> wrote:

exactly. It outputs <NA> as string representation of nan

wangchenguang123 commented 1 year ago

I ran into the same problem in my project, and the author's suggestion solves it: render with autoescape=True. Thanks a lot.
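
To illustrate why escaping helps, here is a small standard-library sketch. It assumes, as suggested earlier in the thread, that the rendered value ends up inside the document XML unchanged; the element name t and the helper is_well_formed are placeholders for illustration and are not docxtpl code.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML (illustrative helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


value = "<NA>"  # string form of a missing string[pyarrow] value

# Inserted as-is, "<NA>" is read as an opening tag that is never closed.
raw = "<t>" + value + "</t>"

# Escaped, the same value is plain text: &lt;NA&gt;
escaped = "<t>" + escape(value) + "</t>"

print(is_well_formed(raw))
print(is_well_formed(escaped))
print(ET.fromstring(escaped).text)
```

With the value escaped, the fragment parses and the text content is still the literal <NA>, which is presumably why rendering with autoescape=True avoids the truncated output.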