jmcnamara / XlsxWriter

A Python module for creating Excel XLSX files.
https://xlsxwriter.readthedocs.io
BSD 2-Clause "Simplified" License
3.65k stars 633 forks source link

Bug: file corrupt while writing with pandas #962

Closed buzMartin closed 1 year ago

buzMartin commented 1 year ago

Current behavior

I have a workbook contains strings like below: "=======new line=====" Then use pandas read this workbook and write with xlsxwriter. The file turns out corrupt. I have tested many times, if I use xlsxwriter as pandas engine, it cannot write this kind of string in the right way. Of course I can manipulate the dataframe and add a single quote at the begining for each strings that start with "==", but it's not a good choice.

Expected behavior

If some strings start with "==", I hope the code below still works and will give me an excel file that not corrupted.

writer = pd.ExcelWriter(r'path.xlsx',engine='xlsxwriter')
df.to_excel(writer,sheet_name='Sheet1')
writer.book.close()

Sample code to reproduce

import xlsxwriter
import pandas as pd
df = pd.DataFrame(data={"a":["====new line===",""],"b":["",""]})
writer = pd.ExcelWriter(r'test.xlsx')
df.to_excel(writer)
writer.book.close()

Then open the test.xlsx, excel will alert that this file is corrupted.

Environment

- XlsxWriter version: 3.0.8
- Python version:3.11.0
- Excel version: microsoft 365
- OS: Windows 10

Any other information

No response

OpenOffice and LibreOffice users

jmcnamara commented 1 year ago

Thanks for the bug report.

You can turn off the XlsxWriter option to automatically convert strings starting with "=" to into formulas by using the strings_to_formulas Constructor option.

You can pass it to pandas like this:

import pandas as pd

df = pd.DataFrame(data={"a": ["====new line===", ""], "b": ["", ""]})

writer = pd.ExcelWriter('test.xlsx',
                        engine='xlsxwriter',
                        engine_kwargs={'options': {'strings_to_formulas': False}})

df.to_excel(writer)

writer.book.close()

Output:

screenshot

See also: https://xlsxwriter.readthedocs.io/working_with_pandas.html#passing-xlsxwriter-constructor-options-to-pandas

buzMartin commented 1 year ago

Thanks! That's great