eea / odfpy

API for OpenDocument in Python
GNU General Public License v2.0

Linebreaks not picked up #114

Open baloe opened 2 years ago

baloe commented 2 years ago

Linebreaks within a text cell are not read.

Here's a small example (screenshot: doc_content). I saved this sheet as doc.ods and doc.xlsx.

The following script

#!/usr/bin/env python3

import pandas as pd

# Read the OpenDocument sheet through odfpy
print('\nRead with odf:')
data = pd.read_excel('doc.ods', engine='odf')
print(data)

# Read the same sheet, saved as xlsx, through openpyxl
print('\nRead with openpyxl:')
data = pd.read_excel('doc.xlsx', engine='openpyxl')
print(data)

prints

Read with odf:
                   testdata
0  cell without a linebreak
1    cell with a line break

Read with openpyxl:
                   testdata
0  cell without a linebreak
1  cell with \na line break

so the pandas DataFrame produced through odf lacks the newline character \n.

Versions:

# Name                    Version                   Build  Channel
pandas                    1.3.5            py38h43a58ef_0    conda-forge
odfpy                     1.4.1                      py_0    conda-forge

Obsnold commented 2 years ago

I just had the same issue. Here is an example just using odfpy

#!/usr/bin/env python3

import sys
from odf.opendocument import load
from odf.table import Table, TableRow, TableCell

infile = sys.argv[1]
doc = load(infile)

# First cell of the third row of the first table (the cell with the line break)
cell = doc.getElementsByType(Table)[0].getElementsByType(TableRow)[2].getElementsByType(TableCell)[0]

print(cell)

Using the same spreadsheet as above you get the output:

cell with A line break
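
For anyone who wants to see where the newline could get lost, here is a minimal standard-library sketch. It is not odfpy's or pandas' code. It assumes the cell stores the line break either as two sibling text:p paragraphs or as a text:line-break element inside one paragraph; which form a given file uses depends on the application that wrote it. The SAMPLE_* strings are hand-written stand-ins for a cell from content.xml, and paragraph_text and cell_text are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Standard OpenDocument namespace URIs
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
NS_ATTRS = f'xmlns:table="{TABLE_NS}" xmlns:text="{TEXT_NS}"'


def paragraph_text(elem):
    """Collect the text under elem, turning <text:line-break/> into '\n'.

    Other whitespace elements (text:s, text:tab) are not expanded here.
    """
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == f"{{{TEXT_NS}}}line-break":
            parts.append("\n")
        else:
            parts.append(paragraph_text(child))
        parts.append(child.tail or "")
    return "".join(parts)


def cell_text(cell):
    """Join the cell's paragraphs with '\n' instead of concatenating them."""
    paragraphs = cell.findall(f"{{{TEXT_NS}}}p")
    return "\n".join(paragraph_text(p) for p in paragraphs)


# Assumed form 1: one paragraph per line of the cell
SAMPLE_PARAGRAPHS = (
    f"<table:table-cell {NS_ATTRS}>"
    "<text:p>cell with </text:p><text:p>a line break</text:p>"
    "</table:table-cell>"
)

# Assumed form 2: an explicit line-break element inside one paragraph
SAMPLE_LINE_BREAK = (
    f"<table:table-cell {NS_ATTRS}>"
    "<text:p>cell with <text:line-break/>a line break</text:p>"
    "</table:table-cell>"
)

for sample in (SAMPLE_PARAGRAPHS, SAMPLE_LINE_BREAK):
    print(repr(cell_text(ET.fromstring(sample))))
```

Both samples come out as 'cell with \na line break', matching what openpyxl returns for the xlsx file. Concatenating the paragraphs without a separator gives 'cell with a line break' instead, which is consistent with the odf output reported above.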

achaiah commented 2 years ago

Yes, same issue here. Is there a fix?

Tuhin-thinks commented 6 months ago

A cell containing "Index Col\nNext Line" is read as "Index ColNext Line".

Checked in version: 1.4.1

buhtz commented 6 months ago

Please see #123 about the project status. The project is nearly orphaned.

Tuhin-thinks commented 6 months ago

Sorry to see that this project has gone stale.

Will explore jdum/odfdo as referenced in #123

Thanks 👍🏻

Icemole commented 8 hours ago

Thanks for the report, the issue was driving me crazy. Sadly, odfdo doesn't seem to be a drop-in replacement for odfpy. I'll stick to using openpyxl for now. Thank you everyone who contributed to the thread!