Open joernhees opened 4 years ago
I looked at the ods files provided by @joernhees and discovered the
It looks like https://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html#__RefHeading__1415200_253892949 is the relevant documentation of what to do.
While debugging https://github.com/pandas-dev/pandas/issues/32207, I now think that this probably belongs here somewhere...
Code Sample
Create a new spreadsheet with 1 column "testcol" in LibreOffice / OpenOffice & Excel, save as test.ods / test.xlsx.
For simplicity, here as zip (1 ods, 1 xlsx): spreadsheets.zip
Problem description
When reading .ods files (OpenOffice or LibreOffice), multiple spaces are collapsed into one, leading ones are lost, and trailing ones are preserved.
Expected Output
See Excel output above.
debugging so far:
Digging into this, it seems that when pandas gets the cell's value here, it already receives a cell from which the original string can't be reconstructed. In the debugger it seems that the cell 4 spaces is actually already parsed into 3 child nodes, where the ' ' ends up as an Element that doesn't print its whitespace-only value when str(cell) is called:
Sadly, at this point I end up running into your parsing code... and I have to say that I'm lost...
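To illustrate the 3-child-node situation described above, here is a minimal sketch using only the standard library (not odfpy and not the pandas reader). It assumes the ODF convention that a run of spaces is stored as one literal space followed by a `<text:s text:c="N"/>` element. The XML fragment is hand-written for illustration and is not taken from the attached files, and `paragraph_text` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ODF text namespace, used by <text:p>, <text:s> and the text:c attribute.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hand-written fragment (an assumption about how such a cell may look):
# "4", one literal space, then three more spaces encoded as <text:s/>.
SAMPLE = f'<text:p xmlns:text="{TEXT_NS}">4 <text:s text:c="3"/>spaces</text:p>'


def paragraph_text(elem):
    """Rebuild the text of an element, expanding <text:s/> into spaces.

    Other child elements (for example text:span) are handled recursively;
    text:tab and text:line-break are not covered by this sketch.
    """
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == f"{{{TEXT_NS}}}s":
            # text:c is the space count; it defaults to 1 when absent.
            count = int(child.get(f"{{{TEXT_NS}}}c", "1"))
            parts.append(" " * count)
        else:
            parts.append(paragraph_text(child))
        # Text that follows the child inside the same parent.
        parts.append(child.tail or "")
    return "".join(parts)


cell = ET.fromstring(SAMPLE)

# Naive concatenation of text nodes skips the empty <text:s/> element.
naive = "".join(cell.itertext())
print(repr(naive))                 # '4 spaces'
print(repr(paragraph_text(cell)))  # '4    spaces'
```

The naive join shows the same symptom as in the report (the run of spaces shrinks to one), while expanding `text:s` by its count restores the original string. Whether a fix belongs in the odfpy string conversion or in the caller is left open here.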