qrilka / xlsx

Simple and incomplete Excel file parser/writer
MIT License
132 stars 64 forks source link

wrong parameter seperators? #174

Closed KeesBleijenberg closed 8 months ago

KeesBleijenberg commented 8 months ago

In Excel I have created a xlsx with in cell A1 the formula =SUM(3;4) which gives 7. When I read the formula in Codec.Xlsx I get "SUM(3,4)". The semicolon is replaced bij a comma. I thought maybe it has to do with the language of the xlsx. I created a new xlsx in a Dutch version of Office. I had to fill in =SOM(3;4) SOM is dutch for SUM. I had to separate the params by a semicolon and in Codec.Xlsx the formula is "SUM(3,4)". Consistent, but what is going on?

qrilka commented 8 months ago

Hi @KeesBleijenberg I'm not that available this week because of my vacation but adding some test files and code you're using could be helpful. By itself the library doesn't do anything (or at least shouldn't do anything) to transform what's is in the files themselves. In practice you could just open the file in question as a zip archive and see what's is in its XML.

KeesBleijenberg commented 8 months ago

Enjoy your vacation!

Hi @KeesBleijenberg I'm not that available this week because of my vacation but adding some test files and code you're using could be helpful. By itself the library doesn't do anything (or at least shouldn't do anything) to transform what's is in the files themselves. In practice you could just open the file in question as a zip archive and see what's is in its XML.

This is the xlsx test.xlsx

This is a screendump of creating the cell in a Dutch version of xlsx excelNL

The test program is below. The output is "SUM(3,4)"

module Main (main)

where

import Codec.Xlsx import qualified Data.ByteString.Lazy as L import Data.Map as M

main :: IO () main = do let fn = "/home/kees/testXlsx/test.xlsx" bs <- L.readFile fn let xlsx = toXlsx bs let ws = _wsCells $ snd $ head $ _xlSheets xlsx cell = ws M.! (RowIndex 1, ColumnIndex 1) :: Cell case _cellFormula cell of Just f -> case cellfExpression f of NormalFormula fTxt -> print $ unFormula fTxt -> error "error" Nothing -> error "error"

qrilka commented 8 months ago

@KeesBleijenberg in the file I see this formula with a comma:

$ unzip -p Downloads/test.xlsx xl/worksheets/sheet1.xml ; echo
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac"><dimension ref="A1"/><sheetViews><sheetView tabSelected="1" workbookViewId="0"/></sheetViews><sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/><sheetData><row r="1" spans="1:1" x14ac:dyDescent="0.25"><c r="A1"><f>SUM(3,4)</f><v>7</v></c></row></sheetData><pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/></worksheet>

And in Overview of formulas I read:

  1. Structure. The structure of a function begins with an equal sign (=), followed by the function name, an opening parenthesis, the arguments for the function separated by commas, and a closing parenthesis.

So it looks like Excel just corrects your not quite correct formula. I found only the following about semicolons on that page:

Separate values in different rows by using semicolons (;). For example, to represent the values 10, 20, 30, and 40 in one row and 50, 60, 70, and 80 in the row immediately below, you enter a 2-by-4 array constant: {10,20,30,40;50,60,70,80}.

This doesn't seem to have anything to do with the library itself as it's a "feature" of Excel itself. E.g. in LibreOffice I see "SUM(3,4)" when opening this file, the same in Office 365. And Google Sheets uses semicolon as a separator, so I see "SUM(3;4)" when opening this file. Office 365 shows "=SUM(1;2)" as invalid formula when I try to enter it - sum

KeesBleijenberg commented 8 months ago

I thought this had to do with the language of Office. But it's a consequence of the language settings in the OS. Look at https://edu.gcfglobal.org/en/excel-tips/understanding-regional-differences-in-excel/1/ Problem solved. Sorry for the confusion.

qrilka commented 8 months ago

NP, this helped me learn something new :) Not quite sure how useful this "feature" is though... Closing then.