qrilka / xlsx

Simple and incomplete Excel file parser/writer
MIT License
132 stars 64 forks source link

Proper support for cross-sheet references, was "Errors creating pivot tables in separate sheets" #160

Open LTibbetts opened 2 years ago

LTibbetts commented 2 years ago

I have followed the examples for pivot tables from the test directory and have been able to successfully generate an xlsx file with a working pivot table. However, when I attempt to move the pivot table to a separate sheet, the writer throws the following exception.

test-xlsx: Safe.fromJustNote Nothing, specified pivot table field does not exist
CallStack (from HasCallStack):
  fromJustNote, called at src/Codec/Xlsx/Writer/Internal/PivotTable.hs:153:11 in xlsx-0.8.2-I2QgKfEkcpi1sJcqZkt6qB:Codec.Xlsx.Writer.Internal.PivotTable

If there is a working example of pivot tables being written to a different sheet, I would love to see it. Any help would be appreciated.

The full code example of what I am currently trying is this with the pivot table in the first sheet and the data in the second sheet.


import Codec.Xlsx
import Control.Lens
import Control.Monad.State.Strict
import qualified Data.ByteString.Lazy as L
import qualified Data.Map as M
import Data.Time.Clock.POSIX

main :: IO ()
main = do
  ct <- getPOSIXTime
  let sheet1 = def {_wsPivotTables = [testPivotTable]}
  let sheet2 = def {_wsCells = testPivotSrcCells}
  let xlsx = def & atSheet "Sheet1" ?~ sheet1 & atSheet "Sheet2" ?~ sheet2
  L.writeFile "example.xlsx" $ fromXlsx ct xlsx

testPivotTable :: PivotTable
testPivotTable =
  PivotTable
    { _pvtName = "PivotTable1",
      _pvtDataCaption = "Values",
      _pvtLocation = CellRef "A1",
      _pvtSrcRef = CellRef "A1:D5",
      _pvtSrcSheet = "Sheet2",
      _pvtRowFields = [FieldPosition colorField],
      _pvtColumnFields = [],
      _pvtDataFields =
        [ DataField
            { _dfName = "Sum of Price",
              _dfField = priceField,
              _dfFunction = ConsolidateSum
            }
        ],
      _pvtFields =
        [ PivotFieldInfo (Just colorField) False FieldSortManual [],
          PivotFieldInfo (Just yearField) False FieldSortManual [],
          PivotFieldInfo (Just priceField) False FieldSortManual [],
          PivotFieldInfo (Just countField) False FieldSortManual []
        ],
      _pvtRowGrandTotals = True,
      _pvtColumnGrandTotals = False,
      _pvtOutline = False,
      _pvtOutlineData = False
    }
  where
    colorField = PivotFieldName "Color"
    yearField = PivotFieldName "Year"
    priceField = PivotFieldName "Price"
    countField = PivotFieldName "Count"

testPivotSrcCells :: CellMap
testPivotSrcCells =
  M.fromList $
    concat
      [ [((row, col), def & cellValue ?~ v) | (col, v) <- zip [1 ..] cells]
        | (row, cells) <- zip [1 ..] cellMap
      ]
  where
    cellMap =
      [ [CellText "Color", CellText "Year", CellText "Price", CellText "Count"],
        [CellText "green", CellDouble 2012, CellDouble 12.23, CellDouble 17],
        [CellText "white", CellDouble 2011, CellDouble 73.99, CellDouble 21],
        [CellText "red", CellDouble 2012, CellDouble 10.19, CellDouble 172],
        [CellText "white", CellDouble 2012, CellDouble 34.99, CellDouble 49]
      ]
qrilka commented 2 years ago

@LTibbetts so far xlsx was supporting references to the same sheet in most of the cases, in #157 @flhorizon added some rudimentary support for cross-sheet references but I think none of usages were made aware of them. Proper fix for this looks to require some effort (I guess we'd better make this visible on type level and not leave it to run time). I'm not sure I will have time soon to tackle this (at work I have switched to Rust some ago) but I will be glad to help discussing/reviewing the code.

LTibbetts commented 2 years ago

@qrilka Thanks for the quick reply! It is not important that I have the data and pivot table in different sheets, just a matter of stylistic preference, so I will continue with them in the same sheet. You may close the issue if you want to as I will likely not have the time to work on adding support for this, or leave it open for future users to find easily.

qrilka commented 2 years ago

I think we should leave it as something that's better be fixed, thanks

flhorizon commented 2 years ago

Hey

I didn't look into how PivotTable are actually meant to work, and whether cross-sheet CellRef should be involved.

But regarding that error, it seems like writer accumulator functions like generateCache should be able to access 2 cellMaps: the one of _pvtSrcSheet and the cellMap from the sheet that contains it.

The fact there's a single _pvtSrcSheet \<worksheetSource sheet="" ref=""> hints IMO the _pvtSrcRef is shaped like a local, relative ref

(looks like pivot tables references are either self->self or self->1 other).

All that to say there might be a quick win without changing the expectations about the cellref format.

I'd need to dissect a file and reads the specs though to be sure 😄


About refining the CellRef type, I've thought a bit about them while making the tests, could be:

But I digress, that's stuff for another issue