nacnudus / tidyxl

Read untidy Excel files in R https://nacnudus.github.io/tidyxl/
https://nacnudus.github.io/tidyxl/
Other
248 stars 21 forks source link

Identifying pivot tables in sheets #81

Open sch56 opened 2 years ago

sch56 commented 2 years ago

Perhaps I'm missing something, but it would be great to be able to identify a pivot table within a sheet. It appears that the cell formatting of cells in the pivot table (as read by xlsx_cells and the corresponding xls_formats) are not held with the cell, but reflects the sheet 'underneath' the pivot table. There do not appear to be any 'names' or 'formulas' associated with the pivot table, so it is hard to detect these from the data content.

It would be great to be able to read in information or even simply location references to pivot tables.

nacnudus commented 2 years ago

Thanks for the suggestion.

Notes to self

Docs are in ECMA part I, from page 1819.

Each worksheet has a file in xl/worksheets/_rels, e.g. xl/worksheets/_rels/sheet1.xml.rels. This points to any files that describe pivot tables.

<?xml version="1.0" encoding="UTF-8"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/pivotTable" Target="../pivotTables/pivotTable1.xml"/>
</Relationships>

A pivot table file xl/pivotTables/pivotTable1.xml describes its position in the sheet (but doesn't name the sheet).

<pivotTableDefinition xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" name="DataPilot1" cacheId="1" applyNumberFormats="0" applyBorderFormats="0" applyFontFormats="0" applyPatternFormats="0" applyAlignmentFormats="0" applyWidthHeightFormats="0" dataCaption="Values" useAutoFormatting="0" itemPrintTitles="1" indent="0" outline="0" outlineData="0" compact="0" compactData="0">
<location ref="D11:E14" firstHeaderRow="1" firstDataRow="1" firstDataCol="1"/>
<pivotFields count="2">
<pivotField axis="axisRow" compact="0" showAll="0" defaultSubtotal="0" outline="0">
<items count="2">
<item x="0"/>
<item x="1"/>
</items>
</pivotField>
<pivotField dataField="1" compact="0" showAll="0" outline="0"/>
</pivotFields>
<rowFields count="1">
<field x="0"/>
</rowFields>
<dataFields count="1">
<dataField name="Sum - y" fld="1" subtotal="sum" numFmtId="164"/>
</dataFields>
<pivotTableStyleInfo name="PivotStyleLight16" showRowHeaders="1" showColHeaders="1" showRowStripes="0" showColStripes="0" showLastColumn="1"/>
</pivotTableDefinition>

pivot.xlsx