singerla / pptx-automizer

A template based pptx generator for Node.js
MIT License
61 stars 12 forks source link

Support for charts getting data from embedded excel sheets (rather than 'standard' charts) #102

Open MP70 opened 4 months ago

MP70 commented 4 months ago

setChartData throws

Error: Could not find file ppt/charts/undefinedundefined.xml@Recreated_Presentation_OUTO1.pptx at ArchiveJszip.<anonymous> (/Users/matt/dev/pptxtopptxgenjs/node_modules/pptx-automizer/dist/helper/archive/archive-jszip.js:57:23)

on these charts.

singerla commented 4 months ago

Hi @MP70! Can I help you on this? Please provide an example pptx file, I'm always interested in new varieties :)

MP70 commented 4 months ago

Sorry! As attached. So I think I have a solution but it is super super involved and therefore I'm not sure if it's best inside the library or not.

We have two options; 1/ Go edit the linked xlsx file 2/ Delete the linked xlsx file, and its refs element, then edit the chart structure to allow for embedding the data directly as normal.

Either are quite involved and potentially fragile.

Recreated_Presentation_OUTO1.pptx

MP70 commented 4 months ago

Example to get the data I am doing this ATM


  for (let element of elements) {
      let refElements = element.getElementsByTagName("c:numRef");
      if (refElements.length > 0) {
          let fElement = refElements[0].getElementsByTagName("c:f");
          if (fElement.length > 0) {
              let excelEmbedInfo = fElement[0].textContent;
              let extractedData = await this.extractExcelData(excelEmbedInfo, zip);
              data = data.concat(extractedData);
          } else {
              let valueElements = refElements[0].getElementsByTagName("c:v");
              let values = Array.from(valueElements, elem => parseFloat(elem.textContent));
              data = data.concat(values);
          }
      }
  }
  return data;
}

async extractExcelData(excelEmbedInfo, zip) {
  console.log(excelEmbedInfo)
  const excelFilePath = excelEmbedInfo.name

  if (excelFilePath) {
     const excelData = await zip.files[excelFilePath].async("nodebuffer");
     console.log(excelData);
      return this.parseExcelData(excelData);
  } else {
      console.error(`Excel file not found at path: ${excelFilePath}`);
      return [];
  }
}

parseExcelData(excelBuffer) {
  // currently using 'xlsx' library to parse Excel buffer, would prob need to add this dep..
  let workbook = XLSX.read(excelBuffer, {type: 'buffer'});
  let sheetName = workbook.SheetNames[0];
  let worksheet = workbook.Sheets[sheetName];
  let data = XLSX.utils.sheet_to_json(worksheet, {header:1});
  return data; // Flatten if the data structure is nested
}

that returns


  [ '', 'Series 1', 'Series 2', 'Series 3' ],
  [ 'Kategorie 1', 20, 20, 20 ],
  [ 'Kategorie 2', 10, 10, 10 ],
  [ 'Kategorie 3', 10, 10, 10 ],
  [ 'Kategorie 4', 10, 10, 10 ]
]```
singerla commented 4 months ago

The Error: Could not find file is basically caused by an absolute path in slide relations file:

<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/chart" Target="/ppt/charts/chart1.xml"/>

Please see my latest commit. Have a great week! :)

singerla commented 4 months ago

Hey @MP70, I've added an example to read data from a chart worksheets. I'm using a normal modifier to walk through rows and columns. I think it would be useful to have more "readers" in the future.

llermaly commented 1 month ago

Something to add to this discussion is the existing modifier will clean the underlying excel file but will not clean formulas, so if you modify a chart it will look fine, but then if you click "edit excel data" formulas will trigger, and if in the original file cells where references to elsewhere then the chart will break.

This approach gives room to control things like that?

singerla commented 1 month ago

You could try to invoke a library like node-xlsx or the underlying sheetjs if you need more complex manipulation of xlsx worksheets. You just need to override the worksheet after automation. In my use cases, I always need to remove formulas on preparing a chart and do calculations elsewhere.