OfficeDev / Open-Xml-PowerTools

MIT License

Can this library compare Excel files and PDF files? #250

Closed taiyangluoshan88 closed 5 years ago

taiyangluoshan88 commented 6 years ago

I want to compare Excel and PDF files, but I do not know whether this library supports it.

MalcolmJohnston commented 6 years ago

Hello,

This library does not have any functionality for comparing XLSX files with PDFs generated from XLSX.

I do not know the ins and outs of the PDF format, but you would probably have better luck finding something that converts the PDF back to its source format first.

taiyangluoshan88 commented 6 years ago

Hi MalcolmJohnston, what I mean is: I wonder whether this library can compare one Excel file with another Excel file, or one PDF file with another PDF file. I do not need to compare a PDF file with an Excel file.

MalcolmJohnston commented 6 years ago

Hi @taiyangluoshan88 ,

Thanks for getting back to me. From the README for this project :)

It supports scenarios such as:

  • Splitting DOCX/PPTX files into multiple files.
  • Combining multiple DOCX/PPTX files into a single file.
  • Populating content in template DOCX files with data from XML.
  • High-fidelity conversion of DOCX to HTML/CSS.
  • High-fidelity conversion of HTML/CSS to DOCX.
  • Searching and replacing content in DOCX/PPTX using regular expressions.
  • Managing tracked-revisions, including detecting tracked revisions, and accepting tracked revisions.
  • Updating Charts in DOCX/PPTX files, including updating cached data, as well as the embedded XLSX.
  • Comparing two DOCX files, producing a DOCX with revision tracking markup, and enabling retrieving a list of revisions.
  • Retrieving metrics from DOCX files, including the hierarchy of styles used, the languages used, and the fonts used.
  • Writing XLSX files using far simpler code than directly writing the markup, including a streaming approach that enables writing XLSX files with millions of rows.
  • Extracting data (along with formatting) from spreadsheets.

So there isn't anything out of the box for comparing spreadsheets (or PDFs). You might want to look at the Open XML SDK and see what objects are exposed for working with spreadsheets. If the spreadsheets you are looking to compare have similar structures, then you might be able to build some simple code on top of that.
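To give a rough idea of what "simple code" could look like for two similarly structured workbooks, here is a minimal sketch. It is not PowerTools or Open XML SDK code: it is plain Python using only the standard library, reading the XLSX package (a ZIP of XML parts) directly. The helper names `read_cells` and `diff_cells` are made up for illustration. It assumes the sheet of interest is stored at `xl/worksheets/sheet1.xml` (the usual name, but not guaranteed; a robust version would resolve it through the workbook relationships), that every cell carries an `r` reference attribute, and that comparing stored values is enough. For formula cells that means the cached result, not the formula. Formatting and PDF files are not covered at all.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet and shared-string parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_cells(xlsx, sheet_part="xl/worksheets/sheet1.xml"):
    """Return {cell reference: stored text} for one worksheet part.

    `xlsx` is a path or a binary file-like object. `sheet_part` is an
    assumption: the conventional location of the first worksheet.
    """
    with zipfile.ZipFile(xlsx) as package:
        shared = []
        if "xl/sharedStrings.xml" in package.namelist():
            sst = ET.fromstring(package.read("xl/sharedStrings.xml"))
            for si in sst.findall("m:si", NS):
                # Concatenate all text runs of a (possibly rich-text) string.
                shared.append("".join(t.text or "" for t in si.iterfind(".//m:t", NS)))
        sheet = ET.fromstring(package.read(sheet_part))

    cells = {}
    for c in sheet.iterfind("m:sheetData/m:row/m:c", NS):
        kind = c.get("t")
        v = c.find("m:v", NS)
        if kind == "s" and v is not None:
            # Shared string: <v> holds an index into the shared-string table.
            value = shared[int(v.text)]
        elif kind == "inlineStr":
            inline = c.find("m:is", NS)
            if inline is None:
                continue
            value = "".join(t.text or "" for t in inline.iterfind(".//m:t", NS))
        elif v is not None:
            # Numbers, booleans, cached formula results: keep the raw text.
            value = v.text or ""
        else:
            continue  # empty cell (style only)
        cells[c.get("r")] = value
    return cells


def diff_cells(left, right):
    """Yield (reference, left value, right value) for every differing cell.

    A value of None means the cell is absent on that side.
    """
    for ref in sorted(left.keys() | right.keys()):
        a, b = left.get(ref), right.get(ref)
        if a != b:
            yield ref, a, b
```

With two hypothetical files, `for ref, a, b in diff_cells(read_cells("old.xlsx"), read_cells("new.xlsx")): print(ref, a, b)` would list the cells whose stored values differ. Anything beyond that (inserted rows, moved ranges, formatting changes) needs real alignment logic, which is where it stops being simple.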

Cheers, Malcolm

ThomasBarnekow commented 6 years ago

Both the Open XML SDK and the PowerTools provide ways to read and write Open XML elements. The SDK provides strongly typed classes, and the PowerTools provide XNames for use with LINQ to XML.

Implementing your comparison will not be simple, though.

tomjebo commented 5 years ago

Closing all issues as this repo is being archived and will no longer be maintained by Microsoft. The project is licensed for continued use and development by forking to your own repo.