pjrinaldi / wombatforensics

linux c++, fox-toolkit, multi-threaded forensic gui tool
GNU General Public License v2.0
47 stars, 12 forks

Microsoft Office Viewers (doc/docx) (xls/xlsx) (ppt/pptx) #269

Open pjrinaldi opened 5 years ago

pjrinaldi commented 5 years ago

Implement viewers for doc, docx, xls, xlsx, ppt, pptx, etc.

pjrinaldi commented 5 years ago

I like the idea of using catdoc and docx2txt to dump the readable text and then display that content. I could also use xls2csv, catppt, and similar utilities to simply display the readable text. Images would be lost, but I could carve those out.

There is also Apache Tika, which is written in Java. I would have to call it from the command line to parse the files into something I could display.
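
A rough sketch of the catdoc idea above, assuming a POSIX system with `catdoc` on the PATH. The helper names `RunAndCapture`, `ShellQuote`, and `DumpDocText` are hypothetical and are not taken from the wombatforensics source; the sketch only shows one way to capture a tool's stdout so it can be handed to a plain text viewer.

```cpp
#include <array>
#include <cstdio>
#include <stdexcept>
#include <string>

// Run a shell command and return everything it writes to stdout.
std::string RunAndCapture(const std::string& command)
{
    FILE* pipe = popen(command.c_str(), "r");
    if(pipe == nullptr)
        throw std::runtime_error("popen failed for: " + command);
    std::string output;
    std::array<char, 4096> buffer;
    size_t count = 0;
    while((count = fread(buffer.data(), 1, buffer.size(), pipe)) > 0)
        output.append(buffer.data(), count);
    pclose(pipe);
    return output;
}

// Wrap a path in single quotes so spaces and shell metacharacters are harmless.
std::string ShellQuote(const std::string& path)
{
    std::string quoted = "'";
    for(char c : path)
    {
        if(c == '\'')
            quoted += "'\\''"; // close the quote, emit an escaped quote, reopen
        else
            quoted += c;
    }
    quoted += "'";
    return quoted;
}

// Hypothetical helper: dump the readable text of a legacy .doc via catdoc.
std::string DumpDocText(const std::string& docpath)
{
    return RunAndCapture("catdoc " + ShellQuote(docpath));
}
```

The same pattern would presumably work for xls2csv and catppt, since they also print to stdout, but that is untested here.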

pjrinaldi commented 1 year ago

I implemented code to pull out the readable text from a Word document and display it in the plain text viewer. I will do the same for pptx, xlsx, html, and every other artifact.
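
For illustration only, here is a minimal sketch of one way to get readable text out of a docx. It assumes `word/document.xml` has already been unzipped into a string, and it is not the code referred to above. The function name `ExtractDocxText` is hypothetical. It keeps the contents of `<w:t>` runs, adds a newline at the end of each `<w:p>` paragraph, and does not decode XML entities such as `&amp;`.

```cpp
#include <string>

// Collect the text inside <w:t ...>...</w:t> runs of a WordprocessingML
// document.xml string, emitting a newline at the end of each paragraph.
std::string ExtractDocxText(const std::string& xml)
{
    std::string text;
    size_t pos = 0;
    while(pos < xml.size())
    {
        size_t lt = xml.find('<', pos);
        if(lt == std::string::npos)
            break;
        size_t gt = xml.find('>', lt);
        if(gt == std::string::npos)
            break;
        // Tag name plus attributes, without the angle brackets.
        std::string tag = xml.substr(lt + 1, gt - lt - 1);
        pos = gt + 1;
        if(tag == "/w:p")
        {
            text += '\n'; // paragraph boundary
        }
        else if(tag == "w:t" || tag.compare(0, 4, "w:t ") == 0)
        {
            // Copy everything up to the matching close tag.
            size_t end = xml.find("</w:t>", pos);
            if(end == std::string::npos)
                break;
            text += xml.substr(pos, end - pos);
            pos = end + 6; // skip past "</w:t>"
        }
    }
    return text;
}
```

The match on `"w:t "` with a trailing space is deliberate, so that tags such as `<w:tab/>` or `<w:tbl>` are not mistaken for text runs.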

pjrinaldi commented 1 year ago

doc, xls, and ppt are CFB (Compound File Binary) files, so I should be able to use my code in those viewers to pull out the text.
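
As a small sketch of the CFB point, a viewer could confirm that a file really is a compound file before trying to parse it, by checking the 8-byte signature at offset 0. The function names `HasCfbSignature` and `IsCfbFile` are hypothetical and not from the project.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// The 8-byte header signature that starts every Compound File Binary file.
static const unsigned char kCfbSignature[8] =
    { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

// Return true if the buffer begins with the CFB signature.
bool HasCfbSignature(const unsigned char* data, size_t size)
{
    return size >= sizeof(kCfbSignature)
        && std::memcmp(data, kCfbSignature, sizeof(kCfbSignature)) == 0;
}

// Read the first 8 bytes of a file and check them against the signature.
bool IsCfbFile(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    if(!file)
        return false;
    unsigned char header[8] = {0};
    file.read(reinterpret_cast<char*>(header), sizeof(header));
    return HasCfbSignature(header, static_cast<size_t>(file.gcount()));
}
```

Checking the signature rather than the extension also catches renamed files, which matters in a forensic tool. A docx, xlsx, or pptx would fail this check because those are ZIP containers.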