decalage2 / olefile

olefile is a Python package to parse, read and write Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office 97-2003 documents, vbaProject.bin in MS Office 2007+ files, Image Composer and FlashPix files, Outlook messages, StickyNotes, several Microscopy file formats, McAfee antivirus quarantine files, etc.
http://www.decalage.info/olefile

Make "olefile" work with Jython. #24

Open decalage2 opened 9 years ago

decalage2 commented 9 years ago

Originally reported by: Niko Ehrenfeuchter (Bitbucket: ehrenfeu, GitHub: ehrenfeu)


This issue is mostly meant as a notification...

I have a scenario where I'd like to use olefile with Jython (the Java implementation of Python), which is unfortunately not 100% compatible with the CPython reference implementation, especially when it comes to data types and number interpretations.

After a lot of debugging I managed to create a prototype that works with Jython 2.7-beta1 on Java 6 / 64-bit (unfortunately I can't use the latest Jython 2.7-beta3 as I am limited to Java 6, for which support was dropped in 2.7-beta2).

It will require a bit more polishing until I can file a pull request. Should be doable in the next few days.

@decalage : on top of the Jython stuff, I'd have a couple of suggestions to make the code follow PEP-8 a little better. Are you interested in those?

Cheers ~Niko


decalage2 commented 9 years ago

Original comment by Niko Ehrenfeuchter (Bitbucket: ehrenfeu, GitHub: ehrenfeu):


Thanks for merging. Feel free to remind me if I don't manage to provide minimal instructions for Jython-testing within the next couple of weeks!

decalage2 commented 9 years ago

Original comment by Philippe Lagadec (Bitbucket: decalage, GitHub: decalage2):


OK, I merged pull request #8 as the changes were small.

decalage2 commented 9 years ago

Original comment by Niko Ehrenfeuchter (Bitbucket: ehrenfeu, GitHub: ehrenfeu):


Hi Philippe,

I'll do so, but it might take a while - I'm super busy at the moment. Regarding CPython, I've done some preliminary (manual) tests, since it is my main platform anyway, and it worked as before. As the changes are minimal and only take effect while initially parsing the file (at least as far as I can tell), I wouldn't expect any noticeable impact on speed either - but some more serious testing probably wouldn't hurt.

PEP-8 stuff will (of course) go into a separate PR, so you can review it and decide whether you like it or not. However, this will probably take even longer than the Jython stuff ;-)

Cheers ~Niko

decalage2 commented 9 years ago

Original comment by Philippe Lagadec (Bitbucket: decalage, GitHub: decalage2):


Hi Niko, thanks a lot. Yes, some instructions on how to test it with Jython would help. Then I'll merge the pull request. Looking at the code, it should not have an impact on CPython compatibility.

For the PEP-8 suggestions, please send me a separate message by e-mail or bitbucket.

decalage2 commented 9 years ago

Original comment by Niko Ehrenfeuchter (Bitbucket: ehrenfeu, GitHub: ehrenfeu):


I should probably provide some instructions about how to test olefile with Jython. Please feel free to contact me if I forget about this...

decalage2 commented 9 years ago

Original comment by Niko Ehrenfeuchter (Bitbucket: ehrenfeu, GitHub: ehrenfeu):


I've filed pull request #8 with my changes to make olefile usable with Jython.

decalage2 commented 7 years ago

Hi @ehrenfeu, it's been a while, but maybe you can still provide the instructions on how to run olefile with Jython?

Otherwise I will simply close this issue.

decalage2 commented 7 years ago

It seems possible to run Travis CI with Jython, that might be a solution to test this: https://www.topbug.net/blog/2012/05/27/use-travis-ci-with-jython/

ehrenfeu commented 7 years ago

Unfortunately I still haven't found the time to dig through this again. It's still on my list though, I haven't forgotten about it!

ehrenfeu commented 6 years ago

Just FYI, we now (finally!) have some demand to dedicate time again to our project that involves olefile, so I will hopefully be able to come up with some Jython test instructions soon!

ehrenfeu commented 6 years ago

Here are some very basic instructions on how to (manually) test olefile through Jython by parsing some Olympus OIB files. I'm writing them down here so I don't forget and so you get an idea of how to do it. If time permits, I might write some unit tests from them (or provide scripts to facilitate testing).

Requirements

Setup

Assuming we're in a shell in a freshly cloned olefile repository, we can run these commands to prepare and start a Jython session:

cd tests/images
wget https://github.com/decalage2/olefile/files/2270413/test-olympus-oib-file.oib.zip
unzip test-olympus-oib-file.oib.zip

cd ../../olefile
java -jar /path/to/your/jython-standalone-2.7.1.jar

This will get you an interactive Jython shell, ready to import olefile and start investigating:

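import olefile
import codecs

# Open the OIB file (an OLE2 container); the path is relative to the olefile/ directory
ole = olefile.OleFileIO('../tests/images/test-olympus-oib-file.oib')

# Read a top-level stream and decode its UTF-16 content
stream = ole.openstream(['OibInfo.txt'])
conv = codecs.decode(stream.read(), 'utf16')
print(conv)

# Read a stream nested inside a storage (the path is given as a list of names)
stream = ole.openstream(['Storage00001', 'Stream00060'])
conv = codecs.decode(stream.read(), 'utf16')
print(conv)

# Release the file handle when done
ole.close()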

In case of success this prints the details from the selected sections of the OLE file. Will update this post with the outputs later...
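
The interactive steps above could also be wrapped in a small script for repeated testing. The following is only a sketch and not part of olefile: the helper names `decode_stream_bytes` and `read_text_stream` are hypothetical, and it assumes the streams contain UTF-16 text, as in the session above. It relies only on `OleFileIO`, `openstream`, `read` and `close`.

```python
import codecs


def decode_stream_bytes(raw):
    """Decode raw stream bytes as UTF-16 text (assumption: the stream is UTF-16)."""
    return codecs.decode(raw, 'utf16')


def read_text_stream(ole_path, stream_path):
    """Return the content of one stream of an OLE file, decoded as UTF-16.

    ole_path    -- path to the OLE file, e.g. an Olympus OIB file
    stream_path -- list of storage/stream names, e.g. ['OibInfo.txt']
    """
    # Imported lazily so the decoding helper can be used without olefile installed.
    import olefile

    ole = olefile.OleFileIO(ole_path)
    try:
        return decode_stream_bytes(ole.openstream(stream_path).read())
    finally:
        # Always release the file handle, even if the stream cannot be read.
        ole.close()


# Example call, run from the olefile/ directory as in the setup above:
# print(read_text_stream('../tests/images/test-olympus-oib-file.oib', ['OibInfo.txt']))
```

Keeping the decoding step separate from the file access means it can be checked on plain byte strings, which may help when comparing Jython and CPython behaviour.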