documentcloud / docsplit

Break Apart Documents into Images, Text, Pages and PDFs
http://documentcloud.github.com/docsplit/

Pseudo password protected xlsx files can't be converted #78

Open alxndrmlr opened 11 years ago

alxndrmlr commented 11 years ago

I've built a tool that accepts documents via upload and converts them to images for viewing in the browser.

I've come across a particular xlsx file that was not able to be opened and converted.

Exception in thread "main" org.artofsolving.jodconverter.office.OfficeException: could not load document: 728.xlsx
    at org.artofsolving.jodconverter.AbstractConversionTask.loadDocument(AbstractConversionTask.java:92)
    at org.artofsolving.jodconverter.AbstractConversionTask.execute(AbstractConversionTask.java:59)
    at org.artofsolving.jodconverter.office.PooledOfficeManager$2.run(PooledOfficeManager.java:80)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:680)
Caused by: com.sun.star.lang.IllegalArgumentException: URL seems to be an unsupported one.
    at com.sun.star.lib.uno.environments.remote.Job.remoteUnoRequestRaisedException(Job.java:177)
    at com.sun.star.lib.uno.environments.remote.Job.execute(Job.java:143)
    at com.sun.star.lib.uno.environments.remote.JobQueue.enter(JobQueue.java:335)
    at com.sun.star.lib.uno.environments.remote.JobQueue.enter(JobQueue.java:304)
    at com.sun.star.lib.uno.environments.remote.JavaThreadPool.enter(JavaThreadPool.java:91)
    at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge.sendRequest(java_remote_bridge.java:639)
    at com.sun.star.lib.uno.bridges.java_remote.ProxyFactory$Handler.request(ProxyFactory.java:151)
    at com.sun.star.lib.uno.bridges.java_remote.ProxyFactory$Handler.invoke(ProxyFactory.java:133)
    at com.sun.proxy.$Proxy4.loadComponentFromURL(Unknown Source)
    at org.artofsolving.jodconverter.AbstractConversionTask.loadDocument(AbstractConversionTask.java:90)
    ... 8 more

I understand that this tool wouldn't be able to open the document if it were truly password protected, but in this case it wasn't.

When I preview it in Quick Look, it appears to be password protected.

screen shot 2013-05-07 at 2 45 45 pm

However, I can open the file without being prompted for a password. Lastly, when I try to close the file, it prompts me to save it even though I made no changes. If I save it, it no longer appears as password protected to Quick Look and converts properly with docsplit.

While I realize this is not necessarily an issue with docsplit, I feel as though this file somehow got into a corrupt state. I'm not sure whether there are utilities that can detect and remedy that programmatically, but I figured I'd add the issue to the backlog here.
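On the "detect it programmatically" part, here is a minimal sketch of one possible check. It rests on an assumption that has not been verified against 728.xlsx: a normal xlsx is a ZIP archive, whereas an encrypted Office document is wrapped in an OLE compound file, so the first bytes of the file differ. The names `classify_header` and `classify_file` are made up for illustration and are not part of docsplit.

```python
# Signatures found at the very start of the file.
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header (plain xlsx)
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file (e.g. encrypted container)


def classify_header(head):
    """Classify a file by its leading bytes: 'zip', 'ole', or 'unknown'."""
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE_MAGIC):
        return "ole"
    return "unknown"


def classify_file(path):
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as fh:
        return classify_header(fh.read(8))
```

If `classify_file("728.xlsx")` came back as `"ole"`, that would at least be consistent with the file being wrapped in an encryption container (or being a legacy xls with the wrong extension), and an upload tool could flag it before handing it to docsplit. It would not remedy anything by itself; the file would still need to be re-saved.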

knowtheory commented 11 years ago

Hey @alxndrmlr, Docsplit uses LibreOffice to open xls files. I'd be curious whether you could fire that up and see how it treats the document.

alxndrmlr commented 11 years ago

Looks like it doesn't know what to do with it.

screen shot 2013-05-07 at 3 06 37 pm

When I do the "hack" of opening it in MS Excel for Mac and saving it (as prompted and mentioned above), I can then open it in LibreOffice with no problem.