tsgrp / OpenContent

TSG's Web Services for ECM Repositories

Automating creation of TSG renditions of Office docs #19

Open cmlewis opened 10 years ago

cmlewis commented 10 years ago

Alfresco 4.2+ doesn't allow Office documents to be transformed into PDFs by default (seriously?). Properties must be added to alfresco-global.properties to enable this, or AlfrescoEmbUtil.transformNativeContentAsTSGRendition will never succeed in creating the faux rendition for tsg:rendition. I tried adding a new alfresco-global.properties file with the necessary props to alfresco\tomcat\webapps\alfresco\WEB-INF\classes\alfresco\module\com.tsgrp.opencontent, but Alfresco does not pick them up. According to @parzgnat, an alfresco-global.properties in the module should be picked up, but I could not get it to work.

For now, the props must go in alfresco/tomcat/shared/classes/alfresco-global.properties. We should figure out how to get the file in the module to work so we can commit it to trunk.

The properties are as follows:

content.transformer.complex.JodConverter.PdfBox.extensions.xlsm.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.pptm.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.xls.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.sldm.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.xltx.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.docx.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.potx.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.xlsx.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.pptx.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.xlam.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.ppt.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.docm.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.xltm.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.dotx.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.sldx.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.ppsm.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.ppam.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.dotm.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.doc.txt.supported=true
content.transformer.complex.JodConverter.PdfBox.extensions.ppsx.txt.supported=true

(Note these properties and other transformation props can be found in alfresco\tomcat\webapps\alfresco\WEB-INF\classes\alfresco\subsystems\Transformers\default\transformers.properties)
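Since every property in the list follows the same pattern, the block can be generated instead of typed by hand. This is only a rough sketch: the extension list is copied from the properties above, and the helper name `jod_pdfbox_props` is made up for illustration and is not part of OpenContent or Alfresco.

```python
# Extensions copied from the property list above.
OFFICE_EXTENSIONS = [
    "xlsm", "pptm", "xls", "sldm", "xltx", "docx", "potx", "xlsx", "pptx", "xlam",
    "ppt", "docm", "xltm", "dotx", "sldx", "ppsm", "ppam", "dotm", "doc", "ppsx",
]

# Key pattern shared by every line in the list above.
PROPERTY_TEMPLATE = (
    "content.transformer.complex.JodConverter.PdfBox.extensions.{ext}.txt.supported=true"
)


def jod_pdfbox_props(extensions=OFFICE_EXTENSIONS):
    """Return one property line per extension (hypothetical helper)."""
    return [PROPERTY_TEMPLATE.format(ext=ext) for ext in extensions]


if __name__ == "__main__":
    # Print the block so it can be pasted into
    # alfresco/tomcat/shared/classes/alfresco-global.properties.
    print("\n".join(jod_pdfbox_props()))
```

Running the script prints the twenty lines in the same order as above, so adding another extension later is a one-word change to the list.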

joehof commented 10 years ago

So I was noticing that docx was not renditioning and did some digging. There are some other properties we should add in addition to the above to ensure Microsoft docs get renditioned:

# path of the OpenOffice.org or LibreOffice installation
ooo.exe=C:/alfresco/alfresco-4.2.1/libreoffice/App/libreoffice/program/soffice.exe
ooo.enabled=true

This will act as kind of a "catch-all" so both services will have a crack at getting Microsoft docs renditioned.
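A quick way to sanity-check the ooo.exe and ooo.enabled settings on a given server might look like the sketch below. It assumes a plain key=value properties file with no line continuations or escapes, which is simpler than what java.util.Properties accepts. Both function names are hypothetical, and the check only confirms that the path exists, not that Alfresco can actually launch soffice.

```python
import os


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments.

    Simplified on purpose: no line continuations or escape handling.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def ooo_problems(props, path_exists=os.path.exists):
    """Return a list of problems found with the ooo.* settings (hypothetical check)."""
    problems = []
    if props.get("ooo.enabled", "").lower() != "true":
        problems.append("ooo.enabled is not true")
    exe = props.get("ooo.exe")
    if not exe:
        problems.append("ooo.exe is not set")
    elif not path_exists(exe):
        problems.append("ooo.exe does not exist: " + exe)
    return problems


if __name__ == "__main__":
    # Path is the one mentioned earlier in this thread; adjust per install.
    path = "alfresco/tomcat/shared/classes/alfresco-global.properties"
    with open(path, encoding="utf-8") as handle:
        for problem in ooo_problems(parse_properties(handle.read())):
            print(problem)
```

An empty output means both settings are present and the soffice path resolves on that machine.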

gsteimer commented 10 years ago

I have the latest renditioning code on release2 and tried uploading a docx. The rendition worked fine, but it lost the table of contents on the first page for some reason. Here's the doc:

http://release2.tsgrp.com/hpi/Stage/controlleddocs/workspace://SpacesStore/12e15dde-ffd3-4b3b-90e3-83cbd0a36f5c%7Cworkspace://SpacesStore/1828e95b-057d-49b6-a974-f03f7a3c33a4

Let me know if something gets fixed with docx documents that I should merge to release2.