4teamwork / ftw.tika

This product integrates Apache Tika for full text indexing with Plone.

WARNING ftw.tika No path to Tika JAR file specified. #16

Closed wunderlins closed 10 years ago

wunderlins commented 10 years ago

Hi

I am having problems using ftw.tika with Plone 4.3. I am using the unified installer on Debian and OS X (10.9), and I always get the error that the path to the Tika JAR is not found.

my config:

buildout.cfg

[buildout]
package-name = ftw.tika
extends =
#    tika-plone-4.3.x.cfg
    buildout-dist.cfg
    tika.cfg

[client1]
#recipe = plone.recipe.zope2instance
zcml-additional += ${tika:zcml}
eggs += ftw.tika

tika.cfg (default from the git repo)

[buildout]
parts +=
    tika-download
    tika-server

[tika]
server-port = 8077
zcml =
    <configure xmlns:tika="http://namespaces.plone.org/tika">
        <tika:config path="${tika-download:destination}/${tika-download:filename}"
                     port="${tika:server-port}" />
    </configure>

[tika-download]
recipe = hexagonit.recipe.download
url = http://mirror.switch.ch/mirror/apache/dist/tika/tika-app-1.5.jar
md5sum = 2124a77289efbb30e7228c0f7da63373
download-only = true
filename = tika.jar

[tika-server]
recipe = collective.recipe.scriptgen
cmd = java
arguments = -jar ${tika-download:destination}/${tika-download:filename} --server --port ${tika:server-port} --text

and buildout-dist.cfg is the default from plone 4.3

I guess I am doing something wrong. Any help would be appreciated.

Best -S

PS: I am running a separate process for Tika, so I do not understand the error message that appears after adding a document. The tika-server starts fine.

lukasgraf commented 10 years ago

ftw.tika first tries to convert documents using the tika-server (because it's much faster). If it can't connect to it, it uses the local tika.jar as a fallback, and for that it needs the path to that JAR file - if that path isn't defined, the fallback won't work.

So if your tika-server is running and ftw.tika can connect to it, this warning shouldn't be a problem. Can you tell if it can convert documents using the server process? You should see a message similar to Converting document XY with tika server in your instance log.
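The order described above (server first, local JAR as fallback) can be sketched roughly as follows. This is not ftw.tika's actual code, only an illustration: the function names `tika_server_reachable` and `choose_conversion_strategy` are made up, and the default host `localhost` is an assumption.

```python
import os
import socket


def tika_server_reachable(port, host="localhost", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


def choose_conversion_strategy(port, jar_path=None, host="localhost"):
    """Prefer the running server; fall back to the JAR only if one is configured."""
    if tika_server_reachable(port, host):
        return "server"
    if jar_path and os.path.isfile(jar_path):
        return "jar"
    # Neither option is available: this is the situation the warning is about.
    raise RuntimeError(
        "Tika server unreachable and no path to a Tika JAR file configured")
```

Calling `tika_server_reachable(8077)` from the machine running the instance is a quick way to see whether the server process from tika.cfg is actually listening.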

lukasgraf commented 10 years ago

It looks like your ZCML isn't getting loaded properly. Could you please paste the contents of your parts/client1/etc/package-includes/999-additional-overrides.zcml?

wunderlins commented 10 years ago

Hallo Lukas

It looks like there is no parts/client1/etc/package-includes/999-additional-overrides.zcml in my parts directory.

wus@shell1:~/Plone/test1$ ls -la parts/client1/etc/
total 16
drwxr-xr-x 2 wus wus 4096 Aug  5 22:18 .
drwxr-xr-x 5 wus wus 4096 Aug  5 22:18 ..
-rw-r--r-- 1 wus wus  728 Aug  5 22:18 site.zcml
-rw-r--r-- 1 wus wus 1894 Aug  5 22:18 zope.conf

lukasgraf commented 10 years ago

Then your zcml-additional line doesn't work.
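To confirm whether buildout generated the overrides file and what ended up in it, a small check along these lines may help. It is only a sketch: `read_tika_config` is a hypothetical helper, and it assumes the file has the same shape as the `zcml` option in tika.cfg (a `tika:config` element with `path` and `port` attributes).

```python
import os
import xml.etree.ElementTree as ET

# Namespace taken from the zcml option in tika.cfg.
TIKA_NS = "http://namespaces.plone.org/tika"


def read_tika_config(zcml_path):
    """Return the tika:config attributes from a ZCML file, or None if absent."""
    if not os.path.isfile(zcml_path):
        return None  # buildout did not generate the file
    root = ET.parse(zcml_path).getroot()
    # Look for the first <tika:config> element anywhere in the file.
    directive = next(root.iter("{%s}config" % TIKA_NS), None)
    if directive is None:
        return None
    return {"path": directive.get("path"), "port": directive.get("port")}
```

Run from the buildout directory, `read_tika_config("parts/client1/etc/package-includes/999-additional-overrides.zcml")` returning `None` would mean the `zcml-additional` option never reached the instance part.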

wunderlins commented 10 years ago

Hi, thanks!

wus@shell1:~/Plone/test1$ ./bin/zeoserver fg &
[3] 13567
wus@shell1:~/Plone/test1$ ./bin/client1 fg &
[4] 13581
Traceback (most recent call last):
  File "/home/wus/Plone/test1/parts/client1/bin/interpreter", line 123, in <module>
    exec(compile(__file__f.read(), __file__, "exec"))
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/Startup/run.py", line 76, in <module>
    run()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/Startup/run.py", line 22, in run
    starter.prepare()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/Startup/__init__.py", line 86, in prepare
    self.startZope()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/Startup/__init__.py", line 262, in startZope
    Zope2.startup()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/__init__.py", line 47, in startup
    _startup()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/App/startup.py", line 118, in startup
    load_zcml()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/App/startup.py", line 52, in load_zcml
    load_site()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/App/zcml.py", line 46, in load_site
    _context = xmlconfig.file(site_zcml)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 649, in file
    include(context, name, package)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 548, in include
    processxmlfile(f, context)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 380, in processxmlfile
    parser.parse(src)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 349, in end_element_ns
    self._cont_handler.endElementNS(pair, None)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 359, in endElementNS
    self.context.end()
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 558, in end
    self.stack.pop().finish()
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 706, in finish
    actions = self.handler(context, **args)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 604, in includeOverrides
    include(_context, file, package, files)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 548, in include
    processxmlfile(f, context)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 380, in processxmlfile
    parser.parse(src)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 338, in start_element_ns
    AttributesNSImpl(newattrs, qnames))
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 232, in startElementNS
    self.context.begin(name, data, info)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 555, in begin
    self.stack.append(self.stack[-1].contained(__name, __data, __info))
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 856, in contained
    return RootStackItem.contained(self, name, data, info)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 724, in contained
    factory = self.context.factory(self.context, name)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 501, in factory
    raise ConfigurationError("Unknown directive", ns, n)
zope.configuration.xmlconfig.ZopeXMLConfigurationError: File "/home/wus/Plone/test1/parts/client1/etc/site.zcml", line 19.2-19.64
    ZopeXMLConfigurationError: File "/home/wus/Plone/test1/parts/client1/etc/package-includes/999-additional-overrides.zcml", line 2.4
    ConfigurationError: ('Unknown directive', u'http://namespaces.plone.org/tika', u'config')
wus@shell1:~/Plone/test1$ cat parts/client1/etc/package-includes/999-additional-overrides.zcml 
<configure xmlns:tika="http://namespaces.plone.org/tika">
    <tika:config path="/home/wus/Plone/test1/parts/tika-download/tika.jar"
                 port="8077" />
</configure>
wus@shell1:~/Plone/test1$ ls -la /home/wus/Plone/test1/parts/tika-download/tika.jar
-rw------- 1 wus wus 28628386 Aug  5 22:06 /home/wus/Plone/test1/parts/tika-download/tika.jar

lukasgraf commented 10 years ago

Ok, now at least the 999-additional-overrides.zcml has been generated. The problem now seems to be that ftw.tika's meta.zcml hasn't been loaded. Usually this is done automatically by z3c.autoinclude, but I don't really know how the unified installer's buildout is set up.

You should be able to load the ZCML explicitly by adding

zcml +=
    ftw.tika
    ftw.tika-meta

to your [client1] section (and then re-run buildout).

(The proper way to do this would be to create a policy product and in that policy package's setup.py declare a dependency on ftw.tika - this would load all required ZCML automatically for you)

wunderlins commented 10 years ago

Hi

Thanks for your quick help. I am very new to Zope/Plone and have been hitting a wall for days now.

Adding the zcml line helped, but now I run into another problem. I guess this is a dependency issue. Do you recommend a specific configuration? My problem is that there is a lot of outdated information available online.

Here is a new stack trace of [client1] in fg mode:

Traceback (most recent call last):
  File "/home/wus/Plone/test1/parts/client1/bin/interpreter", line 123, in <module>
    exec(compile(__file__f.read(), __file__, "exec"))
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/Startup/run.py", line 76, in <module>
    run()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/Startup/run.py", line 22, in run
    starter.prepare()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/Startup/__init__.py", line 86, in prepare
    self.startZope()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/Startup/__init__.py", line 262, in startZope
    Zope2.startup()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/__init__.py", line 47, in startup
    _startup()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/App/startup.py", line 118, in startup
    load_zcml()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/App/startup.py", line 52, in load_zcml
    load_site()
  File "/home/wus/Plone/buildout-cache/eggs/Zope2-2.13.22-py2.7.egg/Zope2/App/zcml.py", line 46, in load_site
    _context = xmlconfig.file(site_zcml)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 649, in file
    include(context, name, package)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 548, in include
    processxmlfile(f, context)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 380, in processxmlfile
    parser.parse(src)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 349, in end_element_ns
    self._cont_handler.endElementNS(pair, None)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 359, in endElementNS
    self.context.end()
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 558, in end
    self.stack.pop().finish()
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 706, in finish
    actions = self.handler(context, **args)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 548, in include
    processxmlfile(f, context)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 380, in processxmlfile
    parser.parse(src)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 349, in end_element_ns
    self._cont_handler.endElementNS(pair, None)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 359, in endElementNS
    self.context.end()
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 558, in end
    self.stack.pop().finish()
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 706, in finish
    actions = self.handler(context, **args)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 548, in include
    processxmlfile(f, context)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 380, in processxmlfile
    parser.parse(src)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 349, in end_element_ns
    self._cont_handler.endElementNS(pair, None)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 359, in endElementNS
    self.context.end()
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 558, in end
    self.stack.pop().finish()
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 706, in finish
    actions = self.handler(context, **args)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 548, in include
    processxmlfile(f, context)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 380, in processxmlfile
    parser.parse(src)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.7/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.7/xml/sax/expatreader.py", line 338, in start_element_ns
    AttributesNSImpl(newattrs, qnames))
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/xmlconfig.py", line 232, in startElementNS
    self.context.begin(name, data, info)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 555, in begin
    self.stack.append(self.stack[-1].contained(__name, __data, __info))
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 856, in contained
    return RootStackItem.contained(self, name, data, info)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 727, in contained
    adapter = factory(self.context, data, info)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 1224, in factory
    return ComplexStackItem(self, context, data, info)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 970, in __init__
    args = toargs(newcontext, meta.schema, data)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 1397, in toargs
    args[str(name)] = field.fromUnicode(s)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/fields.py", line 137, in fromUnicode
    value = self.context.resolve(name)
  File "/home/wus/Plone/buildout-cache/eggs/zope.configuration-3.7.4-py2.7.egg/zope/configuration/config.py", line 179, in resolve
    mod = __import__(mname, *_import_chickens)
  File "/home/wus/Plone/buildout-cache/eggs/ftw.tika-1.1.1-py2.7.egg/ftw/tika/setuphandlers.py", line 2, in <module>
    from ftw.tika.transforms.tika_to_plain_text import Tika2TextTransform
  File "/home/wus/Plone/buildout-cache/eggs/ftw.tika-1.1.1-py2.7.egg/ftw/tika/transforms/tika_to_plain_text.py", line 4, in <module>
    from ftw.tika.converter import TikaConverter
  File "/home/wus/Plone/buildout-cache/eggs/ftw.tika-1.1.1-py2.7.egg/ftw/tika/converter.py", line 7, in <module>
    from plone.memoize import instance
zope.configuration.xmlconfig.ZopeXMLConfigurationError: File "/home/wus/Plone/test1/parts/client1/etc/site.zcml", line 15.2-15.55
    ZopeXMLConfigurationError: File "/home/wus/Plone/test1/parts/client1/etc/package-includes/002-ftw.tika-configure.zcml", line 1.0-1.52
    ZopeXMLConfigurationError: File "/home/wus/Plone/buildout-cache/eggs/ftw.tika-1.1.1-py2.7.egg/ftw/tika/configure.zcml", line 7.4-7.36
    ZopeXMLConfigurationError: File "/home/wus/Plone/buildout-cache/eggs/ftw.tika-1.1.1-py2.7.egg/ftw/tika/profiles.zcml", line 25.2
    ImportError: No module named memoize

lukasgraf commented 10 years ago

You're right, the current state of documentation for Plone is pretty awful; it's definitely not easy to get started. You are right, though, that recipe = plone.recipe.zope2instance should get inherited (assuming one of the buildouts that your buildout.cfg extends also defines a [client1] section). The fact that it doesn't seem to get inherited does make me a bit sceptical about whether your buildout in general is set up correctly.

If you indirectly extend from multiple .cfgs things can easily get a bit messy, and the behavior of operations like += can get hard to predict (or is not even well defined in some cases).

Your current stacktrace does strike me as odd - plone.memoize is a core package and is used all over the place in Plone core.

I can think of one reason why this could happen though: the point of buildout is to isolate your installation and the packages in it. If, however, you somehow installed some Plone packages into your global site-packages, those could get picked up first. And because plone is a namespace package, those packages could shadow a plone.memoize that is actually there.
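One way to look for this kind of shadowing is to ask Python where it actually loads a package from. The snippet below is a minimal sketch; `describe_package` is a made-up helper name, not part of Plone or buildout.

```python
import importlib


def describe_package(name):
    """Report whether a package imports and which directories it is loaded from."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return {"name": name, "importable": False,
                "error": str(exc), "paths": []}
    # Namespace packages such as 'plone' can list several directories here.
    paths = list(getattr(module, "__path__", []))
    return {"name": name, "importable": True, "error": None, "paths": paths}
```

Comparing `describe_package("plone")` and `describe_package("plone.memoize")` under the system Python and under the buildout's own interpreter script (if your buildout defines one) should show whether any `plone` directory comes from the global site-packages rather than from the buildout's eggs.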

So I would try the following:

wunderlins commented 10 years ago

Hi Lukas

Ok, I got it running now. First of all, this is not a production site, I am only getting started with Plone. I installed a new vanilla Debian VM and started from scratch. I guess my Python install was bogged.

Then I installed only Plone 4.3.3 (unified installer) and copied the relevant sections from your git repo into the buildout.cfg.

Plus I had to add

zcml +=
    ftw.tika
    ftw.tika-meta

Then I ran bin/buildout, started tika-server and Plone ... et voilà!

I will, within the next couple of days, start adding additional features one by one and see if there is a conflict.

Thanks for your quick and competent help. I really like the tika integration. This (and workflow based publishing) is one of the killer features that is required for our new medical documentation system.

Greetings from Basel -S

lukasgraf commented 10 years ago

That's great to hear, glad you got it working!