ScaleUnlimited / flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons
Apache License 2.0

Fix Tika configuration #79

Open kkrugler opened 7 years ago

kkrugler commented 7 years ago

Currently at startup this gets logged:

Nov 15, 2017 10:17:30 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded.
Please provide the jar on your classpath to parse sqlite files.
See tika-parsers/pom.xml for the correct version.

We should exclude the SQLite parser, since I don't think we need it :)
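
A minimal sketch of what that exclusion might look like in a `tika-config.xml`. It assumes the warning comes from `org.apache.tika.parser.jdbc.SQLite3Parser` (the class name used in tika-parsers 1.x) and that flink-crawler would load a custom Tika config, for example through the `TikaConfig` constructor or the `tika.config` system property. Neither assumption is confirmed in this issue.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical tika-config.xml: keep the default parsers, minus SQLite. -->
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.DefaultParser">
      <!-- Assumed class name of the parser that logs the sqlite-jdbc warning -->
      <parser-exclude class="org.apache.tika.parser.jdbc.SQLite3Parser"/>
    </parser>
  </parsers>
</properties>
```

Excluding the parser this way leaves every other default parser in place, so only SQLite files would stop being parsed.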

kkrugler commented 6 years ago

See https://tika.apache.org/1.17/configuring.html#Load_Error_Handling for how Tika can be configured to handle this kind of load problem.
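
As an alternative to excluding the parser, the warning itself might be silenced through the `service-loader` element described on that page. This is a hedged sketch that assumes the `initializableProblemHandler` attribute is available and behaves as documented for Tika 1.17. It has not been tried against flink-crawler.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical tika-config.xml: ignore problems reported when parsers initialize. -->
<properties>
  <!-- Assumed setting; "warn" appears to be what produces the log output above -->
  <service-loader initializableProblemHandler="ignore"/>
</properties>
```

Note that this would hide initialization problems from every parser, not only the SQLite one, so excluding the parser may be the narrower fix.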