DKPro BigData makes it easy to run UIMA-based natural language processing pipelines on a Hadoop cluster.
Large-scale NLP processing using UIMA and Hadoop
Store your corpora on a Hadoop file system and access them from local or distributed pipelines
Find patterns in your textual data using adaptable collocation extraction
Execute DKPro pipelines on a Hadoop cluster with minimal adaptation
Read data stored on HDFS using DKPro Collection Readers
Read and write serialized CASes on HDFS
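
To give a rough idea of what collocation extraction computes, here is a minimal, self-contained sketch that scores adjacent word pairs with pointwise mutual information (PMI). It does not use the DKPro BigData API: the class name `CollocationSketch` and the method `pmi` are hypothetical, and PMI is only one common association measure, chosen here for illustration. Which measures DKPro BigData itself supports is not stated above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of DKPro BigData.
public class CollocationSketch {

    // PMI of the adjacent pair (first, second):
    // log2( P(first second) / (P(first) * P(second)) ).
    // Returns negative infinity if the pair never occurs.
    static double pmi(String[] tokens, String first, String second) {
        Map<String, Integer> unigrams = new HashMap<>();
        Map<String, Integer> bigrams = new HashMap<>();
        for (int i = 0; i < tokens.length; i++) {
            unigrams.merge(tokens[i], 1, Integer::sum);
            if (i + 1 < tokens.length) {
                bigrams.merge(tokens[i] + " " + tokens[i + 1], 1, Integer::sum);
            }
        }
        int pairCount = bigrams.getOrDefault(first + " " + second, 0);
        if (pairCount == 0) {
            return Double.NEGATIVE_INFINITY;
        }
        // Relative frequencies over bigram and unigram positions.
        double pPair = (double) pairCount / (tokens.length - 1);
        double pFirst = (double) unigrams.get(first) / tokens.length;
        double pSecond = (double) unigrams.get(second) / tokens.length;
        return Math.log(pPair / (pFirst * pSecond)) / Math.log(2);
    }

    public static void main(String[] args) {
        String[] tokens =
            "new york is big and new york is busy and the city is new".split(" ");
        System.out.println("new york: " + pmi(tokens, "new", "york"));
        System.out.println("is new:   " + pmi(tokens, "is", "new"));
    }
}
```

On a cluster, the counting step would typically be distributed across the corpus while the scoring stays the same. The sketch keeps everything in memory to stay short.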
Hans-Peter Zorn
Johannes Simon
Martin Riedl
Richard Eckart de Castilho
Steffen Remus
DKPro BigData is licensed under the Apache Software License (ASL), Version 2.0.
This project is a joint effort of UKP Lab and the Language Technology Group, Technical University of Darmstadt.