This pull request adds support for complex workflows. A workflow is complex when, for example, one file is the input for two different extractors, or the database is connected to more than one extractor. The following picture illustrates this. The feature is implemented with multi-threading. Currently there are three kinds of tasks, each running in its own thread: PullTask, ExtractTask, and StoreTask. These tasks correspond to the logic classes. The task executor is designed so that the thread pool can be extended with further task types.
Complex workflows are a prerequisite for later features such as ensemble learning or filter nodes.
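To make the fan-out concrete, here is a minimal sketch of one pulled file feeding two extractors whose results all end up in the same store. It is not the code from this pull request: the class `WorkflowSketch`, the methods `pull`, `extract`, `store`, and `runWorkflow`, the pool size, and the in-memory list standing in for the database are all assumptions made for illustration. Only the roles of PullTask, ExtractTask, and StoreTask are taken from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WorkflowSketch {

    // Hypothetical stand-in for PullTask: fetches the content of one input file.
    static String pull(String fileName) {
        return "content of " + fileName;
    }

    // Hypothetical stand-in for ExtractTask: runs one named extractor on the text.
    static String extract(String extractor, String text) {
        return extractor + " triples for <" + text + ">";
    }

    // Hypothetical stand-in for StoreTask: the list plays the role of the database.
    static void store(List<String> database, String triples) {
        database.add(triples);
    }

    // Runs one pull, then one extract-and-store branch per extractor, on a shared pool.
    static List<String> runWorkflow(String fileName, List<String> extractors) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an assumption
        List<String> database = new CopyOnWriteArrayList<>();   // thread-safe for concurrent stores
        try {
            // The file is pulled once and the result is shared by all branches.
            CompletableFuture<String> pulled =
                    CompletableFuture.supplyAsync(() -> pull(fileName), pool);

            List<CompletableFuture<Void>> branches = new ArrayList<>();
            for (String extractor : extractors) {
                branches.add(pulled
                        .thenApplyAsync(text -> extract(extractor, text), pool)
                        .thenAcceptAsync(triples -> store(database, triples), pool));
            }

            // Wait until every branch has written its result.
            CompletableFuture.allOf(branches.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
        return database;
    }

    public static void main(String[] args) {
        // The order of the printed entries is not deterministic.
        for (String entry : runWorkflow("input.txt", Arrays.asList("FOX", "Cederic"))) {
            System.out.println(entry);
        }
    }
}
```

Because the branches are chained on the shared pull result instead of blocking on it, adding another extractor only adds another branch; no thread waits idle inside the pool.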
Usage:
The current version supports only FOX and the Cederic Extractor. Unfortunately, the output of Cederic is currently not usable.
FOX returns its results in Turtle format, which we then convert to N-Triples with the Apache Jena library. The parse function is located in SASK-commons. Cederic produces N-Triples directly, so no additional parsing is needed. The database expects N-Triples.