The Pimlico Processing Toolkit (PIpelined Modular LInguistic COrpus processing) is a toolkit for building pipelines made up of linguistic processing tasks to run on large datasets (corpora).
It provides wrappers around many existing, widely used Natural Language Processing (NLP) tools, and makes it easy to write potentially complex pipelines and apply them to large datasets.
Pimlico aims:
Full documentation, including a guide to getting started with Pimlico, is available at http://pimlico.readthedocs.io.
Pimlico is hosted on GitHub.