markgw / pimlico

The Pimlico Processing Toolkit
http://pimlico.readthedocs.org/
GNU Lesser General Public License v3.0
data-analysis dataset natural-language-processing nlp pipeline python

The Pimlico Processing Toolkit

The Pimlico Processing Toolkit (PIpelined Modular LInguistic COrpus processing) is a toolkit for building pipelines made up of linguistic processing tasks to run on large datasets (corpora).

It provides wrappers around many existing, widely used Natural Language Processing (NLP) tools, making it easy to write potentially complex pipelines and apply them to large datasets.

Pimlico aims:

Full documentation, including a guide to getting started with Pimlico, is available at http://pimlico.readthedocs.io.

Pimlico is hosted on GitHub.