dkpro / dkpro-c4corpus

DKPro C4CorpusTools is a collection of tools for processing the CommonCrawl corpus, including Creative Commons license detection, boilerplate removal, language detection, and near-duplicate removal.
https://dkpro.github.io/dkpro-c4corpus
Apache License 2.0

inconsistent package hierarchy and groupId #45

Open maxxkia opened 8 years ago

maxxkia commented 8 years ago

The project has been released with the groupId

org.dkpro.c4corpus

However, it still uses the old package hierarchy, i.e.

de.tudarmstadt.ukp.dkpro.c4corpus

This should be fixed, as it causes confusion when referencing classes inside the artifacts.
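
To illustrate the mismatch, here is a minimal sketch (not part of the project) that checks whether a Java package name falls under a Maven groupId. The class name `PackageGroupIdCheck`, the method `matchesGroupId`, and the package `org.dkpro.c4corpus.language` are hypothetical names used for illustration only. Aligning package names with the groupId is a common convention, not something Maven enforces.

```java
public class PackageGroupIdCheck {

    // Returns true when the package name equals the groupId or is nested under it.
    // Appending "." avoids false positives such as "org.dkpro.c4corpusx".
    public static boolean matchesGroupId(String packageName, String groupId) {
        return packageName.equals(groupId) || packageName.startsWith(groupId + ".");
    }

    public static void main(String[] args) {
        String groupId = "org.dkpro.c4corpus";

        // Package hierarchy currently used, as reported in this issue.
        String currentPackage = "de.tudarmstadt.ukp.dkpro.c4corpus";

        // Hypothetical package name that would follow the groupId.
        String alignedPackage = "org.dkpro.c4corpus.language";

        System.out.println(matchesGroupId(currentPackage, groupId)); // prints false
        System.out.println(matchesGroupId(alignedPackage, groupId)); // prints true
    }
}
```

With the current layout, a user declares a dependency under `org.dkpro.c4corpus` but has to import classes from `de.tudarmstadt.ukp.dkpro.c4corpus`, which is the source of the confusion.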

habernal commented 7 years ago

Also, the package of de.tudarmstadt.ukp.dkpro.c4corpus.hadoop.CharsetDetector in dkpro-c4corpus-language should be renamed (there is no Hadoop in there!).