Open jankounchained opened 11 months ago
Goal of language detection:
We have duplicate documents, as well as documents in languages other than English.
The non-English documents need to be removed. Duplicates should probably only be flagged, because they could still be relevant to the research question.
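A rough sketch of what this could look like, not a proposal for the final implementation. It assumes documents arrive as a plain list of strings. The `looks_english` function is a hypothetical placeholder (a crude stopword ratio) standing in for a real language detector, and `clean_corpus`, `normalize`, and the output field names are made up for illustration. Duplicates are detected here only by exact match after lowercasing and whitespace collapsing, so near-duplicates would not be caught.

```python
import hashlib
import re

# Hypothetical stand-in for a real language detector: a crude English
# stopword ratio. Swap in a proper detection library in practice.
ENGLISH_STOPWORDS = {"the", "and", "of", "to", "in", "is", "that", "for", "it", "with"}


def looks_english(text, threshold=0.15):
    """Return True if enough tokens are common English stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return False
    hits = sum(w in ENGLISH_STOPWORDS for w in words)
    return hits / len(words) >= threshold


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash alike."""
    return " ".join(text.lower().split())


def clean_corpus(docs, is_english=looks_english):
    """Drop non-English documents; keep duplicates but flag them."""
    first_seen = {}  # content hash -> index of first occurrence
    kept = []
    for idx, text in enumerate(docs):
        if not is_english(text):
            continue  # non-English documents are removed
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        first = first_seen.setdefault(digest, idx)
        is_dup = first != idx
        kept.append({
            "index": idx,
            "text": text,
            "is_duplicate": is_dup,
            "duplicate_of": first if is_dup else None,
        })
    return kept


docs = [
    "The cat is in the house and it is happy.",
    "Der Hund läuft schnell durch den Wald.",
    "the cat is  in the house and it is happy.",
]
result = clean_corpus(docs)
```

Passing the detector in as `is_english` keeps the filtering and flagging logic independent of whichever detection library ends up being used. Keeping `duplicate_of` alongside the flag makes it possible to decide later, per research question, whether the copies should be dropped.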