paracrawl / Domain_Adaptation

InDomain detection is a tool designed to extract in-domain data from large collections of data.
GNU General Public License v3.0

Integrate into bitextor #5

Closed: kpu closed this issue 5 years ago

kpu commented 5 years ago

To be clear, the deliverable is that it should be integrated into the pipeline and Omniscien is responsible for this deliverable. A dump of scripts for us to integrate is not enough.

dionwiggins commented 5 years ago

This is a separate project from bitextor. It should not be integrated. This set of tools works on the final output of bitextor and on other collections of parallel corpora.