Status: Closed (kpu closed this issue 4 years ago)
https://github.com/paracrawl/Domain_Adaptation/blob/432916d54f537342bcffb30f4968c2e19a5be98e/scripts/ScorePoolData.py#L153
You don't need the whole corpus in RAM. Stream it. This appears to be done so you can build XML or something, which is overkill for one extra column of data.
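For illustration, here is a minimal sketch of what streaming could look like. It is not the code in ScorePoolData.py. It assumes one sentence per line on input and tab-separated output with the score as the extra column. The names `stream_scored`, `score_file`, and `toy_score` are hypothetical, and `toy_score` is only a stand-in for whatever the real scorer computes.

```python
import io
from typing import Callable, Iterable, Iterator, TextIO


def toy_score(text: str) -> float:
    """Placeholder scorer (assumption): token count stands in for the real score."""
    return float(len(text.split()))


def stream_scored(lines: Iterable[str],
                  score: Callable[[str], float]) -> Iterator[str]:
    """Yield each input line with its score appended as a tab-separated column.

    Only one line is held in memory at a time.
    """
    for line in lines:
        text = line.rstrip("\n")
        yield f"{text}\t{score(text):.4f}\n"


def score_file(src: TextIO, dst: TextIO,
               score: Callable[[str], float] = toy_score) -> None:
    """Read src line by line and write scored lines to dst as they are produced."""
    for out_line in stream_scored(src, score):
        dst.write(out_line)
```

Calling `score_file(sys.stdin, sys.stdout)` would turn this into a plain filter, so memory use stays flat no matter how large the corpus is.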