CLARIN-PL / Inforex

Inforex is a web system for text corpora construction.

Wccl import memory fix2 #125

Closed seweryn626 closed 2 years ago

seweryn626 commented 2 years ago

This patch fixes excessive memory consumption in the corpus import script, which showed up when importing a package of about 18k documents. After this optimization, each loop iteration (one document) consumes about 150 bytes. This should allow importing packages of up to about 5 million documents at once.
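
As a rough sanity check of the figures above, the following sketch multiplies the reported per-document cost by the package size. It assumes memory grows linearly with the number of documents, which the description implies but does not state; the function name `estimated_growth_bytes` is hypothetical and is not part of the import script.

```python
# Back-of-the-envelope estimate using the figures from the description.
# Assumption: memory growth is linear in the number of imported documents.
BYTES_PER_DOCUMENT = 150  # reported per-iteration growth after the patch


def estimated_growth_bytes(document_count: int,
                           bytes_per_document: int = BYTES_PER_DOCUMENT) -> int:
    """Return the total memory growth expected for one import run."""
    return document_count * bytes_per_document


if __name__ == "__main__":
    # 18k is the package size that exposed the problem; 5 million is the stated target.
    for count in (18_000, 5_000_000):
        total = estimated_growth_bytes(count)
        print(f"{count:>9,} documents -> {total:,} bytes (~{total / 2**20:.1f} MiB)")
```

Under that assumption, 5 million documents would add about 750,000,000 bytes (roughly 715 MiB) over the run, so whether the import actually completes still depends on the memory limit configured for the script, which is not given here.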