Closed: Fethbita closed this issue 6 years ago.
Oh I understand. Instead of giving the whole data set, I can give indices and it won't have to allocate that much space.
import numpy as np
import pysparnn.cluster_index as ci

# Store integer indices as the records instead of the full documents
doc_index = np.arange(len(doctexts), dtype=int)
cp = ci.MultiClusterIndex(tfidfdtmatrix, doc_index)
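For completeness, a minimal sketch of querying with index-based records; the vectorizer name and the query text are just placeholders, and the search result contains the stored indices, which map back to doctexts:

# Minimal sketch, assuming `vectorizer` is the fitted TfidfVectorizer
# that produced tfidfdtmatrix (the name is illustrative)
query_vec = vectorizer.transform(["some query text"])
results = cp.search(query_vec, k=3, k_clusters=2, return_distance=False)
# The search returns the stored records, i.e. the integer indices;
# look up the original documents from them
for idx in results[0]:
    print(doctexts[int(idx)])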
That fixed it. Is there an easy way to write the search index to disk?
Pickle package should work!
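For instance, a minimal sketch using the standard-library pickle module (the filename is just an illustration):

import pickle

# Serialize the built MultiClusterIndex to disk
with open("pysparnn_index.pkl", "wb") as f:
    pickle.dump(cp, f)

# ...and load it back later
with open("pysparnn_index.pkl", "rb") as f:
    cp = pickle.load(f)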
Thanks.
When I try to build the search index just like in the example, I get a MemoryError. The data I use is wiki data, and I have ~50 GB of RAM that can be used.
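Roughly what I am doing (a simplified sketch; variable names are illustrative), following the README-style example that passes the raw documents in as the records:

from sklearn.feature_extraction.text import TfidfVectorizer
import pysparnn.cluster_index as ci

# doctexts is the list of wiki document strings
vectorizer = TfidfVectorizer()
tfidfdtmatrix = vectorizer.fit_transform(doctexts)

# Passing the full documents as the records means the index
# keeps every document string alongside the cluster structure
cp = ci.MultiClusterIndex(tfidfdtmatrix, doctexts)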