Hi, I am trying to index a very large corpus, so I am splitting it into mini-batches and passing each batch to
indexer.index(name="msmarco.nbits-2", collection=batch, overwrite='resume')
Even though I am specifying resume, the file size of the experiments directory does not change. Is there a way to index in mini-batches? My workstation hangs if I pass the whole corpus; my CPU and GPU each have 48 GB of memory.
Thank you.
EDIT: Sorry, it was a problem with my data. I was able to index the corpus. Thanks.
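For reference, here is a minimal sketch of the kind of batching I mean, assuming the corpus fits in a Python list of passage strings. `make_batches` is a hypothetical helper, not part of ColBERT, and the toy `corpus` only stands in for the real data:

```python
from typing import Iterator, List, Sequence


def make_batches(passages: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `passages`, each with at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(passages), batch_size):
        yield list(passages[start:start + batch_size])


# Toy stand-in for the real corpus (assumption: one string per passage).
corpus = [f"passage {i}" for i in range(10)]
batches = list(make_batches(corpus, batch_size=4))
```

Each element of `batches` is what gets passed as `collection` in the call above.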