Closed — limiao2 closed this issue 6 years ago
I have a big corpus of roughly 40 million sentences. Is there any way I can run this model in parallel? Would it be better to divide the corpus into chunks and then concatenate all the matrices at the end? Thanks!

Hi, if you have multiple GPUs, the solution you propose is indeed the best one: split your large corpus of 40M sentences into smaller chunks and run InferSent on each chunk individually. Sorting the data so that you minimize the amount of padding, and using large batches, will also help here. Best, Alexis
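A minimal sketch of the chunk-and-concatenate approach, with per-chunk length sorting to reduce padding. The function name `encode_corpus` and the `encode_fn` parameter are illustrative; `encode_fn` is assumed to wrap whatever encoder you use (e.g. InferSent's `model.encode`) and to return one embedding row per input sentence:

```python
import numpy as np

def encode_corpus(sentences, encode_fn, chunk_size=100000):
    """Encode a large corpus in chunks.

    Each chunk is sorted by token count before encoding, so batches
    contain sentences of similar length and padding is minimized;
    the original sentence order is restored afterwards.
    """
    embeddings = []
    for start in range(0, len(sentences), chunk_size):
        chunk = sentences[start:start + chunk_size]
        # Sort indices by sentence length (in tokens).
        order = sorted(range(len(chunk)), key=lambda i: len(chunk[i].split()))
        sorted_chunk = [chunk[i] for i in order]
        emb = encode_fn(sorted_chunk)
        # Undo the sort so row i corresponds to sentence i of the chunk.
        inverse = np.argsort(order)
        embeddings.append(emb[inverse])
    # Concatenate the per-chunk matrices into one (N, dim) array.
    return np.vstack(embeddings)
```

With multiple GPUs, each chunk can instead be dispatched to a separate process pinned to its own GPU, and the resulting matrices stacked in chunk order at the end.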