Closed: xszheng2020 closed this issue 2 years ago
Hi @xszheng2020 , Thank you for your interest in our work!
Can you please elaborate on what you are trying to do? Are you trying to write the datastore in a distributed way, or read it in a distributed way?
Uri
Hi, @urialon
Sorry for the ambiguity. I want to write the datastore in a distributed way when evaluating a language model on the training corpus.
Thanks!
Hi @xszheng2020, it's currently not implemented, but I think it should be possible: if each distributed process gets a distinct part of the training set, performs a forward pass on that chunk, and writes its own datastore, then eventually you only need to concatenate all the mini-datastores.
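A minimal sketch of that idea, assuming a knnlm-style memmap datastore (float16 keys, integer values) and hypothetical helpers `load_training_chunks()` and `forward_pass()` that return NumPy arrays; the key dimension and dtypes are assumptions, not the repo's exact settings:

```python
import numpy as np
import torch.distributed as dist

# Assumes dist.init_process_group() has already been called by the launcher.
rank = dist.get_rank()
world_size = dist.get_world_size()

chunks = load_training_chunks()              # hypothetical: list of training chunks
my_chunks = chunks[rank::world_size]         # each process gets a distinct shard

dim = 1024                                   # hidden size of the LM (assumption)
dstore_size = sum(len(c) for c in my_chunks) # number of tokens this rank will store

# Per-rank mini-datastore files; they are concatenated (or indexed) later.
keys = np.memmap(f"dstore_keys_rank{rank}.npy", dtype=np.float16,
                 mode="w+", shape=(dstore_size, dim))
vals = np.memmap(f"dstore_vals_rank{rank}.npy", dtype=np.int32,
                 mode="w+", shape=(dstore_size, 1))

offset = 0
for chunk in my_chunks:
    # hypothetical: hidden has shape (len, dim), next_tokens has shape (len,)
    hidden, next_tokens = forward_pass(chunk)
    n = hidden.shape[0]
    keys[offset:offset + n] = hidden.astype(np.float16)
    vals[offset:offset + n, 0] = next_tokens
    offset += n

keys.flush()
vals.flush()
```

Because every rank writes to its own files, there is no concurrent write to a shared datastore.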
Are you asking about it because you are dealing with a huge datastore?
Hi, @urialon Yes, I am dealing with a huge datastore. I think your idea of splitting the training set into distinct parts should work. I will give it a try. Thanks.
If the datastore is huge and there is not enough disk space, you might be able to avoid concatenating all the keys. When you build the FAISS index, you can iterate on all mini-datastores and insert their keys into the FAISS index, and finally delete all mini-datastore-keys, without ever needing to concatenate them. You will only need to concatenate the values, because they are used at test time, but they are much more lightweight than the keys.
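A sketch of that disk-saving step, under the same assumptions as above (per-rank memmap files named `dstore_keys_rank{r}.npy` / `dstore_vals_rank{r}.npy`, key dimension 1024, 8 ranks); a flat L2 index is used purely for illustration, but a trained FAISS index would be filled the same way via `index.add()`:

```python
import os
import faiss
import numpy as np

dim = 1024          # must match the key dimension used when writing (assumption)
world_size = 8      # number of mini-datastores (assumption)

index = faiss.IndexFlatL2(dim)

all_vals = []
for rank in range(world_size):
    key_path = f"dstore_keys_rank{rank}.npy"
    val_path = f"dstore_vals_rank{rank}.npy"
    size = os.path.getsize(key_path) // (2 * dim)   # float16 = 2 bytes per element
    keys = np.memmap(key_path, dtype=np.float16, mode="r", shape=(size, dim))
    vals = np.memmap(val_path, dtype=np.int32, mode="r", shape=(size, 1))

    # Insert this mini-datastore's keys in batches; FAISS expects float32.
    batch = 100_000
    for start in range(0, size, batch):
        index.add(np.asarray(keys[start:start + batch], dtype=np.float32))

    all_vals.append(np.array(vals))   # copy values into memory
    del keys
    os.remove(key_path)               # the key file is no longer needed

faiss.write_index(index, "dstore.index")
np.save("dstore_vals.npy", np.concatenate(all_vals, axis=0))
```

The keys are consumed one mini-datastore at a time and deleted, so only the index and the concatenated values remain on disk.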
Good luck! Let me know if you have any questions.
Hi, @urialon I am trying to evaluate a model with the DDP strategy, but I ran into an error because with DDP multiple processes try to write to the datastore concurrently.
Using only one GPU, everything works well, but it is quite slow.
Any idea? Thanks!