This is a major code update. The previous code is deprecated. Here are the main features:
Flexible code framework. The separate preprocessing stage of the previous code has been removed; tokenization is now performed on the fly during training and inference. JPQ and RepCONC no longer depend on a particular dense retrieval model architecture: each is a training instance that accepts a dense retrieval model as input, so dense models of different architectures can all be used with JPQ and RepCONC.
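To illustrate the model-agnostic design, here is a minimal sketch of wrapping an arbitrary dense encoder with learnable product-quantization codebooks, the compression scheme that JPQ and RepCONC build on. The class and argument names below are illustrative assumptions, not the repo's actual API:

```python
import torch
import torch.nn as nn

class PQEncoder(nn.Module):
    """Illustrative sketch (not the repo's API): wraps ANY dense encoder and
    compresses its D-dim output into M sub-vector codes drawn from learnable
    codebooks, the core idea behind JPQ/RepCONC-style quantization."""

    def __init__(self, dense_encoder, dim, M=4, K=256):
        super().__init__()
        assert dim % M == 0, "embedding dim must be divisible by M"
        self.encoder = dense_encoder        # any module mapping inputs -> (B, dim)
        self.M, self.sub = M, dim // M
        # M codebooks, each holding K centroids of size dim // M
        self.codebooks = nn.Parameter(torch.randn(M, K, dim // M))

    def forward(self, x):
        emb = self.encoder(x).view(-1, self.M, self.sub)          # (B, M, sub)
        # nearest centroid for every sub-vector
        dists = torch.cdist(emb.transpose(0, 1), self.codebooks)  # (M, B, K)
        codes = dists.argmin(-1)                                  # (M, B)
        # reconstruct from the selected centroids; gradients flow into
        # the selected codebook entries through this indexing
        quantized = torch.stack(
            [self.codebooks[m, codes[m]] for m in range(self.M)], dim=1
        )                                                         # (B, M, sub)
        return quantized.reshape(emb.size(0), -1), codes.t()      # (B, dim), (B, M)
```

Because the wrapper only assumes the encoder maps inputs to a `(batch, dim)` tensor, a linear probe, an MLP, or a transformer encoder can all be plugged in unchanged.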
RepCONC now supports distributed training.
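A multi-GPU run would typically be launched with `torchrun`; the script name and flags below are illustrative placeholders, not the repo's actual CLI:

```shell
# Hypothetical launch command; train_repconc.py and its flags are
# placeholders for the repo's actual entry point and arguments.
torchrun --nproc_per_node=4 train_repconc.py \
    --model_name_or_path your-dense-encoder \
    --per_device_batch_size 32
```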
RepCONC now supports large batch sizes via GradCache (gradient caching).
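GradCache decouples the contrastive batch size from GPU memory with a two-pass scheme: first encode all chunks without building a graph, compute the full-batch loss and the gradients with respect to the embeddings, then re-encode each chunk with a graph and inject the cached gradients. A minimal PyTorch sketch of the idea (function names are mine, not the repo's API):

```python
import torch
import torch.nn.functional as F

def in_batch_loss(embs):
    """In-batch-negatives loss: each row should be most similar to itself."""
    sim = embs @ embs.t()
    labels = torch.arange(embs.size(0))
    return F.cross_entropy(sim, labels)

def grad_cache_step(model, batch, chunk_size, loss_fn):
    """Sketch of gradient caching: full-batch loss, chunk-sized activation memory."""
    chunks = batch.split(chunk_size)
    # Pass 1: embeddings without an autograd graph
    with torch.no_grad():
        embs = torch.cat([model(c) for c in chunks])
    embs.requires_grad_(True)
    loss = loss_fn(embs)
    loss.backward()                      # gradients w.r.t. embeddings only
    cached = embs.grad.split(chunk_size) # the "gradient cache"
    # Pass 2: rebuild the graph one chunk at a time, inject cached gradients
    for c, g in zip(chunks, cached):
        torch.autograd.backward(model(c), grad_tensors=g)
    return loss.detach()
```

By the chain rule the accumulated parameter gradients match a single full-batch backward pass, but peak activation memory is bounded by the chunk size rather than the batch size.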
Examples of converting dense retrieval models into memory-efficient ones have been added, and more are on the way.