If no specific GPU is set for the BGEReranker, it silently falls back to the CPU. I changed this so that it uses the GPU with the most available memory instead.
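For reviewers, the device selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `pick_device` and `query_free_memory` are hypothetical names, and the sketch assumes PyTorch's `torch.cuda.mem_get_info` is used to read free memory per device.

```python
from typing import List, Sequence


def pick_device(free_bytes_per_gpu: Sequence[int]) -> str:
    """Return 'cuda:<i>' for the GPU with the most free memory, or 'cpu' if none."""
    if not free_bytes_per_gpu:
        return "cpu"
    # max() returns the first index on ties, so the lowest GPU id wins.
    best = max(range(len(free_bytes_per_gpu)), key=lambda i: free_bytes_per_gpu[i])
    return f"cuda:{best}"


def query_free_memory() -> List[int]:
    """Free bytes per visible CUDA device; empty if torch or CUDA is unavailable."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    # mem_get_info returns a (free_bytes, total_bytes) tuple for the device.
    return [torch.cuda.mem_get_info(i)[0] for i in range(torch.cuda.device_count())]


# Hypothetical usage: the resulting string would be passed to the reranker.
device = pick_device(query_free_memory())
```

Keeping the selection logic separate from the CUDA query makes the fallback to `"cpu"` easy to check on machines without a GPU.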
Some users might not be familiar with spaCy, so I now raise an exception that hints at how to load the model.
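A rough sketch of that kind of hint is shown below. It is illustrative only: `load_spacy_model` and its `loader` parameter are hypothetical, and the exception type and message wording in this PR may differ. It relies on `spacy.load` raising `OSError` when the requested model package is not installed.

```python
from typing import Any, Callable, Optional


def load_spacy_model(name: str, loader: Optional[Callable[[str], Any]] = None) -> Any:
    """Load a spaCy model, raising a more helpful error if it is missing."""
    if loader is None:
        import spacy  # imported lazily; only needed when no loader is injected
        loader = spacy.load
    try:
        return loader(name)
    except OSError as exc:
        # spacy.load raises OSError when the model package cannot be found.
        raise RuntimeError(
            f"spaCy model '{name}' is not installed. "
            f"Install it with: python -m spacy download {name}"
        ) from exc
```

The `loader` argument exists only so the error path can be exercised without spaCy installed; in normal use it is left at its default.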
Description of changes:
Other issues included some legacy prompts that were no longer used (llm-rerankers)
and some legacy code for retrieving embeddings, which is not needed in this open-source implementation.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Issue #, if available: