yaxundai opened this issue 5 months ago
Hi @KAGAII, make sure you update neural-cherche with pip install neural-cherche --upgrade to get the 1.4.3 version.
from neural_cherche import models, rank, retrieve, utils

device = "cpu"  # or "mps" or "cuda"

documents, queries, qrels = utils.load_beir(
    "arguana",
    split="test",
)

retriever = retrieve.BM25(
    key="id",
    on=["title", "text"],
)

ranker = rank.ColBERT(
    key="id",
    on=["title", "text"],
    model=models.ColBERT(
        model_name_or_path="raphaelsty/neural-cherche-colbert",
        device=device,
    ).to(device),
)

retriever = retriever.add(
    documents_embeddings=retriever.encode_documents(
        documents=documents,
    )
)

candidates = retriever(
    queries_embeddings=retriever.encode_queries(
        queries=queries,
    ),
    k=30,
    tqdm_bar=True,
)

batch_size = 32

scores = ranker(
    documents=candidates,
    queries_embeddings=ranker.encode_queries(
        queries=queries,
        batch_size=batch_size,
        tqdm_bar=True,
    ),
    documents_embeddings=ranker.encode_candidates_documents(
        candidates=candidates,
        documents=documents,
        batch_size=batch_size,
        tqdm_bar=True,
    ),
    k=10,
)

scores = utils.evaluate(
    scores=scores,
    qrels=qrels,
    queries=queries,
    metrics=["ndcg@10"] + [f"hits@{k}" for k in range(1, 11)],
)

print(scores)
This yields:
{
    "ndcg@10": 0.3686831610778578,
    "hits@1": 0.01386748844375963,
    "hits@2": 0.27889060092449924,
    "hits@3": 0.40061633281972264,
    "hits@4": 0.4861325115562404,
    "hits@5": 0.5562403697996918,
    "hits@6": 0.6194144838212635,
    "hits@7": 0.6556240369799692,
    "hits@8": 0.6887519260400616,
    "hits@9": 0.7218798151001541,
    "hits@10": 0.74884437596302,
}
These are good scores, and the run takes about 3 minutes on an MPS device. The results you got were caused by duplicate queries, which are now handled by neural-cherche's evaluation.
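For reference, here is how ndcg@10 and hits@k can be computed for a single query's ranked relevance list. This is a standalone sketch of the standard formulas, not neural-cherche's implementation:

```python
import math

def dcg_at_k(relevances, k):
    # DCG: each relevance grade discounted by log2(rank + 1), with 1-based ranks.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # NDCG: DCG of the actual ranking over DCG of the ideal (sorted) ranking.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def hits_at_k(relevances, k):
    # hits@k: 1 if any relevant document appears in the top k, else 0.
    return 1.0 if any(relevances[:k]) else 0.0

# A query whose single relevant document was ranked third:
print(ndcg_at_k([0, 0, 1, 0], k=10))  # 1 / log2(4) = 0.5
print(hits_at_k([0, 0, 1, 0], k=2))   # 0.0
print(hits_at_k([0, 0, 1, 0], k=3))   # 1.0
```

The corpus-level numbers above are these per-query values averaged over all (deduplicated) queries.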
EDIT: sorry, I just saw that you mentioned SparseEmbed, not ColBERT. Running the benchmark now.
@KAGAII There is definitely something wrong with SparseEmbed right now. We recently updated SparseEmbed, but we may need to roll it back to the previous version @arthur-75. I'll push an update in the coming days.
Thank you for your prompt reply, looking forward to the new version!
When I used the pre-trained model 'raphaelsty/neural-cherche-sparse-embed' to evaluate the arguana dataset with a retrieval k value of 100, the results were very poor:

{
    "map": 0.033567943638956016,
    "ndcg@10": 0.042417859280348115,
    "ndcg@100": 0.08691780846498275,
    "recall@10": 0.09815078236130868,
    "recall@100": 0.32147937411095306,
}

As shown above, ndcg@10 is only 4.2%.
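For context on the recall numbers, recall@k is the fraction of a query's relevant documents that appear among the top k retrieved. A standalone sketch with hypothetical document ids, not neural-cherche's code:

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    # Fraction of the relevant set found within the top-k retrieved ids.
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved_ids[:k])) / len(relevant)

# Two relevant documents, one of which is retrieved within the top 2:
print(recall_at_k(["d1", "d2", "d3"], relevant_ids=["d2", "d9"], k=2))  # 0.5
```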