-
Hi ColBERT team,
I have a question regarding the Readme file, in particular the training part:
trainer = Trainer(
triples="/path/to/MSMARCO/triples.train.small.tsv",
…
-
It would be good to have validation during training.
I understand this might be hard for dense retrievers, as encoding the collection is expensive.
But I guess we can do something like use DR to rerank…
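
To make the reranking idea concrete, here is a minimal sketch of what such a validation step could look like. It is not ColBERT's API: `mrr_at_k`, `candidates`, `relevant`, and `score` are all hypothetical names, and `score` stands in for whatever function returns the model's query-passage score. The assumption is that each validation query already has a small, fixed candidate list (for example from BM25), so the collection never has to be encoded.

```python
from typing import Callable, Dict, List, Set


def mrr_at_k(
    candidates: Dict[str, List[str]],   # query id -> candidate passage ids (hypothetical input)
    relevant: Dict[str, Set[str]],      # query id -> ids of relevant passages (hypothetical input)
    score: Callable[[str, str], float], # stand-in for the model's (query, passage) scorer
    k: int = 10,
) -> float:
    """Rerank each query's candidates with `score` and return MRR@k."""
    if not candidates:
        return 0.0
    total = 0.0
    for qid, pids in candidates.items():
        # Reranking step: order the fixed candidate list by descending model score.
        ranked = sorted(pids, key=lambda pid: score(qid, pid), reverse=True)
        gold = relevant.get(qid, set())
        # Add the reciprocal rank of the first relevant passage within the top k.
        for rank, pid in enumerate(ranked[:k], start=1):
            if pid in gold:
                total += 1.0 / rank
                break
    return total / len(candidates)
```

Calling something like this every few thousand steps on a few hundred held-out queries would give a cheap validation curve, though it only measures reranking quality, not end-to-end retrieval.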
-
# Search pipelines
This RFC is intended to replace https://github.com/opensearch-project/search-processor/issues/12.
## Overview
We are proposing a set of new APIs to manage composable proces…
-
In the **Notes** section, there is the following passage:
> - Synonyms can be added in any order. The ordering is not considered in any computational logic.
that implies synonym orders would not impac…
-
Hi, when reconstructing vectors from codes and residuals I always get zero vectors. The relevant code is related to the torch extensions. See minimal example:
```python
from colbert.indexing.codec…
-
Hi, I was wondering how you got the 6980 data in the marco_dev folder from the MS MARCO dev set?
best wishes!
-
In the course of considering the list question at , I took a slightly deeper look at `gensim.summarization` than before.
From that look, my opinion is that its presence is more likely to waste peo…
-
This is an issue that I am opening for discussion.
**Problem**:
Sample weights (in various estimators), group labels (for cross-validation objects), and group IDs (in learning to rank) are optional infor…
-
Could you link the data (`MSMARCO/triples.train.small.tsv`, `MSMARCO/queries.train.small.tsv`, `MSMARCO/collection.tsv`) used in the training script below:
```python
from colbert.infra import Ru…
-
## Problem Statement
Traditionally, OpenSearch has relied on keyword matching for search result ranking. From a high level, these ranking techniques work by scoring documents based on the relative …