Closed by Narabzad 4 years ago
The scores should change... did you re-sort by the new scores?
# Finally, rerank:
reranked = reranker.rerank(query, texts)
reranked.sort(key=lambda x: x.score, reverse=True)
I get different scores, but the order is the same even after reranking. I tried SequenceClassificationTransformerReranker and it works. However, it does not work for the T5 reranker, and the example does not change the order either.
This is the code that works! I found it here: https://github.com/castorini/pygaggle/pull/58
import torch
from transformers import BertTokenizer, BertForSequenceClassification
from pygaggle.rerank.base import Query, Text
from pygaggle.rerank.transformer import SequenceClassificationTransformerReranker

model_name = 'castorini/monobert-large-msmarco'
tokenizer_name = 'bert-large-uncased'
batch_size = 8
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = BertForSequenceClassification.from_pretrained("castorini/monobert-large-msmarco")
model = model.to(device).eval()
tokenizer = BertTokenizer.from_pretrained(tokenizer_name)
reranker = SequenceClassificationTransformerReranker(model, tokenizer)

query = Query('how old are you?')
doc1 = Text('I am 77 years old')
doc2 = Text('I am hungry')
doc3 = Text('My age is 77')
doc4 = Text('I want to sleep early')
documents = [doc1, doc2, doc3, doc4]

scores = [result.score for result in reranker.rerank(query, documents)]
print(scores)
You need to re-sort the list based on the scores, as I've done in my code snippet above.
I did re-sort the list based on the scores, the same as in the example.
This is the code I am using:
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration
from pygaggle.model import T5BatchTokenizer
from pygaggle.rerank.base import Query, Text
from pygaggle.rerank.transformer import T5Reranker
model_name = 'castorini/monot5-base-msmarco'
tokenizer_name = 't5-base'
batch_size = 8
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = T5ForConditionalGeneration.from_pretrained(model_name)
model = model.to(device).eval()
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
tokenizer = T5BatchTokenizer(tokenizer, batch_size)
reranker = T5Reranker(model, tokenizer)
query = Query('who proposed the geocentric theory')
passages = [
    ['2593796', 'Copernicus proposed a heliocentric model of the solar system â\x80\x93 a model where everything orbited around the Sun. Today, with advancements in science and technology, the geocentric model seems preposterous.he geocentric model, also known as the Ptolemaic system, is a theory that was developed by philosophers in Ancient Greece and was named after the philosopher Claudius Ptolemy who lived circa 90 to 168 A.D. It was developed to explain how the planets, the Sun, and even the stars orbit around the Earth.'],
    ['6217200', 'The geocentric model, also known as the Ptolemaic system, is a theory that was developed by philosophers in Ancient Greece and was named after the philosopher Claudius Ptolemy who lived circa 90 to 168 A.D. It was developed to explain how the planets, the Sun, and even the stars orbit around the Earth.opernicus proposed a heliocentric model of the solar system â\x80\x93 a model where everything orbited around the Sun. Today, with advancements in science and technology, the geocentric model seems preposterous.'],
    ['3276925', 'Copernicus proposed a heliocentric model of the solar system â\x80\x93 a model where everything orbited around the Sun. Today, with advancements in science and technology, the geocentric model seems preposterous.Simple tools, such as the telescope â\x80\x93 which helped convince Galileo that the Earth was not the center of the universe â\x80\x93 can prove that ancient theory incorrect.ou might want to check out one article on the history of the geocentric model and one regarding the geocentric theory. Here are links to two other articles from Universe Today on what the center of the universe is and Galileo one of the advocates of the heliocentric model.'],
    ['6217208', 'Copernicus proposed a heliocentric model of the solar system â\x80\x93 a model where everything orbited around the Sun. Today, with advancements in science and technology, the geocentric model seems preposterous.Simple tools, such as the telescope â\x80\x93 which helped convince Galileo that the Earth was not the center of the universe â\x80\x93 can prove that ancient theory incorrect.opernicus proposed a heliocentric model of the solar system â\x80\x93 a model where everything orbited around the Sun. Today, with advancements in science and technology, the geocentric model seems preposterous.'],
    ['4280557', 'The geocentric model, also known as the Ptolemaic system, is a theory that was developed by philosophers in Ancient Greece and was named after the philosopher Claudius Ptolemy who lived circa 90 to 168 A.D. It was developed to explain how the planets, the Sun, and even the stars orbit around the Earth.imple tools, such as the telescope â\x80\x93 which helped convince Galileo that the Earth was not the center of the universe â\x80\x93 can prove that ancient theory incorrect. You might want to check out one article on the history of the geocentric model and one regarding the geocentric theory.'],
    ['264181', 'Nicolaus Copernicus (b. 1473â\x80\x93d. 1543) was the first modern author to propose a heliocentric theory of the universe. From the time that Ptolemy of Alexandria (c. 150 CE) constructed a mathematically competent version of geocentric astronomy to Copernicusâ\x80\x99s mature heliocentric version (1543), experts knew that the Ptolemaic system diverged from the geocentric concentric-sphere conception of Aristotle.'],
    ['4280558', 'A Geocentric theory is an astronomical theory which describes the universe as a Geocentric system, i.e., a system which puts the Earth in the center of the universe, and describes other objects from the point of view of the Earth. Geocentric theory is an astronomical theory which describes the universe as a Geocentric system, i.e., a system which puts the Earth in the center of the universe, and describes other objects from the point of view of the Earth.'],
    ['3276926', 'The geocentric model, also known as the Ptolemaic system, is a theory that was developed by philosophers in Ancient Greece and was named after the philosopher Claudius Ptolemy who lived circa 90 to 168 A.D. It was developed to explain how the planets, the Sun, and even the stars orbit around the Earth.ou might want to check out one article on the history of the geocentric model and one regarding the geocentric theory. Here are links to two other articles from Universe Today on what the center of the universe is and Galileo one of the advocates of the heliocentric model.'],
    ['7744105', 'For Earth-centered it was Geocentric Theory proposed by greeks under the guidance of Ptolemy and Sun-centered was Heliocentric theory proposed by Nicolas Copernicus in 16th century A.D. In short, Your Answers are: 1st blank - Geo-Centric Theory. 2nd blank - Heliocentric Theory.'],
    ['5183032', "After 1,400 years, Copernicus was the first to propose a theory which differed from Ptolemy's geocentric system, according to which the earth is at rest in the center with the rest of the planets revolving around it."],
]
texts = [ Text(p[1], {'docid': p[0]}, 0) for p in passages] # Note, pyserini scores don't matter since T5 will ignore them.
print('prior to reranking:')
for i in range(0, 10):
    print(f'{i+1:2} {texts[i].metadata["docid"]:15} {texts[i].score:.5f}')
reranked = reranker.rerank(query, texts)
reranked.sort(key=lambda x: x.score, reverse=True)
print('reranked results:')
for i in range(0, 10):
    print(f'{i+1:2} {texts[i].metadata["docid"]:15} {reranked[i].score:.5f} ')
And these are the results I got:
 1 2593796         0.00000
 2 6217200         0.00000
 3 3276925         0.00000
 4 6217208         0.00000
 5 4280557         0.00000
 6 264181          0.00000
 7 4280558         0.00000
 8 3276926         0.00000
 9 7744105         0.00000
10 5183032         0.00000
reranked results:
 1 2593796         -0.01113
 2 6217200         -0.01206
 3 3276925         -0.02000
 4 6217208         -0.02684
 5 4280557         -0.03215
 6 264181          -0.03442
 7 4280558         -0.03664
 8 3276926         -0.03727
 9 7744105         -0.03800
10 5183032         -2.58163
As you can see, the order of the documents sorted by score is exactly the same as before.
reranked.sort(key=lambda x: x.score, reverse=True)
print('reranked results:')
for i in range(0, 10):
    print(f'{i+1:2} {texts[i].metadata["docid"]:15} {reranked[i].score:.5f} ')
                     ^^^^^^^
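You have a bug in your code: you're still printing out the metadata in texts (the original, unsorted list) rather than in reranked, so the docids always appear in the original order even though the scores come from the sorted list.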
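For anyone landing here later, here is a minimal sketch of the difference between the buggy and the fixed print loop. It does not load pygaggle or a model: ScoredText is a hypothetical stand-in for pygaggle's Text objects, and the docids and scores are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ScoredText:
    """Hypothetical stand-in for a pygaggle Text: text, metadata, score."""
    text: str
    metadata: dict = field(default_factory=dict)
    score: float = 0.0


# Original retrieval order; scores start at 0 as in the thread.
texts = [
    ScoredText('passage a', {'docid': 'A'}),
    ScoredText('passage b', {'docid': 'B'}),
    ScoredText('passage c', {'docid': 'C'}),
]

# Pretend a reranker assigned these scores (made-up values, not model output).
fake_scores = [-2.5, -0.01, -0.3]
reranked = [ScoredText(t.text, t.metadata, s) for t, s in zip(texts, fake_scores)]
reranked.sort(key=lambda x: x.score, reverse=True)

# Buggy: docid is read from the unsorted list, score from the sorted list.
buggy_order = [texts[i].metadata['docid'] for i in range(len(reranked))]

# Fixed: docid and score are both read from the sorted list.
fixed_order = [reranked[i].metadata['docid'] for i in range(len(reranked))]

for rank, doc in enumerate(reranked, start=1):
    print(f'{rank:2} {doc.metadata["docid"]:15} {doc.score:.5f}')
```

Iterating over reranked directly, rather than indexing two lists with the same i, makes it harder to mix up the two lists.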
Thank you so much! It is working now! The bug is in the example in the readme file here as well. It should be fixed.
Ah, I see. Can you please fix the bug and send a PR?
Done!
Hello, when I try to run the given example, the documents show up in the same order and no reranking happens. I tried both models and different document samples, and the order does not change at all. If I run the example in the last section of the readme, should I get reranked passages? Why am I not seeing any difference in the document order?
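One way to check whether reranking actually moved anything is to compare the docid order before and after, instead of eyeballing printed rows. The sketch below is only an illustration: rank_changes is a hypothetical helper, not part of pygaggle, and the docid lists are made up.

```python
def rank_changes(before_ids, after_ids):
    """Return (docid, old_rank, new_rank) for every document whose rank moved.

    Hypothetical helper: both arguments are lists of the same docids,
    in the order before and after reranking. Ranks are 1-based.
    """
    old_rank = {docid: i + 1 for i, docid in enumerate(before_ids)}
    return [
        (docid, old_rank[docid], new_rank)
        for new_rank, docid in enumerate(after_ids, start=1)
        if old_rank[docid] != new_rank
    ]


# Made-up docid orders for illustration.
before = ['A', 'B', 'C']
after = ['B', 'A', 'C']

changes = rank_changes(before, after)
for docid, old, new in changes:
    print(f'{docid}: rank {old} -> {new}')

# An empty list means the order is identical before and after.
unchanged = rank_changes(before, before)
```

With real results, the two lists would be built from the docid metadata of the original list and of the sorted, reranked list respectively.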