A quick tip: If you change LLMReRanker to inherit from pt.Transformer and rename rerank_pyt to transform, you can avoid the lambda. And if you use pt.text.get_text() in the pipeline, it'll add the document's text into the dataframe, so you don't need to load them all into memory.
Awesome, thanks @kaustubhdhole!