Open ftvalentini opened 1 month ago
In the main class for building dense indexes:
https://github.com/castorini/pyserini/blob/b7e1da305dd31b195244d49321087505996260c6/pyserini/encode/_auto.py#L24
the arg `fp16` does not seem to be used anywhere (it is swallowed by `**kwargs` in this line: https://github.com/castorini/pyserini/blob/b7e1da305dd31b195244d49321087505996260c6/pyserini/encode/_auto.py#L38). It could be used in `__init__` as:

```python
self.model = AutoModel.from_pretrained(model_name, torch_dtype=torch.float16)
```
Moreover, the `encode()` method could use `inference_mode()`, like:

```python
with torch.inference_mode():
    outputs = self.model(**inputs)
```
which would significantly reduce the memory footprint of inference.
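To make the proposal concrete, here is a minimal, hypothetical sketch of how the two changes could fit together in an encoder class. `ToyEncoder` and its tiny stand-in model are assumptions for illustration only (avoiding a model download); in pyserini the model would come from `AutoModel.from_pretrained(...)` with `torch_dtype=torch.float16`:

```python
import torch


class ToyModel(torch.nn.Module):
    """Stand-in for a HuggingFace AutoModel (hypothetical, for illustration)."""

    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.ones(4))

    def forward(self, x):
        return x * self.weight


class ToyEncoder:
    def __init__(self, fp16=False):
        # In pyserini this would be:
        #   AutoModel.from_pretrained(model_name, torch_dtype=torch.float16)
        # when fp16=True; here we convert the stand-in model instead.
        self.model = ToyModel()
        if fp16:
            self.model = self.model.half()

    def encode(self, inputs):
        # inference_mode() disables autograd tracking, so no computation
        # graph or intermediate activations are kept for backprop.
        with torch.inference_mode():
            param_dtype = next(self.model.parameters()).dtype
            return self.model(inputs.to(param_dtype))


encoder = ToyEncoder(fp16=True)
out = encoder.encode(torch.randn(2, 4))
print(out.dtype, out.requires_grad)
```

With `fp16=True` the output tensor is `float16`, and because the forward pass ran under `inference_mode()`, the result carries no grad history (`requires_grad` is `False`), which is where the memory savings come from.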