Reduce high RAM usage during evaluation (#56)

Description

The current implementation loads the whole dataset into RAM during evaluation, which causes unnecessary spikes in memory usage. This PR adds a pre-batching step that reuses the batching strategy already used during inference, so only one batch of the dataset is held in memory at a time and peak RAM usage drops substantially.
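
To illustrate the idea, here is a minimal sketch of a pre-batching step. It is not the code from this PR: `iter_batches`, `embed_in_batches`, and `embed_fn` are hypothetical names chosen for the example, and the sketch assumes the dataset can be consumed lazily as an iterable.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
E = TypeVar("E")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(items)
    while True:
        # Only `batch_size` items are materialized at any time.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def embed_in_batches(
    items: Iterable[T],
    embed_fn: Callable[[List[T]], List[E]],
    batch_size: int,
) -> List[E]:
    """Apply `embed_fn` batch by batch and collect the results.

    `embed_fn` stands in for the model's forward pass (an assumption of
    this sketch); it takes one batch and returns one output per item.
    """
    embeddings: List[E] = []
    for batch in iter_batches(items, batch_size):
        embeddings.extend(embed_fn(batch))
    return embeddings


if __name__ == "__main__":
    # Toy stand-in for an embedding function: doubles each input.
    print(embed_in_batches(range(5), lambda batch: [x * 2 for x in batch], batch_size=2))
```

In a sketch like this, the raw inputs of a batch can be released as soon as the batch has been processed, and only the accumulated outputs stay in memory.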

Features

Changed

Process the dataset in batches during evaluation instead of loading it into RAM all at once

Fixed

Fix issue when pretrained_model_name_or_path is None in load_vision_retriever_from_registry

Test

E2E tested with:

vidore-benchmark evaluate-retriever \
    --model-class siglip \
    --model-name google/siglip-so400m-patch14-384 \
    --dataset-name vidore/shiftproject_test \
    --split test

