Open ek-nyc opened 1 week ago
@ek-nyc could you please provide more detailed information?
Currently, VectorDBBench already has logic to skip file downloads: if a local file with the same name has the same size as the remote file, the download is skipped.
https://github.com/zilliztech/VectorDBBench/blob/1ab46dd5d1594565148f8b90cc75b71ff11688e1/vectordb_bench/backend/data_source.py#L57-L66
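For illustration, the check described above can be sketched roughly as follows. This is a minimal sketch of the same-name, same-size idea, not the actual code behind the link; the function name `should_skip_download` and its parameters are hypothetical, and it assumes the remote size in bytes has already been obtained from the storage backend.

```python
from pathlib import Path


def should_skip_download(local_file: Path, remote_size: int) -> bool:
    """Return True when a local copy exists and its size matches the remote size.

    Hypothetical helper for illustration only; not part of VectorDBBench.
    """
    # A missing file always needs to be downloaded.
    if not local_file.is_file():
        return False
    # Same name and same byte count: treat the local copy as complete.
    return local_file.stat().st_size == remote_size
```

A size comparison is cheap but does not detect a corrupted file of the correct length; a checksum would be stricter, at the cost of reading the whole file.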
Additionally, please note that the default download location is the /tmp folder, which is typically cleared upon system reboot.
When I rerun my tests, it takes a long time to download the 3 files that have already been downloaded. The download step should be skipped if the files already exist.