EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License

Running multiple processes on a shared outlines cache database #2306

Open e-tornike opened 2 months ago

e-tornike commented 2 months ago

When several processes use the caching feature of outlines at the same time, they may access the shared cache database concurrently, which results in SQLite errors such as:

sqlite3.OperationalError: disk I/O error
sqlite3.DatabaseError: database disk image is malformed

This is further described in https://github.com/vllm-project/vllm/issues/4193 and https://github.com/dottxt-ai/outlines/issues/827.

A workaround for users who don't use guided decoding is described in https://github.com/vllm-project/vllm/pull/7831.

Another possible workaround is to set a unique cache directory for each process using the environment variable OUTLINES_CACHE_DIR.
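A minimal sketch of this second workaround, under a few assumptions: the helper name `set_process_local_outlines_cache` and the `proc_<id>` directory layout are hypothetical, the `RANK` variable is only present when a launcher such as torchrun sets it, and outlines is assumed to read `OUTLINES_CACHE_DIR` when it first creates its cache, so the call should happen before outlines (or vllm) is imported.

```python
import os
import tempfile
from pathlib import Path


def set_process_local_outlines_cache(base_dir=None):
    """Point OUTLINES_CACHE_DIR at a directory unique to this process.

    Hypothetical helper: not part of outlines, vllm, or lm-evaluation-harness.
    """
    # Default to a folder under the system temp directory (an arbitrary choice).
    base = Path(base_dir) if base_dir else Path(tempfile.gettempdir()) / "outlines_cache"

    # Prefer the distributed rank if a launcher exported it; otherwise fall
    # back to the process ID, which is unique among live processes on one host.
    suffix = os.environ.get("RANK", f"pid{os.getpid()}")

    cache_dir = base / f"proc_{suffix}"
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Each process now gets its own cache database, so no two processes
    # write to the same SQLite file.
    os.environ["OUTLINES_CACHE_DIR"] = str(cache_dir)
    return cache_dir


# Usage (before importing outlines or vllm):
#   set_process_local_outlines_cache()
```

The trade-off is that processes no longer share cached entries, so each one rebuilds its own cache. With the PID fallback, a fresh directory is also created on every run.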

An implementation in this framework (here) creates a separate cache database for each model "rank". Does the use_cache argument here aim to solve this issue, or does it have a different use?

baberabb commented 2 months ago

Hi! The use_cache argument currently implemented here caches evaluation results, so that you can continue where you left off in case of an error.

e-tornike commented 2 months ago

Okay, thank you.

Would there be interest in integrating a workaround (e.g., dynamically setting the OUTLINES_CACHE_DIR) into the framework?