Description:
I am managing a production server with limited storage capacity, where I can allocate only a fixed amount of space (e.g., 2048 GiB, i.e. 2 TiB) to HuggingFace repositories. It would be extremely beneficial if olah supported a repository size limit and automatically removed the least recently accessed cached models/datasets to stay within the allocated space.
Proposed Solution:
Introduce a feature in olah that allows users to set a maximum repository size. When the limit is reached, olah would automatically delete the least recently accessed cached models/datasets to reclaim space, ensuring that the total storage usage does not exceed the defined limit (e.g., 2 TiB).
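To make the request concrete, here is a minimal sketch of the eviction policy I have in mind. It is not olah code: the function name `evict_least_recently_accessed` and its parameters are hypothetical, and it assumes the cache is a plain directory tree whose file access times (`st_atime`) are a usable proxy for "last accessed".

```python
import os
from pathlib import Path


def evict_least_recently_accessed(cache_root: str, limit_bytes: int) -> list[str]:
    """Delete least recently accessed files until the tree fits in limit_bytes.

    Hypothetical helper for illustration only; returns the removed paths,
    oldest access time first.
    """
    entries = []
    total = 0
    # Collect (access time, size, path) for every regular file under the root.
    for path in Path(cache_root).rglob("*"):
        if path.is_file() and not path.is_symlink():
            st = path.stat()
            entries.append((st.st_atime, st.st_size, str(path)))
            total += st.st_size

    removed = []
    # Sorting puts the oldest access time first, so those files go first.
    for _atime, size, path in sorted(entries):
        if total <= limit_bytes:
            break
        os.remove(path)
        total -= size
        removed.append(path)
    return removed
```

For a 2 TiB budget the call would look like `evict_least_recently_accessed("/path/to/repos", 2 * 1024**4)`. Note that on filesystems mounted with `noatime` or `relatime` the access time may not be updated on every read, so a real implementation would probably need to track accesses itself.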
Current Behavior:
The version of olah I am using (commit 33e7cf1b30472bc9c9fdd0a71d49093bcccddac8) does not support setting a repository size limit or automatic model cleanup, so storage has to be managed manually (likely by removing the entire repo :warning:).
(.venv) olah@node1:~/olah$ pip freeze |grep olah
-e git+https://github.com/vtuber-plan/olah.git@33e7cf1b30472bc9c9fdd0a71d49093bcccddac8#egg=olah
(.venv) olah@node1:~/olah$ python -m olah.server --help
usage: server.py [-h] [--config CONFIG] [--host HOST] [--port PORT] [--hf-scheme HF_SCHEME] [--hf-netloc HF_NETLOC] [--hf-lfs-netloc HF_LFS_NETLOC] [--mirror-scheme MIRROR_SCHEME] [--mirror-netloc MIRROR_NETLOC] [--mirror-lfs-netloc MIRROR_LFS_NETLOC] [--has-lfs-site]
[--ssl-key SSL_KEY] [--ssl-cert SSL_CERT] [--repos-path REPOS_PATH] [--log-path LOG_PATH]
Olah Huggingface Mirror Server.
options:
-h, --help show this help message and exit
--config CONFIG, -c CONFIG
--host HOST
--port PORT
--hf-scheme HF_SCHEME
The scheme of huggingface site (http or https)
--hf-netloc HF_NETLOC
--hf-lfs-netloc HF_LFS_NETLOC
--mirror-scheme MIRROR_SCHEME
The scheme of mirror site (http or https)
--mirror-netloc MIRROR_NETLOC
--mirror-lfs-netloc MIRROR_LFS_NETLOC
--has-lfs-site
--ssl-key SSL_KEY The SSL key file path, if HTTPS is used
--ssl-cert SSL_CERT The SSL cert file path, if HTTPS is used
--repos-path REPOS_PATH
The folder to save cached repositories
--log-path LOG_PATH The folder to save logs
The repository cache disk usage limit and file-level cache automatic cleanup have been implemented in the newest version v0.3.0. You can set the capacity limit using --cache-size-limit.