When a model is downloaded through the Python API with a
download_path that contains the home-directory shorthand ~,
unzipping the tar files fails with a file-not-found error:
python local/scripts/deployment_dir_bug.py --small-model
Downloading (…)training/config.json: 100%|██████████████████████████████████████| 0.98k/0.98k [00:00<00:00, 377kB/s]
Downloading (…)okenizer_config.json: 100%|█████████████████████████████████████████| 240/240 [00:00<00:00, 95.5kB/s]
Downloading (…)/training/merges.txt: 100%|███████████████████████████████████████| 446k/446k [00:00<00:00, 8.83MB/s]
Downloading (…)g/model_nocache.onnx: 100%|███████████████████████████████████████| 496M/496M [00:43<00:00, 12.0MB/s]
Downloading (…)cial_tokens_map.json: 100%|███████████████████████████████████████| 90.0/90.0 [00:00<00:00, 18.9kB/s]
Downloading (…)/training/vocab.json: 100%|███████████████████████████████████████| 779k/779k [00:00<00:00, 10.6MB/s]
Downloading (…)ining/tokenizer.json: 100%|█████████████████████████████████████| 2.02M/2.02M [00:00<00:00, 10.7MB/s]
Downloading (…)el/deployment.tar.gz: 100%|███████████████████████████████████████| 265M/265M [00:23<00:00, 12.0MB/s]
[Errno 2] No such file or directory: '~/test-models/small-model/deployment.tar.gz'
Traceback (most recent call last):
  File "/home/rahul/projects/sparsezoo/src/sparsezoo/objects/directory.py", line 190, in download
    target_directory.unzip()
  File "/home/rahul/projects/sparsezoo/src/sparsezoo/objects/directory.py", line 306, in unzip
    tar = tarfile.open(self._path, "r")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/tarfile.py", line 1804, in open
    return func(name, "r", fileobj, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/tarfile.py", line 1870, in gzopen
    fileobj = GzipFile(name, mode + "b", compresslevel, fileobj)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/gzip.py", line 174, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '~/test-models/small-model/deployment.tar.gz'
Trying attempt 1 of 1.
Download retry failed...
Issue
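The ~ in the download path is never expanded to the user's home
directory. The literal string '~/test-models/small-model/deployment.tar.gz'
is passed to tarfile.open, which does not perform shell-style ~ expansion,
so the file is not found. This is a bug in the sparsezoo Python API.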
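To illustrate the kind of normalization that avoids this failure, here is a minimal sketch that expands ~ before the path is used. The helper name normalize_download_path is hypothetical and not part of sparsezoo; the actual change in this PR may look different.

```python
import os


def normalize_download_path(download_path):
    """Hypothetical helper (not a sparsezoo API): expand a leading ~ and
    return an absolute path, or None if no path was given."""
    if download_path is None:
        return None
    # os.path.expanduser replaces a leading "~" with the user's home directory;
    # os.path.abspath then makes the result absolute and normalized.
    return os.path.abspath(os.path.expanduser(download_path))


if __name__ == "__main__":
    print(normalize_download_path("~/test-models/small-model"))
```

Normalizing once, at the point where the path enters the API, would mean every later consumer (including the unzip step) sees the same expanded path.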
Test Script
# deployment_dir_bug.py
import argparse

from sparsezoo import Model


def parse_args():
    parser = argparse.ArgumentParser(description='Download models.')
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('--big-model', action='store_true', help='Download big model')
    group.add_argument('--small-model', action='store_true', help='Download small model')
    parser.add_argument('--download-path', type=str, required=False, help='Path to download the model', default=None)
    return parser.parse_args()


def main():
    args = parse_args()
    # The default paths deliberately contain an unexpanded "~" to trigger the bug.
    if args.big_model:
        stub = "zoo:llama2-7b-ultrachat200k_llama2_pretrain-pruned80"
        potential_download_path = "~/test-models/big-model"
    else:
        stub = "zoo:codegen_mono-350m-bigpython_bigquery_thepile-pruned50_quantized"
        potential_download_path = "~/test-models/small-model"

    # An explicit --download-path overrides the default.
    download_path = args.download_path if args.download_path else potential_download_path
    sparsezoo_model = Model(stub, download_path=download_path)
    downloaded_path = sparsezoo_model.download()
    print(f"Downloaded Model contents to {downloaded_path=}")
    print(f"Sparsezoo Model: {sparsezoo_model=}")


if __name__ == "__main__":
    main()
Steps to Reproduce
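Invoke the script with the --small-model flag (python local/scripts/deployment_dir_bug.py --small-model).
Before this PR, the download ends with the FileNotFoundError shown above.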
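The failure mode can also be reproduced without sparsezoo or a large download. The sketch below is an assumption-laden illustration, not part of the PR: it points HOME (and USERPROFILE on Windows) at a temporary directory, writes a small archive there, and compares opening the literal "~/..." path with opening the expanded one. The function name demo_tilde_handling is hypothetical.

```python
import os
import tarfile
import tempfile


def demo_tilde_handling():
    """Return (literal_failed, expanded_ok) for an archive under a fake home."""
    saved_env = {key: os.environ.get(key) for key in ("HOME", "USERPROFILE")}
    saved_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as fake_home:
        try:
            # Make "~" resolve to the temporary directory for this demo only.
            os.environ["HOME"] = fake_home
            os.environ["USERPROFILE"] = fake_home
            # Work from a directory known not to contain a literal "~" folder.
            os.chdir(fake_home)

            # Build a tiny archive at <fake_home>/deployment.tar.gz.
            payload = os.path.join(fake_home, "payload.txt")
            with open(payload, "w") as handle:
                handle.write("hello")
            archive = os.path.join(fake_home, "deployment.tar.gz")
            with tarfile.open(archive, "w:gz") as tar:
                tar.add(payload, arcname="payload.txt")

            tilde_path = "~/deployment.tar.gz"

            # tarfile.open treats "~" as an ordinary directory name.
            try:
                tarfile.open(tilde_path, "r").close()
                literal_failed = False
            except FileNotFoundError:
                literal_failed = True

            # After expansion the same archive opens normally.
            with tarfile.open(os.path.expanduser(tilde_path), "r") as tar:
                expanded_ok = tar.getnames() == ["payload.txt"]
        finally:
            # Restore the working directory and environment.
            os.chdir(saved_cwd)
            for key, value in saved_env.items():
                if value is None:
                    os.environ.pop(key, None)
                else:
                    os.environ[key] = value
    return literal_failed, expanded_ok


if __name__ == "__main__":
    print(demo_tilde_handling())
```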
After this PR
The same command should complete without the FileNotFoundError, and the deployment tar should be found and unzipped.
Deployment tar not found bug