kosmos-2.5 | trying to connect to `openaipublic.blob.core.windows.net`

Describe the bug
Model I am using (UniLM, MiniLM, LayoutLM ...): kosmos-2.5
The problem arises when using:
[ ] the official example scripts: (give details below)
[x] my own modified scripts: (give details below)
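With the machine offline, requests to /predict fail: init() in inference.py calls tiktoken.get_encoding("cl100k_base"), which tries to download cl100k_base.tiktoken from `openaipublic.blob.core.windows.net`, even though the model checkpoint is already on disk.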
To Reproduce
Steps to reproduce the behavior:
git clone https://github.com/microsoft/unilm.git
cd unilm/kosmos-2.5
pip install -r requirements.txt
mkdir -p weights
wget -O weights/ckpt.pt 'https://huggingface.co/microsoft/kosmos-2.5/resolve/main/ckpt.pt?download=true'
disconnect the machine from the network (detach it from the Wi-Fi router)
write a wrapper for do-ocr in a file called app.py
python3 app.py
Expected behavior
The program should run offline without any issues, since all the model files have already been downloaded, but it does not. When I reconnect to the internet, it works; when I disconnect again, it fails again.
app.py is just a Flask wrapper around the OCR inference, nothing else. If it is required, I am willing to share it as well.
Here is the related log:
maifee@maifee-cudaserver:~/ocr/unilm/kosmos-2.5$ python3 app.py 
Using flash_attn
 * Serving Flask app 'app'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
127.0.0.1 - - [20/Jul/2024 19:40:31] "OPTIONS /predict HTTP/1.1" 200 -
/home/maifee/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Some weights of Pix2StructVisionModel were not initialized from the model checkpoint at google/pix2struct-large and are newly initialized: ['encoder.layer.6.attention.output.weight', 'encoder.layer.17.attention.value.weight', 'encoder.layer.0.pre_mlp_layer_norm.weight', 'encoder.layer.3.attention.value.weight', 'encoder.layer.12.attention.value.weight', 'encoder.layer.13.mlp.wo.weight', 'encoder.layer.14.pre_mlp_layer_norm.weight', 'encoder.layer.7.attention.key.weight', 'encoder.layer.12.mlp.wi_1.weight', 'encoder.layer.12.attention.output.weight', 'encoder.layer.14.attention.key.weight', 'encoder.layer.13.attention.value.weight', 'encoder.layer.7.attention.value.weight', 'encoder.layer.4.attention.query.weight', 'encoder.layer.12.pre_attention_layer_norm.weight', 'encoder.layer.5.mlp.wi_0.weight', 'encoder.layer.11.pre_attention_layer_norm.weight', 'encoder.layer.10.mlp.wi_1.weight', 'encoder.layer.9.mlp.wo.weight', 'encoder.layer.6.attention.value.weight', 'encoder.layer.12.mlp.wo.weight', 'encoder.layer.3.mlp.wi_1.weight', 'encoder.layer.15.pre_mlp_layer_norm.weight', 'encoder.layer.10.attention.key.weight', 'encoder.layer.2.attention.key.weight', 'encoder.layer.1.pre_mlp_layer_norm.weight', 'encoder.layer.4.mlp.wo.weight', 'encoder.layer.3.pre_mlp_layer_norm.weight', 'encoder.layer.13.attention.query.weight', 'encoder.layer.5.attention.value.weight', 'encoder.layer.6.pre_attention_layer_norm.weight', 'encoder.layer.3.attention.query.weight', 'encoder.layer.8.attention.query.weight', 'encoder.layer.8.mlp.wi_1.weight', 'encoder.layer.16.mlp.wi_0.weight', 'encoder.layer.7.attention.output.weight', 'encoder.layer.8.mlp.wo.weight', 'encoder.layer.6.mlp.wi_0.weight', 'encoder.layer.14.mlp.wo.weight', 'encoder.layer.7.pre_attention_layer_norm.weight', 'encoder.layer.10.mlp.wi_0.weight', 'encoder.layer.10.pre_mlp_layer_norm.weight', 'encoder.layer.0.attention.output.weight', 'encoder.layer.12.attention.key.weight', 'encoder.layer.16.pre_mlp_layer_norm.weight', 
'encoder.layer.8.pre_mlp_layer_norm.weight', 'encoder.layer.13.pre_attention_layer_norm.weight', 'encoder.layer.1.mlp.wo.weight', 'encoder.layer.0.mlp.wi_1.weight', 'embeddings.column_embedder.weight', 'encoder.layer.14.mlp.wi_1.weight', 'encoder.layer.9.mlp.wi_0.weight', 'encoder.layer.15.pre_attention_layer_norm.weight', 'encoder.layer.17.pre_mlp_layer_norm.weight', 'encoder.layer.8.pre_attention_layer_norm.weight', 'encoder.layer.0.mlp.wi_0.weight', 'encoder.layer.17.pre_attention_layer_norm.weight', 'encoder.layer.16.attention.value.weight', 'encoder.layer.4.pre_mlp_layer_norm.weight', 'encoder.layer.2.attention.value.weight', 'encoder.layer.3.mlp.wi_0.weight', 'encoder.layer.3.attention.key.weight', 'encoder.layer.4.mlp.wi_1.weight', 'encoder.layer.5.pre_mlp_layer_norm.weight', 'encoder.layer.12.mlp.wi_0.weight', 'encoder.layer.3.mlp.wo.weight', 'encoder.layer.3.attention.output.weight', 'encoder.layer.8.attention.value.weight', 'encoder.layer.11.pre_mlp_layer_norm.weight', 'encoder.layer.17.attention.output.weight', 'encoder.layer.15.attention.value.weight', 'encoder.layer.15.attention.output.weight', 'encoder.layer.7.mlp.wi_1.weight', 'encoder.layer.12.pre_mlp_layer_norm.weight', 'embeddings.patch_projection.bias', 'encoder.layer.5.attention.query.weight', 'encoder.layer.9.attention.value.weight', 'encoder.layer.5.mlp.wi_1.weight', 'encoder.layer.0.attention.key.weight', 'embeddings.patch_projection.weight', 'encoder.layer.1.attention.value.weight', 'embeddings.row_embedder.weight', 'encoder.layer.15.mlp.wi_0.weight', 'encoder.layer.15.mlp.wi_1.weight', 'encoder.layer.13.attention.key.weight', 'encoder.layer.10.attention.query.weight', 'encoder.layer.2.mlp.wo.weight', 'encoder.layer.17.attention.query.weight', 'encoder.layer.15.mlp.wo.weight', 'encoder.layer.10.attention.output.weight', 'encoder.layer.4.mlp.wi_0.weight', 'encoder.layer.6.mlp.wo.weight', 'encoder.layer.9.attention.key.weight', 'encoder.layer.1.attention.key.weight', 'layernorm.weight', 
'encoder.layer.6.mlp.wi_1.weight', 'encoder.layer.17.mlp.wi_1.weight', 'encoder.layer.16.pre_attention_layer_norm.weight', 'encoder.layer.9.pre_mlp_layer_norm.weight', 'encoder.layer.11.attention.query.weight', 'encoder.layer.2.attention.query.weight', 'encoder.layer.2.pre_attention_layer_norm.weight', 'encoder.layer.4.attention.key.weight', 'encoder.layer.10.attention.value.weight', 'encoder.layer.5.mlp.wo.weight', 'encoder.layer.14.attention.output.weight', 'encoder.layer.9.mlp.wi_1.weight', 'encoder.layer.17.attention.key.weight', 'encoder.layer.14.attention.query.weight', 'encoder.layer.9.attention.output.weight', 'encoder.layer.14.attention.value.weight', 'encoder.layer.9.pre_attention_layer_norm.weight', 'encoder.layer.11.attention.output.weight', 'encoder.layer.7.mlp.wi_0.weight', 'encoder.layer.8.attention.output.weight', 'encoder.layer.6.attention.key.weight', 'encoder.layer.13.mlp.wi_1.weight', 'encoder.layer.0.pre_attention_layer_norm.weight', 'encoder.layer.4.attention.value.weight', 'encoder.layer.1.pre_attention_layer_norm.weight', 'encoder.layer.15.attention.query.weight', 'encoder.layer.2.mlp.wi_1.weight', 'encoder.layer.1.attention.query.weight', 'encoder.layer.13.mlp.wi_0.weight', 'encoder.layer.10.mlp.wo.weight', 'encoder.layer.13.attention.output.weight', 'encoder.layer.11.mlp.wo.weight', 'encoder.layer.16.attention.output.weight', 'encoder.layer.11.attention.value.weight', 'encoder.layer.2.attention.output.weight', 'encoder.layer.8.mlp.wi_0.weight', 'encoder.layer.12.attention.query.weight', 'encoder.layer.0.mlp.wo.weight', 'encoder.layer.2.pre_mlp_layer_norm.weight', 'encoder.layer.5.attention.key.weight', 'encoder.layer.14.pre_attention_layer_norm.weight', 'encoder.layer.16.attention.query.weight', 'encoder.layer.0.attention.query.weight', 'encoder.layer.16.attention.key.weight', 'encoder.layer.4.attention.output.weight', 'encoder.layer.4.pre_attention_layer_norm.weight', 'encoder.layer.11.attention.key.weight', 
'encoder.layer.16.mlp.wi_1.weight', 'encoder.layer.5.attention.output.weight', 'encoder.layer.1.mlp.wi_1.weight', 'encoder.layer.5.pre_attention_layer_norm.weight', 'encoder.layer.2.mlp.wi_0.weight', 'encoder.layer.3.pre_attention_layer_norm.weight', 'encoder.layer.7.attention.query.weight', 'encoder.layer.16.mlp.wo.weight', 'encoder.layer.11.mlp.wi_1.weight', 'encoder.layer.0.attention.value.weight', 'encoder.layer.9.attention.query.weight', 'encoder.layer.17.mlp.wi_0.weight', 'encoder.layer.13.pre_mlp_layer_norm.weight', 'encoder.layer.1.mlp.wi_0.weight', 'encoder.layer.8.attention.key.weight', 'encoder.layer.14.mlp.wi_0.weight', 'encoder.layer.15.attention.key.weight', 'encoder.layer.7.pre_mlp_layer_norm.weight', 'encoder.layer.7.mlp.wo.weight', 'encoder.layer.17.mlp.wo.weight', 'encoder.layer.1.attention.output.weight', 'encoder.layer.6.attention.query.weight', 'encoder.layer.6.pre_mlp_layer_norm.weight', 'encoder.layer.11.mlp.wi_0.weight', 'encoder.layer.10.pre_attention_layer_norm.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2024-07-20 19:41:35,371] ERROR in app: Exception on /predict [POST]
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 169, in _new_conn
    conn = connection.create_connection(
  File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 73, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib/python3.10/socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 700, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 383, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 1017, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 353, in connect
    conn = self._new_conn()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 181, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7fb9a73ec220>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/maifee/.local/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 756, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fb9a73ec220>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/maifee/.local/lib/python3.10/site-packages/flask/app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/maifee/.local/lib/python3.10/site-packages/flask/app.py", line 882, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/maifee/.local/lib/python3.10/site-packages/flask_cors/extension.py", line 178, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/home/maifee/.local/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/maifee/.local/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/home/maifee/ocr/unilm/kosmos-2.5/app.py", line 33, in predict
    task, models, generator, image_processor, dictionary, tokenizer = init(args)
  File "/home/maifee/ocr/unilm/kosmos-2.5/inference.py", line 154, in init
    tokenizer = tiktoken.get_encoding("cl100k_base")
  File "/home/maifee/.local/lib/python3.10/site-packages/tiktoken/registry.py", line 73, in get_encoding
    enc = Encoding(**constructor())
  File "/home/maifee/.local/lib/python3.10/site-packages/tiktoken_ext/openai_public.py", line 72, in cl100k_base
    mergeable_ranks = load_tiktoken_bpe(
  File "/home/maifee/.local/lib/python3.10/site-packages/tiktoken/load.py", line 147, in load_tiktoken_bpe
    contents = read_file_cached(tiktoken_bpe_file, expected_hash)
  File "/home/maifee/.local/lib/python3.10/site-packages/tiktoken/load.py", line 64, in read_file_cached
    contents = read_file(blobpath)
  File "/home/maifee/.local/lib/python3.10/site-packages/tiktoken/load.py", line 25, in read_file
    resp = requests.get(blobpath)
  File "/home/maifee/.local/lib/python3.10/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/home/maifee/.local/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/maifee/.local/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/maifee/.local/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/maifee/.local/lib/python3.10/site-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fb9a73ec220>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
127.0.0.1 - - [20/Jul/2024 19:41:35] "POST /predict HTTP/1.1" 500 -
Platform: Ubuntu 22.04
Python version: 3.10 (per the paths in the traceback)
PyTorch version (GPU?): latest stable, CUDA 12.1
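For anyone hitting the same traceback, here is a minimal diagnostic sketch. It rests on an assumption about tiktoken's caching that I have not confirmed for every version: that `read_file_cached` looks in the directory named by `TIKTOKEN_CACHE_DIR`, then `DATA_GYM_CACHE_DIR`, then `<tempdir>/data-gym-cache`, for a file whose name is the SHA-1 hex digest of the download URL. The helper name `expected_cache_path` is hypothetical; the URL is the one from the traceback.

```python
import hashlib
import os
import tempfile

# URL taken from the traceback above.
CL100K_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def expected_cache_path(url: str = CL100K_URL) -> str:
    """Return the path where tiktoken is ASSUMED to look for a cached copy of url.

    Assumption (not verified against every tiktoken release): the cache
    directory is TIKTOKEN_CACHE_DIR, else DATA_GYM_CACHE_DIR, else
    <tempdir>/data-gym-cache, and the file name is sha1(url).
    """
    if "TIKTOKEN_CACHE_DIR" in os.environ:
        cache_dir = os.environ["TIKTOKEN_CACHE_DIR"]
    elif "DATA_GYM_CACHE_DIR" in os.environ:
        cache_dir = os.environ["DATA_GYM_CACHE_DIR"]
    else:
        cache_dir = os.path.join(tempfile.gettempdir(), "data-gym-cache")
    cache_key = hashlib.sha1(url.encode()).hexdigest()
    return os.path.join(cache_dir, cache_key)


if __name__ == "__main__":
    path = expected_cache_path()
    status = "present" if os.path.isfile(path) else "missing"
    print(f"{path}: {status}")
```

If the file is reported missing, a possible workaround (again under the assumption above) is to set `TIKTOKEN_CACHE_DIR` to a persistent directory, run the app once while online so the encoding is cached there, and keep the same variable set when running offline.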
kosmos-2.5 | trying to connect to `openaipublic.blob.core.windows.net` #1608