Open marksher opened 3 hours ago
You can set `max_new_tokens=4096` before the following function. This is because of a version difference; I will fix it later.
https://github.com/nv-tlabs/LLaMA-Mesh/blob/main/app.py#L142-L149
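For reference, a minimal self-contained sketch of what that fix looks like with the usual transformers streaming pattern (the model id and surrounding names here are assumptions, not necessarily the repo's exact code at those lines):

```python
import torch
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_id = "Zhengyi/LLaMA-Mesh"  # assumption: the checkpoint app.py loads
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
inputs = tokenizer("Create a 3D model of a chair.", return_tensors="pt")

generate_kwargs = dict(
    **inputs,
    streamer=streamer,
    max_new_tokens=4096,  # explicit cap; without it generate() falls back to the default max_length=20
)
# generate() runs in a background thread while the streamer yields text
Thread(target=model.generate, kwargs=generate_kwargs).start()
for text in streamer:
    print(text, end="", flush=True)
```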
Getting somewhere! Thanks! After that I set the environment variable `TOKENIZERS_PARALLELISM=false`, which cleared up another problem. None of them are "errors". Could my environment be the issue? I created a clean virtual environment and then just ran `pip install -r requirements.txt`.
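(Side note for anyone searching later: that variable can also be set from Python, as long as it happens before transformers/tokenizers is first imported — a minimal sketch:)

```python
import os

# Must run before the first `import transformers`, otherwise the
# tokenizers fork/parallelism warning setting may be ignored.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```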
Gradio isn't included in that file. Is there a specific version I should try?
```
/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/helpers.py:987: UserWarning: Unexpected argument. Filling with None.
  warnings.warn("Unexpected argument. Filling with None.")
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
```
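Those warnings come from generate() being called without an explicit attention mask or pad token id. A hedged sketch of the fix transformers is asking for (`tokenizer` and `model` stand in for whatever app.py actually loads; the prompt is illustrative):

```python
prompt = "Create a 3D model of a chair."  # illustrative input
inputs = tokenizer(prompt, return_tensors="pt")

output_ids = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # pass the mask explicitly
    pad_token_id=tokenizer.eos_token_id,      # Llama has no pad token, so reuse EOS
    max_new_tokens=4096,
)
```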
```
(venv) ➜ LLaMA-Mesh git:(main) ✗ python app.py
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████| 4/4 [00:07<00:00, 1.96s/it]
Some parameters are on the meta device because they were offloaded to the disk.
/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/components/chatbot.py:225: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
warnings.warn(
* Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/helpers.py:987: UserWarning: Unexpected argument. Filling with None.
warnings.warn("Unexpected argument. Filling with None.")
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Traceback (most recent call last):
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/queueing.py", line 624, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/blocks.py", line 2015, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/blocks.py", line 1574, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 710, in async_iteration
return await anext(iterator)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 815, in asyncgen_wrapper
response = await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/chat_interface.py", line 678, in _stream_fn
first_response = await async_iteration(generator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 710, in async_iteration
return await anext(iterator)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 704, in __anext__
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 943, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 687, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/app.py", line 159, in chat_llama3_8b
for text in streamer:
^^^^^^^^
File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/transformers/generation/streamers.py", line 223, in __next__
value = self.text_queue.get(timeout=self.timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.12/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/queue.py", line 179, in get
raise Empty
_queue.Empty
```
BTW, this error pops up about 10 seconds after clicking one of the example buttons. I'm down to just trying the 9000 * 9000 one to see if anything will get going before trying the bigger stuff.
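The ~10 second delay is a useful clue: `_queue.Empty` is raised when `TextIteratorStreamer`'s internal queue times out before the background generate() thread has produced any text. If app.py constructs the streamer with `timeout=10.0` (an assumption, but it would match the delay), relaxing the timeout keeps slow, disk-offloaded generation from being cut off:

```python
from transformers import TextIteratorStreamer

# timeout=None makes the iterator block until text arrives instead of
# raising _queue.Empty after a fixed wait; the other kwargs mirror the
# usual streaming-chat setup and may differ from app.py's actual call.
streamer = TextIteratorStreamer(
    tokenizer,  # the tokenizer app.py already loaded
    timeout=None,
    skip_prompt=True,
    skip_special_tokens=True,
)
```

Note this only hides the symptom if the generate() thread actually died; check the console for a separate exception from that thread (as in the full trace below, where it raised a ValueError).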
Hello! I'm getting these errors when trying to generate one of the examples. Any thoughts? This is trying to run on a light MacBook Air M2 with 16 GB of memory, so could that be the issue?
Full trace:

```
(venv) ➜ LLaMA-Mesh git:(main) ✗ python3 app.py
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████| 4/4 [00:07<00:00, 1.88s/it]
Some parameters are on the meta device because they were offloaded to the disk.
/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/components/chatbot.py:225: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
  warnings.warn(
* Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/helpers.py:987: UserWarning: Unexpected argument. Filling with None.
  warnings.warn("Unexpected argument. Filling with None.")
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/transformers/generation/utils.py:1375: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
Exception in thread Thread-9 (generate):
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.12/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
    self.run()
  File "/opt/homebrew/Cellar/python@3.12/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/threading.py", line 1012, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/transformers/generation/utils.py", line 2068, in generate
    self._validate_generated_length(generation_config, input_ids_length, has_default_max_length)
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/transformers/generation/utils.py", line 1383, in _validate_generated_length
    raise ValueError(
ValueError: Input length of input_ids is 21, but `max_length` is set to 20. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`.
Traceback (most recent call last):
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/queueing.py", line 624, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/blocks.py", line 2015, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/blocks.py", line 1574, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 710, in async_iteration
    return await anext(iterator)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 815, in asyncgen_wrapper
    response = await iterator.__anext__()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/chat_interface.py", line 678, in _stream_fn
    first_response = await async_iteration(generator)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 710, in async_iteration
    return await anext(iterator)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 704, in __anext__
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/gradio/utils.py", line 687, in run_sync_iterator_async
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/app.py", line 158, in chat_llama3_8b
    for text in streamer:
                ^^^^^^^^
  File "/Users/marksher/working/LLaMA-Mesh/venv/lib/python3.12/site-packages/transformers/generation/streamers.py", line 223, in __next__
    value = self.text_queue.get(timeout=self.timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/queue.py", line 179, in get
    raise Empty
_queue.Empty
```
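On the 16 GB question: the line "Some parameters are on the meta device because they were offloaded to the disk" means the 8B model does not fit in RAM, so weights are paged in from disk on every forward pass, which alone can make the first token take far longer than any streamer timeout. A hedged sketch of a lighter-weight load (the model id and placement strategy are assumptions to check against app.py):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Zhengyi/LLaMA-Mesh",       # assumption: the checkpoint app.py loads
    torch_dtype=torch.float16,  # ~16 GB of weights for an 8B model, vs ~32 GB in fp32
    device_map="auto",          # let accelerate place layers on mps/cpu before spilling to disk
)
```

Even in fp16 an 8B model is tight on a 16 GB machine, so some offloading (and slow generation) may remain.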