NousResearch / Hermes-Function-Calling


Install fails on both macOS and Windows #1

Closed: small-cactus closed this issue 3 months ago

small-cactus commented 6 months ago

The install fails with ModuleNotFoundError: No module named 'packaging' on both macOS and Windows.

Also on macOS, after manually fixing the errors and setting the attention implementation to eager, it gives this error (after hanging; the traceback below only appears once I send a keyboard interrupt, otherwise it is stuck):

(venv) ➜  Hermes-Function-Calling git:(main) ✗ python3 functioncall.py --query "I need the current stock price of Tesla (TSLA)"
                                                                                                         dP       
                                                                                                         88       
      88d888b. .d8888b. dP    dP .d8888b. 88d888b. .d8888b. .d8888b. .d8888b. .d8888b. 88d888b. .d8888b. 88d888b. 
      88'  `88 88'  `88 88    88 Y8ooooo. 88'  `88 88ooood8 Y8ooooo. 88ooood8 88'  `88 88'  `88 88'  `"" 88'  `88 
      88    88 88.  .88 88.  .88       88 88       88.  ...       88 88.  ... 88.  .88 88       88.  ... 88    88 
      dP    dP `88888P' `88888P' `88888P' dP       `88888P' `88888P' `88888P' `88888P8 dP       `88888P' dP    dP 

2024-03-13:22:34:54,463 INFO     [functioncall.py:25] None
/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/accelerate/utils/modeling.py:1341: UserWarning: Current model requires 50332032 bytes of buffer for offloaded layers, which seems does not fit any GPU's remaining memory. If you are experiencing a OOM later, please consider using offload_buffers=True.
  warnings.warn(
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:16<00:00,  4.11s/it]
2024-03-13:22:35:13,782 WARNING  [big_modeling.py:433] Some parameters are on the meta device device because they were offloaded to the disk.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-03-13:22:35:13,992 INFO     [functioncall.py:53] MistralConfig {
  "_name_or_path": "NousResearch/Hermes-2-Pro-Mistral-7B",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 32000,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 10000.0,
  "sliding_window": 4096,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.38.2",
  "use_cache": false,
  "vocab_size": 32032
}

2024-03-13:22:35:13,992 INFO     [functioncall.py:54] GenerationConfig {
  "bos_token_id": 1,
  "do_sample": true,
  "eos_token_id": 32000
}

2024-03-13:22:35:13,993 INFO     [functioncall.py:55] {'bos_token': '<s>', 'eos_token': '<|im_end|>', 'unk_token': '<unk>', 'pad_token': '<|im_end|>'}
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
^CTraceback (most recent call last):
  File "/Users/anthonyh/Hermes-Function-Calling/functioncall.py", line 181, in <module>
    inference.generate_function_call(args.query, args.chat_template, args.num_fewshot, args.max_depth)
  File "/Users/anthonyh/Hermes-Function-Calling/functioncall.py", line 109, in generate_function_call
    completion = self.run_inference(prompt)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/functioncall.py", line 91, in run_inference
    tokens = self.model.generate(
             ^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/transformers/generation/utils.py", line 1592, in generate
    return self.sample(
           ^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/transformers/generation/utils.py", line 2696, in sample
    outputs = self(
              ^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/transformers/models/mistral/modeling_mistral.py", line 1157, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/transformers/models/mistral/modeling_mistral.py", line 1042, in forward
    layer_outputs = decoder_layer(
                    ^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/transformers/models/mistral/modeling_mistral.py", line 757, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
                                                          ^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthonyh/Hermes-Function-Calling/venv/lib/python3.12/site-packages/transformers/models/mistral/modeling_mistral.py", line 285, in forward
    attn_weights = torch.matmul(query_states, key_states.transpose(2, 3)) / math.sqrt(self.head_dim)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
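
The warnings earlier in that log point at the cause of the hang: the 7B fp16 checkpoint does not fit in memory, so accelerate offloads layers to disk ("Some parameters are on the meta device..."), and generation from disk offload is extremely slow. A minimal sketch of the mitigation the UserWarning itself suggests, assuming your transformers version forwards offload_buffers to accelerate (older releases may need accelerate's dispatch_model directly):

    # Hedged sketch, not the repo's code: apply the UserWarning's suggestion.
    # offload_folder / offload_buffers are assumptions about this setup.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "NousResearch/Hermes-2-Pro-Mistral-7B",  # model id from the log above
        torch_dtype=torch.float16,
        device_map="auto",          # lets accelerate spill layers to CPU/disk
        offload_folder="offload",   # scratch directory for offloaded weights
        offload_buffers=True,       # what the UserWarning recommends
    )

Even then, every forward pass reads weights back from disk, so a quantized model that fits in memory is the more practical option on a Mac.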
girmay commented 6 months ago

Same issue here. I have tried everything, including trying to get ahead of it the moment the temporary build environment is created by dropping a packaging module into its site-packages (/pip-build-env-dlfaw197/overlay/lib/python3.9/site-packages/).

  File "setuptools/build_meta.py", line 311, in run_setup
    exec(code, locals())
  File "<string>", line 9, in <module>
ModuleNotFoundError: No module named 'packaging'
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
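
That ModuleNotFoundError comes from the isolated build environment pip creates for flash-attn, which does not contain packaging even when it is installed in your venv (dropping files into the temporary overlay cannot work, because a fresh one is created per build). The usual workaround, echoed later in this thread, is to pre-install the build dependencies and disable build isolation:

    pip install packaging ninja wheel
    pip install flash-attn --no-build-isolation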

suparious commented 6 months ago

ModuleNotFoundError: No module named 'packaging'

I was able to solve both the OP's issue and this one by using the latest Python 3.11 (the OP is using 3.12) and performing the following steps before using the requirements.txt:

  1. Update requirements.txt to use transformers 4.38.2, as the referenced commit was merged to the master branch 3 weeks ago.

  2. Prepare a Python environment. I like pyenv, but you can also use conda.

    python -m venv ~/venv-srt-function-calling
    source ~/venv-srt-function-calling/bin/activate
  3. Manually prepare your Python environment before installing the requirements. Do these separately, to avoid dependency errors:

    pip install packaging
    pip install wheel
    pip install torch
  4. Finally, follow the README instructions and install the requirements:

    pip install --upgrade -r requirements.txt

    I like to use the --upgrade parameter to force pip to do whatever it needs to meet the versions specified in the requirements.txt.

I hope this helps. Happy function calling!

PS: On Debian Linux, I also had to install some system-level Python packages, like python3-wheels, to get the wheels issue resolved. I think that by doing this the system added some missing dependencies that were blocking the pyenv Python from compiling wheels.
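
On Debian/Ubuntu, that system-level step would look something like the following; the exact package names are an assumption based on the comment above (python3-dev is what is typically needed for compiling wheels):

    sudo apt-get install python3-dev python3-wheel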

small-cactus commented 6 months ago

Yeah, I did all that and it still says packaging isn't installed. And when it does say it's installed, it says it can't install flash-attn because I don't have an Nvidia graphics card, which is true because I'm on a Mac, but it somehow worked fine yesterday. I truly hate Python package management.

(myenv) ➜  Hermes-Function-Calling git:(main) ✗ pip install flash-attn --use-pep517                       
Collecting flash-attn
  Downloading flash_attn-2.5.6.tar.gz (2.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 11.8 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [20 lines of output]
      Traceback (most recent call last):
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/tg/n2tmgtnn28s3nd65y2bw43140000gn/T/pip-build-env-ql4bpoap/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/tg/n2tmgtnn28s3nd65y2bw43140000gn/T/pip-build-env-ql4bpoap/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/private/var/folders/tg/n2tmgtnn28s3nd65y2bw43140000gn/T/pip-build-env-ql4bpoap/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 487, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/private/var/folders/tg/n2tmgtnn28s3nd65y2bw43140000gn/T/pip-build-env-ql4bpoap/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 9, in <module>
      ModuleNotFoundError: No module named 'packaging'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
(myenv) ➜  Hermes-Function-Calling git:(main) ✗ which python
which pip

/Users/anthonyh/Hermes-Function-Calling/myenv/bin/python
/Users/anthonyh/Hermes-Function-Calling/myenv/bin/pip
(myenv) ➜  Hermes-Function-Calling git:(main) ✗ pip install flash-attn
Collecting flash-attn
  Using cached flash_attn-2.5.6.tar.gz (2.5 MB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: torch in ./myenv/lib/python3.11/site-packages (from flash-attn) (2.1.2)
Collecting einops (from flash-attn)
  Downloading einops-0.7.0-py3-none-any.whl.metadata (13 kB)
Requirement already satisfied: packaging in ./myenv/lib/python3.11/site-packages (from flash-attn) (24.0)
Requirement already satisfied: ninja in ./myenv/lib/python3.11/site-packages (from flash-attn) (1.11.1.1)
Requirement already satisfied: filelock in ./myenv/lib/python3.11/site-packages (from torch->flash-attn) (3.13.1)
Requirement already satisfied: typing-extensions in ./myenv/lib/python3.11/site-packages (from torch->flash-attn) (4.10.0)
Requirement already satisfied: sympy in ./myenv/lib/python3.11/site-packages (from torch->flash-attn) (1.12)
Requirement already satisfied: networkx in ./myenv/lib/python3.11/site-packages (from torch->flash-attn) (3.2.1)
Requirement already satisfied: jinja2 in ./myenv/lib/python3.11/site-packages (from torch->flash-attn) (3.1.3)
Requirement already satisfied: fsspec in ./myenv/lib/python3.11/site-packages (from torch->flash-attn) (2024.2.0)
Requirement already satisfied: MarkupSafe>=2.0 in ./myenv/lib/python3.11/site-packages (from jinja2->torch->flash-attn) (2.1.5)
Requirement already satisfied: mpmath>=0.19 in ./myenv/lib/python3.11/site-packages (from sympy->torch->flash-attn) (1.3.0)
Downloading einops-0.7.0-py3-none-any.whl (44 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.6/44.6 kB 598.7 kB/s eta 0:00:00
Building wheels for collected packages: flash-attn
  Building wheel for flash-attn (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [51 lines of output]
      fatal: not a git repository (or any of the parent directories): .git

      torch.__version__  = 2.1.2

      /private/var/folders/tg/n2tmgtnn28s3nd65y2bw43140000gn/T/pip-install-josmvt8l/flash-attn_6dd5faeca623406a918f4e694fbacaa7/setup.py:78: UserWarning: flash_attn was requested, but nvcc was not found.  Are you sure your environment has nvcc available?  If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
        warnings.warn(
      /Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/setuptools/__init__.py:80: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
      !!

              ********************************************************************************
              Requirements should be satisfied by a PEP 517 installer.
              If you are using pip, you can try `pip install --use-pep517`.
              ********************************************************************************

      !!
        dist.fetch_build_eggs(dist.setup_requires)
      running bdist_wheel
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/private/var/folders/tg/n2tmgtnn28s3nd65y2bw43140000gn/T/pip-install-josmvt8l/flash-attn_6dd5faeca623406a918f4e694fbacaa7/setup.py", line 305, in <module>
          setup(
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/setuptools/__init__.py", line 103, in setup
          return distutils.core.setup(**attrs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
                 ^^^^^^^^^^^^^^^^^^
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/setuptools/dist.py", line 963, in run_command
          super().run_command(command)
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/private/var/folders/tg/n2tmgtnn28s3nd65y2bw43140000gn/T/pip-install-josmvt8l/flash-attn_6dd5faeca623406a918f4e694fbacaa7/setup.py", line 262, in run
          wheel_url, wheel_filename = get_wheel_url()
                                      ^^^^^^^^^^^^^^^
        File "/private/var/folders/tg/n2tmgtnn28s3nd65y2bw43140000gn/T/pip-install-josmvt8l/flash-attn_6dd5faeca623406a918f4e694fbacaa7/setup.py", line 231, in get_wheel_url
          torch_cuda_version = parse(torch.version.cuda)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/packaging/version.py", line 54, in parse
          return Version(version)
                 ^^^^^^^^^^^^^^^^
        File "/Users/anthonyh/Hermes-Function-Calling/myenv/lib/python3.11/site-packages/packaging/version.py", line 198, in __init__
          match = self._regex.search(version)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      TypeError: expected string or bytes-like object, got 'NoneType'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for flash-attn
  Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects

Everything is installed and in the correct place. I have reinstalled, force reinstalled, purged the cache, upgraded, rolled back, and installed every single requirement for flash-attn, and it still doesn't install correctly.
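
The TypeError at the bottom of that traceback is the root cause: flash-attn's setup.py calls parse(torch.version.cuda), and on a CPU- or Metal-only torch build (as on any Mac) torch.version.cuda is None, so the build can never succeed there. You can confirm with:

    python -c "import torch; print(torch.version.cuda)"

If that prints None, skip flash-attn entirely and fall back to a different attention implementation, as suggested in the next comments.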

lolxdmainkaisemaanlu commented 6 months ago

Same issue here; flash attention is causing problems.

interstellarninja commented 6 months ago

Hi, could you please try installing flash attention with ninja: MAX_JOBS=4 pip install flash-attn --no-build-isolation

If flash attention is giving you issues, you may choose to comment out the attention implementation setting attn_implementation="flash_attention_2": https://github.com/NousResearch/Hermes-Function-Calling/blob/b4f757e27d87f4ab408f706f482c25a8e1508d59/functioncall.py#L41
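
For reference, a minimal loading sketch without flash attention (not the repo's exact code; the model id and dtype are taken from the logs above):

    # Minimal sketch: load Hermes 2 Pro without flash-attn on machines that
    # lack CUDA (e.g. Apple Silicon). "eager" is the portable fallback.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "NousResearch/Hermes-2-Pro-Mistral-7B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        device_map="auto",
        attn_implementation="eager",  # instead of "flash_attention_2"
    )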

HRuii1 commented 6 months ago

Hi, same issue here. I can't install flash attention even though I tried all the means above, and after commenting out that line it stops generating after this:

2024-03-24:16:25:45,748 INFO     [functioncall.py:55] {'bos_token': '<s>', 'eos_token': '<|im_end|>', 'unk_token': '<unk>', 'pad_token': '<|im_end|>'}
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
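
Those two warnings mean the inputs reached generate() without an attention mask or a pad token id. A hedged sketch of passing both explicitly (model, tokenizer, and prompt are hypothetical names assumed to be set up as in the loading sketch above; this silences the warnings but will not speed up a model that is offloaded to disk):

    # `model`, `tokenizer`, `prompt` come from earlier setup (see above).
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    tokens = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],  # silences the first warning
        pad_token_id=tokenizer.eos_token_id,      # silences the second warning
        max_new_tokens=512,
    )
    print(tokenizer.decode(tokens[0]))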
suparious commented 6 months ago

In that first output, during the install of flash-attn, it shows that the CUDA compiler was not found: "flash_attn was requested, but nvcc was not found."

Can you run nvcc --version and ensure it is in your PATH and working?

small-cactus commented 6 months ago

I have an M3 MacBook. I also tried it on Windows, and it still didn't work, but with no errors.

suparious commented 6 months ago

m3 MacBook

Well then you may want to try the notebook at https://github.com/NousResearch/Hermes-Function-Calling/blob/main/examples/instructor_ollama.ipynb and use ollama with an 8_0 GGUF quant of Nous Hermes 2 Pro (@adrienbrault made one that implements these function calls) on your MacBook, instead of this method that requires CUDA. CUDA is not properly supported in Windows Python, Windows Docker, or Windows WSL, and the only way to get CUDA-style acceleration working on a MacBook is to mess around with custom-compiled torch builds. That notebook might be a better solution for your requirements.
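
If you take that route, the usage is roughly as follows; the exact model tag is an assumption, so check the ollama library for @adrienbrault's published quants:

    ollama pull adrienbrault/nous-hermes2pro:Q8_0
    ollama run adrienbrault/nous-hermes2pro:Q8_0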

suparious commented 6 months ago

ModuleNotFoundError: No module named 'packaging'

Because it is having to build a package from source to satisfy your hardware architecture, you need to begin with something like pip install setuptools wheel packaging, and then compile and install a torch version 2.1.2 that runs on your hardware. Using Nvidia on Linux (not Windows WSL2), the default torch binary can be used with pip install torch==2.1.2, and then the rest of the requirements.txt should install as expected.
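
Put together, the suggested order of operations is (versions as given above; adjust the torch install for your hardware):

    pip install setuptools wheel packaging
    pip install torch==2.1.2
    pip install --upgrade -r requirements.txt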