ridgerchu / matmulfreellm

Implementation for MatMul-free LM.
Apache License 2.0

Error on pip install: ModuleNotFoundError: No module named 'torch' #8

Closed josegomezr closed 2 weeks ago

josegomezr commented 2 weeks ago

First of all, great research; I'm still trying to digest the whole contents.

This may or may not be a problem between chair and screen, but I can't get past pip install:

Command:

./venv/bin/pip install --use-pep517 -U git+https://github.com/ridgerchu/matmulfreellm
Collecting git+https://github.com/ridgerchu/matmulfreellm
  Cloning https://github.com/ridgerchu/matmulfreellm to /tmp/pip-req-build-w157o7e_
  Running command git clone --filter=blob:none --quiet https://github.com/ridgerchu/matmulfreellm /tmp/pip-req-build-w157o7e_
  Resolved https://github.com/ridgerchu/matmulfreellm to commit 2dce718755d56d42772c89fa354371ab53d295f3
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [20 lines of output]
      Traceback (most recent call last):
        File "/srv/data/ml-experiments/venv/lib64/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/srv/data/ml-experiments/venv/lib64/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/srv/data/ml-experiments/venv/lib64/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-_eq0n_9w/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-_eq0n_9w/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-_eq0n_9w/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 487, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-_eq0n_9w/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 8, in <module>
      ModuleNotFoundError: No module named 'torch'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Pip freeze (I'm trying to keep track of my dependencies with more than a version specifier):

accelerate @ file:///srv/data/ml-experiments/dependencies/accelerate-0.31.0-py3-none-any.whl#sha256=0fc608dc49584f64d04711a39711d73cb0ad4ef3d21cddee7ef2216e29471144
certifi @ file:///srv/data/ml-experiments/dependencies/certifi-2024.6.2-py3-none-any.whl#sha256=ddc6c8ce995e6987e7faf5e3f1b02b302836a0e5d98ece18392cb1a36c72ad56
charset-normalizer @ file:///srv/data/ml-experiments/dependencies/charset_normalizer-3.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=753f10e867343b4511128c6ed8c82f7bec3bd026875576dfd88483c5c73b2fd8
filelock @ file:///srv/data/ml-experiments/dependencies/filelock-3.14.0-py3-none-any.whl#sha256=43339835842f110ca7ae60f1e1c160714c5a6afd15a2873419ab185334975c0f
fsspec @ file:///srv/data/ml-experiments/dependencies/fsspec-2024.6.0-py3-none-any.whl#sha256=58d7122eb8a1a46f7f13453187bfea4972d66bf01618d37366521b1998034cee
huggingface-hub @ file:///srv/data/ml-experiments/dependencies/huggingface_hub-0.23.3-py3-none-any.whl#sha256=22222c41223f1b7c209ae5511d2d82907325a0e3cdbce5f66949d43c598ff3bc
idna @ file:///srv/data/ml-experiments/dependencies/idna-3.7-py3-none-any.whl#sha256=82fee1fc78add43492d3a1898bfa6d8a904cc97d8427f683ed8e798d07761aa0
Jinja2 @ file:///srv/data/ml-experiments/dependencies/jinja2-3.1.4-py3-none-any.whl#sha256=bc5dd2abb727a5319567b7a813e6a2e7318c39f4f487cfe6c89c6f9c7d25197d
MarkupSafe @ file:///srv/data/ml-experiments/dependencies/MarkupSafe-2.1.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=b91c037585eba9095565a3556f611e3cbfaa42ca1e865f7b8015fe5c7336d5a5
mpmath @ file:///srv/data/ml-experiments/dependencies/mpmath-1.3.0-py3-none-any.whl#sha256=a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c
networkx @ file:///srv/data/ml-experiments/dependencies/networkx-3.3-py3-none-any.whl#sha256=28575580c6ebdaf4505b22c6256a2b9de86b316dc63ba9e93abde3d78dfdbcf2
numpy @ file:///srv/data/ml-experiments/dependencies/numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=666dbfb6ec68962c033a450943ded891bed2d54e6755e35e5835d63f4f6931d5
nvidia-cublas-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl#sha256=ee53ccca76a6fc08fb9701aa95b6ceb242cdaab118c3bb152af4e579af792728
nvidia-cuda-cupti-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl#sha256=e54fde3983165c624cb79254ae9818a456eb6e87a7fd4d56a2352c24ee542d7e
nvidia-cuda-nvrtc-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl#sha256=339b385f50c309763ca65456ec75e17bbefcbbf2893f462cb8b90584cd27a1c2
nvidia-cuda-runtime-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl#sha256=6e258468ddf5796e25f1dc591a31029fa317d97a0a94ed93468fc86301d61e40
nvidia-cudnn-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl#sha256=5ccb288774fdfb07a7e7025ffec286971c06d8d7b4fb162525334616d7629ff9
nvidia-cufft-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl#sha256=794e3948a1aa71fd817c3775866943936774d1c14e7628c74f6f7417224cdf56
nvidia-curand-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl#sha256=9d264c5036dde4e64f1de8c50ae753237c12e0b1348738169cd0f8a536c0e1e0
nvidia-cusolver-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl#sha256=8a7ec542f0412294b15072fa7dab71d31334014a69f953004ea7a118206fe0dd
nvidia-cusparse-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl#sha256=f3b50f42cf363f86ab21f720998517a659a48131e8d538dc02f8768237bd884c
nvidia-nccl-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl#sha256=057f6bf9685f75215d0c53bf3ac4a10b3e6578351de307abad9e18a99182af56
nvidia-nvjitlink-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_nvjitlink_cu12-12.5.40-py3-none-manylinux2014_x86_64.whl#sha256=d9714f27c1d0f0895cd8915c07a87a1d0029a0aa36acaf9156952ec2a8a12189
nvidia-nvtx-cu12 @ file:///srv/data/ml-experiments/dependencies/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl#sha256=dc21cf308ca5691e7c04d962e213f8a4aa9bbfa23d95412f452254c2caeb09e5
packaging @ file:///srv/data/ml-experiments/dependencies/packaging-24.0-py3-none-any.whl#sha256=2ddfb553fdf02fb784c234c7ba6ccc288296ceabec964ad2eae3777778130bc5
psutil @ file:///srv/data/ml-experiments/dependencies/psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=d06016f7f8625a1825ba3732081d77c94589dca78b7a3fc072194851e88461a4
PyYAML @ file:///srv/data/ml-experiments/dependencies/PyYAML-6.0.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=d2b04aac4d386b172d5b9692e2d2da8de7bfb6c387fa4f801fbf6fb2e6ba4673
regex @ file:///srv/data/ml-experiments/dependencies/regex-2024.5.15-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=3e507ff1e74373c4d3038195fdd2af30d297b4f0950eeda6f515ae3d84a1770f
requests @ file:///srv/data/ml-experiments/dependencies/requests-2.32.3-py3-none-any.whl#sha256=70761cfe03c773ceb22aa2f671b4757976145175cdfca038c02654d061d6dcc6
safetensors @ file:///srv/data/ml-experiments/dependencies/safetensors-0.4.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=0bf4f9d6323d9f86eef5567eabd88f070691cf031d4c0df27a40d3b4aaee755b
sympy @ file:///srv/data/ml-experiments/dependencies/sympy-1.12.1-py3-none-any.whl#sha256=9b2cbc7f1a640289430e13d2a56f02f867a1da0190f2f99d8968c2f74da0e515
tokenizers @ file:///srv/data/ml-experiments/dependencies/tokenizers-0.19.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=d16ff18907f4909dca9b076b9c2d899114dd6abceeb074eca0c93e2353f943aa
torch @ file:///srv/data/ml-experiments/dependencies/torch-2.3.1-cp311-cp311-manylinux1_x86_64.whl#sha256=b2ec81b61bb094ea4a9dee1cd3f7b76a44555375719ad29f05c0ca8ef596ad39
tqdm @ file:///srv/data/ml-experiments/dependencies/tqdm-4.66.4-py3-none-any.whl#sha256=b75ca56b413b030bc3f00af51fd2c1a1a5eac6a0c1cca83cbb37a5c52abce644
transformers @ file:///srv/data/ml-experiments/dependencies/transformers-4.41.2-py3-none-any.whl#sha256=05555d20e43f808de1ef211ab64803cdb513170cef70d29a888b589caebefc67
triton @ file:///srv/data/ml-experiments/dependencies/triton-2.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=c9d64ae33bcb3a7a18081e3a746e8cf87ca8623ca13d2c362413ce7a486f893e
typing_extensions @ file:///srv/data/ml-experiments/dependencies/typing_extensions-4.12.2-py3-none-any.whl#sha256=04e5ca0351e0f3f85c6853954072df659d0d13fac324d0072316b67d7794700d
urllib3 @ file:///srv/data/ml-experiments/dependencies/urllib3-2.2.1-py3-none-any.whl#sha256=450b20ec296a467077128bff42b73080516e71b56ff59a60a02bef2232c4fa9d

I have torch available in the virtual env, which is the weirdest part of all:

📅2024-06-10🕙22:53:12 ➜ ./venv/bin/python
Python 3.11.5 (main, Sep 06 2023, 11:21:05) [GCC] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> from torch.utils.cpp_extension import CUDA_HOME
>>> torch.__version__
'2.3.1+cu121'
>>> CUDA_HOME
# empty
>>> torch.__path__
['/srv/data/ml-experiments/venv/lib64/python3.11/site-packages/torch']

This box does not have CUDA (or NVIDIA drivers, for that matter) installed.

josegomezr commented 2 weeks ago

Errata: this is definitely a PICNIC (problem in chair, not in computer). I re-ran the install in a containerized environment that has only python3.11, and it installed fine. There may be a problem when multiple Pythons are available on a system, but I can't be certain.

I'll close the issue myself, sorry for the noise :sweat_smile:
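
One way to test the multiple-interpreter hypothesis is to ask the interpreter itself where it lives and whether it can see torch. The following is a minimal sketch using only the standard library; the function names and the default module list are illustrative assumptions, not part of the repository.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if `name` can be located by this interpreter, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or half-initialised modules; treat as missing.
        return False


def environment_report(names=("torch", "triton")) -> dict:
    """Collect the interpreter path, its prefix, and the availability of each module."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "modules": {name: module_available(name) for name in names},
    }


if __name__ == "__main__":
    report = environment_report()
    print("interpreter:", report["executable"])
    print("prefix:     ", report["prefix"])
    for name, found in report["modules"].items():
        print(f"{name}: {'found' if found else 'NOT found'}")
```

Saved under a hypothetical name such as `check_env.py` and run with `./venv/bin/python check_env.py`, the reported interpreter path can be compared with the path printed by `./venv/bin/pip --version`. If the two point at different Python installations, that would be consistent with the mix-up suspected above.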

cryptogun commented 4 days ago

Faced a similar issue on M1 within a pipenv environment.

Fix:

python -m pip install --upgrade pip
python -m pip install --upgrade setuptools  ## this cmd got processed
python -m pip install "torch>=2.1.0"  # quoted so the shell does not treat > as a redirect
python -m pip install wheel ninja
python -m pip install git+https://github.com/ridgerchu/matmulfreellm

And then success: Successfully installed mmfreelm-0.1

Ref: https://github.com/facebookresearch/xformers/issues/740#issuecomment-1975493110


But unfortunately triton does not support M1 yet: https://github.com/triton-lang/triton/issues/194

python3.12/site-packages/mmfreelm/modules/fused_cross_entropy.py", line 20, in <module>
    @triton.heuristics(
     ^^^^^^^^^^^^^^^^^
AttributeError: module 'triton' has no attribute 'heuristics'
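
The traceback above only says that the installed `triton` module lacks `heuristics`. A small check like the one below can show which `triton` was actually imported and which expected attributes it is missing. This is a sketch under assumptions: the helper name is made up for illustration, and the only attribute checked is the one named in the traceback.

```python
import importlib


def missing_attributes(module_name: str, required: list[str]) -> list[str]:
    """Import `module_name` and return the names in `required` that it does not define."""
    module = importlib.import_module(module_name)
    return [attr for attr in required if not hasattr(module, attr)]


if __name__ == "__main__":
    try:
        triton = importlib.import_module("triton")
    except ImportError as exc:
        print("triton could not be imported:", exc)
    else:
        # Show where the module came from and which version it reports, if any.
        print("triton file:   ", getattr(triton, "__file__", "unknown"))
        print("triton version:", getattr(triton, "__version__", "unknown"))
        # "heuristics" is the attribute the traceback complains about.
        print("missing:", missing_attributes("triton", ["heuristics"]))
```

If `heuristics` shows up as missing, the reported file path and version help tell apart a partial install from a different package that happens to be importable as `triton`.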