oracle / graalpython

A Python 3 implementation built on GraalVM

Support Hugging Face Transformers #276

Closed conker84 closed 1 month ago

conker84 commented 2 years ago

Is there any plan to support Hugging Face Transformers? They are the next (maybe current) big thing in Python. I'm trying to make it work right now, but I'm getting various errors. One of them is:

com.oracle.truffle.api.CompilerDirectives$ShouldNotReachHere: writeByte not implemented

Traceback (most recent call last):
  File "pip", line 8, in <module 'pip'>
  File "main.py", line 70, in main
  File "base_command.py", line 101, in main
  File "base_command.py", line 221, in _main
  File "base_command.py", line 167, in exc_logging_wrapper
  File "req_command.py", line 247, in wrapper
  File "install.py", line 369, in run
  File "resolver.py", line 92, in resolve
  File "resolvers.py", line 481, in resolve
  File "resolvers.py", line 348, in resolve
  File "resolvers.py", line 172, in _add_to_criteria
  File "structs.py", line 151, in __bool__
  "functools", line 590, in wrapper
  File "found_candidates.py", line 155, in __bool__
  File "found_candidates.py", line 143, in <genexpr>
  File "found_candidates.py", line 44, in _iter_built
  File "factory.py", line 279, in iter_index_candidate_infos
  "functools", line 566, in wrapper
  File "package_finder.py", line 889, in find_best_candidate
  "functools", line 566, in wrapper
  File "package_finder.py", line 830, in find_all_candidates
  File "sources.py", line 134, in page_candidates
  File "package_finder.py", line 790, in process_project_url
  File "collector.py", line 577, in fetch_response
  File "collector.py", line 481, in _get_index_content
  File "collector.py", line 138, in _get_simple_response
  File "sessions.py", line 600, in get
  File "session.py", line 518, in request
  File "sessions.py", line 587, in request
  File "sessions.py", line 745, in send
  File "models.py", line 899, in content
  File "models.py", line 816, in generate
  File "response.py", line 573, in stream
  File "response.py", line 516, in read
  File "filewrapper.py", line 96, in read
  File "filewrapper.py", line 76, in _close
  File "controller.py", line 353, in cache_response
  File "controller.py", line 274, in _cache_set
  File "serialize.py", line 70, in dumps
  File "__init__.py", line 38, in packb
  File "fallback.py", line 883, in pack
  File "fallback.py", line 862, in _pack
  File "fallback.py", line 968, in _pack_map_pairs
  File "fallback.py", line 862, in _pack
  File "fallback.py", line 968, in _pack_map_pairs
com.oracle.truffle.api.CompilerDirectives$ShouldNotReachHere: writeByte not implemented

msimacek commented 2 years ago

Could you please post the package name and what command fails for you?

conker84 commented 2 years ago

Hi, sure. I ran this from a graalpython venv: venv/bin/pip install transformers

msimacek commented 2 years ago

I tried it and got a different error: it couldn't build the tokenizers dependency because it uses Rust. We currently don't support Rust packages because Sulong, which we use to execute native dependencies, doesn't support passing Rust objects to native libraries (which is necessary to support the Rust standard library). The issue is being worked on by the Sulong team.

conker84 commented 2 years ago

Yes, I got the same error as you at first. Then the CLI suggested upgrading pip, so I ran venv/bin/pip install --upgrade pip

After that I got the error I posted, so I assumed that, with pip upgraded, it was the real one. OK, thanks. Do you know where I can check the progress on this issue? A GitHub issue maybe?

msimacek commented 2 years ago

There's no GitHub issue, but I can keep this one updated. There are more things that will need to happen for this package to work, though. For example, we currently don't support the recent numpy versions that the package requires. That's also being worked on.

conker84 commented 2 years ago

@msimacek any update on this?

msimacek commented 2 years ago

The numpy update is almost finished. I have no update from the Sulong team on the Rust issue.

conker84 commented 1 year ago

Do you know if it is in their (Sulong team) interest to fix it?

msimacek commented 1 year ago

Yes, it is. And they are working on it. It's just a very complex change.

conker84 commented 1 year ago

thank you so much for the info @msimacek

conker84 commented 1 year ago

Hi @msimacek do you have any update on this?

msimacek commented 1 year ago

No, still in progress

conker84 commented 1 year ago

@msimacek hi! Any progress on this?

msimacek commented 1 year ago

Hi, we now support newer numpy, which was one of the items blocking this. The Rust issue is still unsolved.

conker84 commented 1 year ago

@msimacek Hi! Do you know the status of the Rust issue? Thank you so much!

msimacek commented 1 year ago

We have introduced a new C API backend that uses native execution instead of LLVM on Sulong. That should completely circumvent the issue we were having with passing values to Rust. However, the PyO3 framework that tokenizers uses currently doesn't work with GraalPy. For a start, it literally hardcodes that the interpreter must be named CPython or PyPy and fails to start if it's anything else. Surely there will be other issues. @timfel is currently working on a patch to make PyO3 work on GraalPy. FYI, I'm currently working on making PyTorch work, which is another prerequisite.
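To illustrate the kind of hard-coded interpreter check described above, here is a minimal Python sketch. It is not PyO3's actual code (PyO3 is written in Rust, and its check is not shown in this thread); the `SUPPORTED` set and the `check_interpreter` helper are hypothetical names used only for illustration. `sys.implementation.name` is the standard attribute that reports which interpreter is running.

```python
import sys

# Hypothetical allowlist mirroring the behaviour described above;
# this is an illustration, not PyO3's real implementation.
SUPPORTED = {"cpython", "pypy"}


def check_interpreter(name, supported=SUPPORTED):
    """Return the normalised interpreter name if allowed, else raise."""
    normalised = name.lower()
    if normalised not in supported:
        raise RuntimeError(f"unsupported Python interpreter: {name}")
    return normalised


if __name__ == "__main__":
    try:
        print("ok:", check_interpreter(sys.implementation.name))
    except RuntimeError as exc:
        print("rejected:", exc)
```

With an allowlist like this, any interpreter that reports a different name is rejected before anything else is tried, which is why the list itself has to be extended upstream.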

timfel commented 1 year ago

For reference, the PyO3 issue is here: https://github.com/PyO3/pyo3/issues/3052. We also need to update https://github.com/PyO3/maturin, since it has similarly hardcoded PyPy and CPython. I am in the process of fixing those issues, but it's turning into a larger pull request than I anticipated.

timfel commented 1 year ago

The changes to maturin were merged. The changes to PyO3 are being discussed on a PR.

timfel commented 5 months ago

With the most recent nightly builds I can use things like GPT-2 or Stable Diffusion from the Hugging Face Hub just fine, through transformers, safetensors, diffusers, and torch.

ArneDeutsch commented 4 months ago

The latest release on Maven Central is 24.0.1. Is it included already? If not, where can I find the nightly builds?

ArneDeutsch commented 4 months ago

I tried to get it running with 24.0.1, but it seems numpy is not working in the current version:

Traceback (most recent call last):
  File "/home/arne/.local/lib/python3.10/site-packages/numpy/core/__init__.py", line 24, in <module>
    from . import multiarray
  File "/home/arne/.local/lib/python3.10/site-packages/numpy/core/multiarray.py", line 10, in <module>
    from . import overrides
  File "/home/arne/.local/lib/python3.10/site-packages/numpy/core/overrides.py", line 8, in <module>
    from numpy.core._multiarray_umath import (
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/graalpy_vfs/proj/hello.py", line 2, in <module>
    from transformers import AutoModelForCausalLM
  File "/home/arne/.local/lib/python3.10/site-packages/transformers/__init__.py", line 26, in <module>
    from . import dependency_versions_check
  File "/home/arne/.local/lib/python3.10/site-packages/transformers/dependency_versions_check.py", line 16, in <module>
    from .utils.versions import require_version, require_version_core
  File "/home/arne/.local/lib/python3.10/site-packages/transformers/utils/__init__.py", line 33, in <module>
    from .generic import (
  File "/home/arne/.local/lib/python3.10/site-packages/transformers/utils/generic.py", line 28, in <module>
    import numpy as np
  File "/home/arne/.local/lib/python3.10/site-packages/numpy/__init__.py", line 135, in <module>
    raise ImportError(msg) from e
ImportError: Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch
        your python interpreter from there.

timfel commented 4 months ago

> Latest release on Maven Central is 24.0.1. Is it included already? If not ... where to find the nightly builds?

It's not in the 24 release. You would have to use the dev builds from https://github.com/graalvm/graalvm-ce-dev-builds/releases

timfel commented 4 months ago

> Tried to get it running with 24.0.1 ... but seems numpy is not working in current version ...

You seem to be importing from /home/arne/.local/lib/python3.10/site-packages/, so this may just be a conflict with packages installed for CPython 3.10. Please try to install it in a venv (https://www.graalvm.org/latest/reference-manual/python/Python-Runtime/#installing-packages).
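One way to check for the conflict suspected above is to look at where a module was actually loaded from. The following is a rough sketch using only the standard library; `in_virtualenv` and `loaded_from_prefix` are hypothetical helper names, and the paths in the example are the ones from the traceback above.

```python
import sys
from pathlib import Path


def in_virtualenv():
    """True when running inside a venv (prefix differs from base prefix)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def loaded_from_prefix(module_file, prefix):
    """True when module_file lives somewhere under the directory prefix."""
    try:
        Path(module_file).resolve().relative_to(Path(prefix).resolve())
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Path taken from the traceback in this thread.
    numpy_file = "/home/arne/.local/lib/python3.10/site-packages/numpy/__init__.py"
    print("interpreter:", sys.implementation.name)
    print("in venv:", in_virtualenv())
    print("numpy under sys.prefix:", loaded_from_prefix(numpy_file, sys.prefix))
```

If a package's `__file__` is not under `sys.prefix` of the environment you meant to use, it was probably installed for a different interpreter.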

msimacek commented 1 month ago

Many examples with transformers now work on GraalPy.

Unfortunately, all the dependencies have to be built from source during installation, which is slow and requires you to have the right compiler and libraries on the system. Hopefully, in the future we will have prebuilt binary wheels for them.
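Since source builds fail late when a tool is missing, a quick pre-flight check can save time. The sketch below is only an illustration: the thread does not say which tools are required, so the names in `BUILD_TOOLS` are assumptions, and `missing_tools` is a hypothetical helper. `shutil.which` is the standard library call for locating an executable on PATH.

```python
import shutil

# Assumed tool names, for illustration only; the actual requirements
# depend on the packages being built and are not listed in this thread.
BUILD_TOOLS = ("cc", "cmake", "cargo")


def missing_tools(tools=BUILD_TOOLS, which=shutil.which):
    """Return the entries of tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing build tools:", ", ".join(missing))
    else:
        print("all assumed build tools found")
```

This only checks for executables; missing development libraries would still surface during the build itself.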