duncantech opened this issue 2 months ago
/assigntome
torch_xla [Success]:

    pip install torch_xla

cpu plugin [Failed]:

    # Build wheel
    pip wheel plugins/cpu -v
    # Or install directly
    pip install plugins/cpu -v
Similar issue was encountered as mentioned in https://github.com/pytorch/xla/issues/7184#issuecomment-2148759661
bazel [Success]:

    brew install bazel

bazel version mismatch [Success]:

    ERROR: The project you're trying to build requires Bazel 6.5.0 (specified in /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/.bazelversion), but it wasn't found in /opt/homebrew/Cellar/bazel/7.1.2/libexec/bin.

Worked around by downloading a Bazel 6.5.0 binary into the directory Homebrew's 7.1.2 wrapper searches:

    cd "/opt/homebrew/Cellar/bazel/7.1.2/libexec/bin" && curl -fLO https://releases.bazel.build/6.5.0/release/bazel-6.5.0-darwin-arm64 && chmod +x bazel-6.5.0-darwin-arm64
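As a side note, a small helper along these lines can catch the mismatch before a build is attempted (these helper names are hypothetical, assuming only that the repo pins its version in `.bazelversion`):

```python
import re
import subprocess
from pathlib import Path


def pinned_bazel_version(repo_root: str) -> str:
    """Read the version pinned in the repo's .bazelversion file, e.g. '6.5.0'."""
    return Path(repo_root, ".bazelversion").read_text().strip()


def parse_bazel_version(version_output: str) -> str:
    """Extract 'X.Y.Z' from `bazel --version` output such as 'bazel 7.1.2'."""
    match = re.search(r"\d+\.\d+\.\d+", version_output)
    return match.group(0) if match else ""


def bazel_matches_pin(repo_root: str) -> bool:
    """Compare the installed bazel against the repo's pinned version."""
    out = subprocess.run(["bazel", "--version"],
                         capture_output=True, text=True).stdout
    return parse_bazel_version(out) == pinned_bazel_version(repo_root)
```

In practice, `bazelisk` handles this automatically by downloading whatever version `.bazelversion` specifies, which avoids patching Homebrew's install directory by hand.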
The following was added to .bazelrc:

    build --cxxopt=-std=gnu++17
    build --host_cxxopt=-std=gnu++17
$ pip install plugins/cpu -v
Using pip 23.3.1 from /Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip (python 3.11)
Processing ./plugins/cpu
Running command pip subprocess to install build dependencies
Collecting setuptools
Using cached setuptools-70.0.0-py3-none-any.whl.metadata (5.9 kB)
Using cached setuptools-70.0.0-py3-none-any.whl (863 kB)
Installing collected packages: setuptools
Successfully installed setuptools-70.0.0
Installing build dependencies ... done
Running command Getting requirements to build wheel
bazel build //plugins/cpu:pjrt_c_api_cpu_plugin.so --symlink_prefix=/Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/bazel- --remote_default_exec_properties=cache-silo-key=dev
INFO: Options provided by the client:
Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'build' from /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/.bazelrc:
Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/.bazelrc:
'build' options: --announce_rc --nocheck_visibility --enable_platform_specific_config --experimental_cc_shared_library --define=no_aws_support=true --define=no_hdfs_support=true --define=no_hdfs_support=true --define=no_kafka_support=true --define=no_ignite_support=true --define=grpc_no_ares=true -c opt --config=short_logs --action_env=CC=gcc --action_env=CXX=g++ --spawn_strategy=standalone --incompatible_strict_action_env --noremote_upload_local_results --java_runtime_version=remotejdk_11 --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1 --define framework_shared_object=false --define tsl_protobuf_header_only=false --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --define=with_xla_support=true --noincompatible_remove_legacy_whole_archive --experimental_link_static_libraries_once=false --incompatible_enforce_config_setting_visibility --cxxopt=-std=gnu++17 --host_cxxopt=-std=gnu++17
INFO: Found applicable config definition build:short_logs in file /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
Loading:
Loading:
Loading: 0 packages loaded
INFO: Build options --cxxopt and --host_cxxopt have changed, discarding analysis cache.
Analyzing: target //plugins/cpu:pjrt_c_api_cpu_plugin.so (0 packages loaded, 0 targets configured)
INFO: Analyzed target //plugins/cpu:pjrt_c_api_cpu_plugin.so (1 packages loaded, 10840 targets configured).
checking cached actions
INFO: Found 1 target...
[1 / 5] [Prepa] BazelWorkspaceStatusAction stable-status.txt
[249 / 1,676] Compiling llvm/lib/Demangle/RustDemangle.cpp [for tool]; 1s local ... (7 actions, 6 running)
[381 / 1,889] Compiling src/google/protobuf/compiler/zip_writer.cc [for tool]; 1s local ... (7 actions, 6 running)
[1,621 / 3,679] Compiling src/google/protobuf/compiler/code_generator.cc [for tool]; 2s local ... (7 actions, 6 running)
[2,685 / 5,997] Compiling src/google/protobuf/compiler/python/helpers.cc [for tool]; 2s local ... (6 actions running)
[2,952 / 6,589] Compiling xla/ef57.cc; 2s local ... (7 actions running)
[2,956 / 6,589] Compiling src/google/protobuf/compiler/python/pyi_generator.cc [for tool]; 3s local ... (5 actions running)
[6,588 / 6,589] Linking plugins/cpu/pjrt_c_api_cpu_plugin.so; 0s local
ERROR: /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/BUILD:17:14: Linking plugins/cpu/pjrt_c_api_cpu_plugin.so failed: (Exit 1): cc_wrapper.sh failed: error executing command (from target //plugins/cpu:pjrt_c_api_cpu_plugin.so) external/local_config_cc/cc_wrapper.sh @bazel-out/darwin_arm64-opt/bin/plugins/cpu/pjrt_c_api_cpu_plugin.so-2.params
ld: unknown options: --version-script --no-undefined
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Target //plugins/cpu:pjrt_c_api_cpu_plugin.so failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 9.527s, Critical Path: 4.55s
INFO: 27 processes: 2 internal, 25 local.
FAILED: Build did NOT complete successfully
Traceback (most recent call last):
File "/Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
return hook(config_settings)
^^^^^^^^^^^^^^^^^^^^^
File "/private/var/folders/wc/rkrv7ck92zd4f_3qgk8q2gn00000gn/T/pip-build-env-3vcjdfr7/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=['wheel'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/var/folders/wc/rkrv7ck92zd4f_3qgk8q2gn00000gn/T/pip-build-env-3vcjdfr7/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
self.run_setup()
File "/private/var/folders/wc/rkrv7ck92zd4f_3qgk8q2gn00000gn/T/pip-build-env-3vcjdfr7/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 311, in run_setup
exec(code, locals())
File "<string>", line 10, in <module>
File "/Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/../../build_util.py", line 67, in bazel_build
subprocess.check_call(bazel_argv, stdout=sys.stdout, stderr=sys.stderr)
File "/Users/tej/anaconda3/envs/PyTorch/lib/python3.11/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['bazel', 'build', '//plugins/cpu:pjrt_c_api_cpu_plugin.so', '--symlink_prefix=/Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/bazel-', '--remote_default_exec_properties=cache-silo-key=dev']' returned non-zero exit status 1.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /Users/tej/anaconda3/envs/PyTorch/bin/python /Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py get_requires_for_build_wheel /var/folders/wc/rkrv7ck92zd4f_3qgk8q2gn00000gn/T/tmpn1xqffzo
cwd: /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
$ python -V
Python 3.11.9
$ pip list
Package Version
------------------------- --------------
accelerate 0.30.0.dev0
aiohttp 3.9.5
aiosignal 1.3.1
anyio 4.3.0
appnope 0.1.4
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asttokens 2.4.1
async-lru 2.0.4
attrs 23.2.0
audioread 3.0.1
Babel 2.14.0
beautifulsoup4 4.12.3
bitsandbytes 0.42.0
bleach 6.1.0
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
comm 0.2.2
contourpy 1.2.1
cycler 0.12.1
datasets 2.19.1
debugpy 1.8.1
decorator 5.1.1
defusedxml 0.7.1
dill 0.3.8
executing 2.0.1
fastjsonschema 2.19.1
filelock 3.13.4
fonttools 4.51.0
fqdn 1.5.1
frozenlist 1.4.1
fsspec 2024.3.1
h11 0.14.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.22.2
idna 3.7
ipykernel 6.29.4
ipython 8.24.0
isoduration 20.11.0
jedi 0.19.1
Jinja2 3.1.3
joblib 1.4.2
json5 0.9.25
jsonpointer 2.4
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
jupyter_client 8.6.1
jupyter_core 5.7.2
jupyter-events 0.10.0
jupyter-lsp 2.2.5
jupyter_server 2.14.0
jupyter_server_terminals 0.5.3
jupyterlab 4.1.8
jupyterlab_pygments 0.3.0
jupyterlab_server 2.27.1
kiwisolver 1.4.5
lazy_loader 0.4
librosa 0.10.2
llvmlite 0.42.0
MarkupSafe 2.1.5
matplotlib 3.8.4
matplotlib-inline 0.1.7
mistune 3.0.2
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
multiprocess 0.70.16
nbclient 0.10.0
nbconvert 7.16.3
nbformat 5.10.4
nest-asyncio 1.6.0
networkx 3.3
notebook 7.1.3
notebook_shim 0.2.4
numba 0.59.1
numpy 1.26.4
overrides 7.7.0
packaging 24.0
pandas 2.2.2
pandocfilters 1.5.1
parso 0.8.4
pexpect 4.9.0
pillow 10.3.0
pip 23.3.1
platformdirs 4.2.1
pooch 1.8.1
prometheus_client 0.20.0
prompt-toolkit 3.0.43
psutil 5.9.8
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 16.0.0
pyarrow-hotfix 0.6
pycparser 2.22
Pygments 2.17.2
pyparsing 3.1.2
python-dateutil 2.9.0.post0
python-json-logger 2.0.7
pytube 15.0.0
pytz 2024.1
PyYAML 6.0.1
pyzmq 26.0.2
referencing 0.35.0
regex 2024.5.10
requests 2.31.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rpds-py 0.18.0
safetensors 0.4.3
scikit-learn 1.4.2
scipy 1.13.0
seaborn 0.13.2
Send2Trash 1.8.3
sentencepiece 0.2.0
setuptools 68.2.2
six 1.16.0
sniffio 1.3.1
soundfile 0.12.1
soupsieve 2.5
soxr 0.3.7
stack-data 0.6.3
sympy 1.12
terminado 0.18.1
threadpoolctl 3.5.0
tinycss2 1.3.0
tokenizers 0.19.1
torch 2.3.0
torch-xla 1.0
torchaudio 2.3.0
torchvision 0.18.0
tornado 6.4
tqdm 4.66.2
traitlets 5.14.3
transformers 4.40.2
types-python-dateutil 2.9.0.20240316
typing_extensions 4.11.0
tzdata 2024.1
uri-template 1.3.0
urllib3 2.2.1
wcwidth 0.2.13
webcolors 1.13
webencodings 0.5.1
websocket-client 1.8.0
wheel 0.41.2
xgboost 2.0.3
xxhash 3.4.1
yarl 1.9.4
$ system_profiler SPSoftwareDataType SPHardwareDataType
Software:
System Software Overview:
System Version: macOS 14.4.1 (23E224)
Kernel Version: Darwin 23.4.0
Boot Volume: Macintosh HD
...
Hardware:
Hardware Overview:
Model Name: MacBook Air
Chip: Apple M1
Total Number of Cores: 8 (4 performance and 4 efficiency)
Memory: 8 GB
...
@duncantech May I know if this is what's expected? Or is there something wrong with what I'm doing?
The real error seems to be:
ERROR: /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/BUILD:17:14: Linking plugins/cpu/pjrt_c_api_cpu_plugin.so failed: (Exit 1): cc_wrapper.sh failed: error executing command (from target //plugins/cpu:pjrt_c_api_cpu_plugin.so) external/local_config_cc/cc_wrapper.sh @bazel-out/darwin_arm64-opt/bin/plugins/cpu/pjrt_c_api_cpu_plugin.so-2.params
ld: unknown options: --version-script --no-undefined
clang: error: linker command failed with exit code 1 (use -v to see invocation)
I asked Bard and it told me:

"Platform incompatibility: These options might be specific to certain platforms or linkers. For example, --no-undefined is generally used with the GNU linker and may not be supported on other linkers like the one Apple uses for macOS. Similarly, --version-script is used to control symbol versions and might not be available on all platforms."

I am guessing the ARM (Apple silicon) build does not work out of the box and requires tweaking the build config.
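If that diagnosis is right, the fix presumably means making the linker options platform-conditional. Purely as an illustration of the flag mismatch (this mapping is my reading of the two linkers' behavior, not anything taken from the torch_xla BUILD files), the choice could look like:

```python
def linkopts_for_platform(platform: str) -> list[str]:
    """Sketch: choose linker flags a shared-library rule could pass via -Wl.

    GNU ld's --no-undefined (fail on unresolved symbols) roughly maps to
    Apple ld64's '-undefined error'; --version-script (limit exported
    symbols) roughly maps to '-exported_symbols_list <file>'.
    """
    if platform.startswith("darwin"):
        # Apple's ld64 rejects the GNU-style double-dash options outright,
        # which matches the "ld: unknown options" failure above.
        return ["-Wl,-undefined,error"]
    return ["-Wl,--no-undefined"]
```

In Bazel terms this would presumably live behind a `select()` on the target platform in the plugin's BUILD file rather than in Python, but the flag substitution would be the same.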
📚 Documentation

Install the CPU PJRT plugin from the instructions here: https://github.com/pytorch/xla/blob/master/plugins/cpu/README.md

Next, try getting a model to run on an ARM CPU; if it works, create a tutorial on how to get it running.