tensorflow / text

Making text a first-class citizen in TensorFlow.
https://www.tensorflow.org/beta/tutorials/tensorflow_text/intro
Apache License 2.0

Errors building tensorflow-text on apple silicon even when using matching versions of tensorflow 2.10 #1077

Open tsdeng opened 1 year ago

tsdeng commented 1 year ago

I'm trying to build tensorflow-text on an M1 Mac because tensorflow-text is not released for Apple silicon.

When compiling with ./oss_scripts/run_build.sh, I see the following errors:

In file included from tensorflow_text/core/kernels/byte_splitter_tflite.cc:18:
external/org_tensorflow/tensorflow/lite/kernels/shim/tflite_op_shim.h:128:35: error: no member named 'OpName' in 'tensorflow::text::ByteSplitterWithOffsetsOp<tflite::shim::Runtime::kTfLite>'
    resolver->AddCustom(ImplType::OpName(), GetTfLiteRegistration());
                        ~~~~~~~~~~^
tensorflow_text/core/kernels/byte_splitter_tflite.cc:28:53: note: in instantiation of member function 'tflite::shim::TfLiteOpKernel<tensorflow::text::ByteSplitterWithOffsetsOp>::Add' requested here
      tensorflow::text::ByteSplitterWithOffsetsOp>::Add(resolver);

I would also love to know if anyone is able to get tensorflow-text 2.10 to work on apple silicon.

Setup and reproduce

Tensorflow is installed via conda

conda install -c apple tensorflow-deps=2.10.0
python -m pip install tensorflow-macos==2.10.0
python -m pip install tensorflow-metal==0.6

Bazel is installed from Homebrew

brew install bazelisk

Tensorflow-text is downloaded from the release page

wget https://github.com/tensorflow/text/archive/refs/tags/v2.10.0.zip

broken commented 1 year ago

You are not compiling against TF 2.10.0 despite having it installed. That line in tflite_op_shim.h is not a function in version 2.10 and 2.11. So you must be compiling against nightly. https://github.com/tensorflow/tensorflow/blob/r2.10/tensorflow/lite/kernels/shim/tflite_op_shim.h#L128

In run_build.sh, it is not running the prepare_tf_dep.sh script if it is running on Apple silicon. That script sets up the TF dependencies to be the exact commit of the TF you are running.

Can you try running ./oss_scripts/prepare_tf_dep.sh manually? I wonder if it works and the reason it doesn't run is just because we couldn't confirm it works, or if there is a fundamental problem with how Apple sets up the library that we are not able to grab the correct commit from that script. You may want to add set -x to the top of the file to see the lines as they run if it fails.

tsdeng commented 1 year ago

@broken thanks for the quick reply. I'm now encountering a checksum issue, which is probably caused by https://github.blog/changelog/2023-01-30-git-archive-checksums-may-change/

I will update once the checksum issue goes away.

tsdeng commented 1 year ago

Now I'm seeing different errors:

bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/framework/full_type.pb.h:17:2: error: This file was generated by an older version of protoc which is
#error This file was generated by an older version of protoc which is
 ^
bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/framework/full_type.pb.h:18:2: error: incompatible with your Protocol Buffer headers. Please
#error incompatible with your Protocol Buffer headers. Please
 ^
bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/framework/full_type.pb.h:19:2: error: regenerate this file with a newer version of protoc.
#error regenerate this file with a newer version of protoc.

tsdeng commented 1 year ago

The protobuf version is 3.19.6 which is pulled by tensorflow-macos 2.10.0

(venv) ➜  text git:(4a098cd) ✗ conda list | grep proto
protobuf                  3.19.6                   pypi_0    pypi

I don't have protoc installed.
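A quick way to cross-check what `conda list` reports is to ask Python's package metadata directly. The following is only a minimal sketch: the helper name `installed_version` is made up for illustration, and it reports the Python packages in the active environment, not the C++ protobuf headers that Bazel compiles against.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from this thread; any of them may be missing locally.
    for name in ("protobuf", "tensorflow-macos", "tensorflow-metal"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run inside the activated environment so that the interpreter sees the same site-packages as the build.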

ethiel commented 1 year ago

Same issue here, in this case with tensorflow-macos 2.11.0 and protobuf 3.19.4

tsdeng commented 1 year ago

@ethiel how did you get a higher version of tensorflow-macos and a lower version of protobuf? Is the protobuf dependency pulled by tensorflow-macos? Or did you install it separately?

ethiel commented 1 year ago

It was pulled from tensorflow. I did not install any library.

ethiel commented 1 year ago

However, after manually executing prepare_tf_dep.sh I'm facing a different issue: error: no member named 'OpName' in 'tensorflow::text::FastBertNormalizeOp'

ethiel commented 1 year ago

I'm afraid we will have to wait for Apple to provide the port for tensorflow-text... I guess there is no way to compile from source, as tensorflow-macos was built with different versions, a different architecture... Anyway, @tsdeng, if you are able to create the wheel, I'll be the first one to say thank you.

broken commented 1 year ago

@ethiel Your error comes from mismatched versions. On nightly, we switched the shims from using an OpName member variable to a method. You likely need to check out the TF Text branch for the version you want to build, run prepare_tf_dep.sh, or something similar.

@tsdeng so did prepare_tf_dep.sh run fine? If so, I don't see any reason we shouldn't be running it.

@tsdeng Ugh.. This does look closer to a blocker. I have no clue what protoc tensorflow-macos uses.

If you are tenacious, you can try reaching out to their GitHub account to ask what version they are using, and update our WORKSPACE file with that version. Or, more quickly: I know TF Text v2.9 was successfully built (https://github.com/sun1638650145/Libraries-and-Extensions-for-TensorFlow-for-Apple-Silicon/releases), and it looks like that used protobuf version 3.9.2 (https://github.com/tensorflow/tensorflow/blob/r2.9/tensorflow/workspace2.bzl#L457). You can try copying that into our WORKSPACE file to see if it works. Bazel should ignore what TF specifies if TF Text defines it first. Make sure you are in a new directory or have run bazel clean --expunge (iirc) before rebuilding with this change.

http_archive(
    name = "com_google_protobuf",
    patch_file = ["//third_party/protobuf:protobuf.patch"],
    sha256 = "cfcba2df10feec52a84208693937c17a4b5df7775e1635c1e3baffc487b24c9b",
    strip_prefix = "protobuf-3.9.2",
    system_build_file = "//third_party/systemlibs:protobuf.BUILD",
    system_link_files = {
        "//third_party/systemlibs:protobuf.bzl": "protobuf.bzl",
        "//third_party/systemlibs:protobuf_deps.bzl": "protobuf_deps.bzl",
    },
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.9.2.zip"],
)

I'd remove the patch_file, system_build_file, and system_link_files to see if it works. Those may be TF specific and not needed. Otherwise, you may need to copy those files into our own third_party directory from their r2.9 branch.

Hopefully this helps.

tsdeng commented 1 year ago

@broken I did run prepare_tf_dep.sh and it finishes successfully.

After adding the http_archive for com_google_protobuf, the error now becomes:

ERROR: /private/var/tmp/_bazel_tianshuo/5b5bef7c172bd140468d2c8b06d622ae/external/com_google_protobuf/BUILD:979:21: in blacklisted_protos attribute of proto_lang_toolchain rule @com_google_protobuf//:cc_toolchain: '@com_google_protobuf//:_internal_wkt_protos_genrule' does not have mandatory providers: 'ProtoInfo'. Since this rule was created by the macro 'proto_lang_toolchain', the error might have been caused by the macro implementation
ERROR: /private/var/tmp/_bazel_tianshuo/5b5bef7c172bd140468d2c8b06d622ae/external/com_google_protobuf/BUILD:979:21: Analysis of target '@com_google_protobuf//:cc_toolchain' failed
ERROR: Analysis of target '//oss_scripts/pip_package:build_pip_package' failed; build aborted:
INFO: Elapsed time: 2.021s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (173 packages loaded, 4429 targets configured)

tsdeng commented 1 year ago

I don't think I have a clear understanding of how the protobuf libraries are used and linked in this case.

It seems bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/framework/full_type.pb.h is downloaded from an http archive instead of being generated on my local machine using protoc. So full_type.pb.h must have been generated earlier by protoc version 3.9.2 and then packaged into the http archive.

After this http archive is downloaded to my machine, the build tries to link against some protobuf runtime library, finds only a newer version, and therefore hits the conflict?

What's weird is that I don't have libprotobuf installed on my machine, so I wonder what it is trying to link against when the error happens. @broken is my understanding correct?

In my conda environment only protobuf is installed, not libprotobuf:

(venv) ➜  text git:(4a098cd) ✗ conda list | grep protobuf
protobuf                  3.19.6                   pypi_0    pypi

DuongTSon commented 1 year ago

@tsdeng The protobuf issue can be fixed by using a patch to make it compatible with Bazel 5.3.0.

http_archive(
    name = "com_google_protobuf",
    strip_prefix = "protobuf-3.9.2",
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.9.2.zip"],
    patch_args = ["-p1"],
    patches = ["//third_party/protobuf:protobuf.patch"]
)

-cc_proto_library(
+
+cc_library(
     name = "cc_wkt_protos",

@@ -491,6 +489,13 @@ cc_proto_library(
     deps = [":cc_wkt_protos"],
 )

+adapt_proto_library(

+def _adapt_proto_library_impl(ctx):

However, after fixing the protobuf issue, there is another error that I cannot resolve yet.

tensorflow_text/core/ops/regex_split_ops.cc:31:10: error: use of undeclared identifier 'OkStatus'; did you mean 'Status'?
  return OkStatus();
         ^
bazel-out/darwin_arm64-opt/bin/external/local_config_tf/include/tensorflow/core/platform/status.h:43:7: note: 'Status' declared here
class Status {

chrisoesterreichprog commented 1 year ago

https://medium.com/@murphy.crosby/building-tensorflow-and-tensorflow-text-on-a-m1-mac-9b90d55e92df

That tutorial helped me install tensorflow-text on an M1 Mac.

  1. Download these three wheels into your project: https://drive.google.com/drive/folders/1eHfUjjb5kOaQ-SZom5ldHROyr5Rwa5mh

  2. Install:

    brew install python@3.9
    python3.9 -m venv .venv
    source .venv/bin/activate
    pip install tensorflow_io_gcs_filesystem-0.27.0-cp39-cp39-macosx_13_0_arm64.whl
    pip install tensorflow-2.10.1-cp39-cp39-macosx_13_0_arm64.whl
    pip install tensorflow_text-2.10.0-cp39-cp39-macosx_11_0_arm64.whl
    pip install tensorflow-metal==0.6.0

DuongTSon commented 1 year ago

Finally, I have successfully built both tensorflow-text 2.9 and 2.11. The tensorflow-macos version and the tensorflow-text version must match exactly. If you build text v2.11, you need to check out the 2.11 branch and install tensorflow-macos==2.11.

  1. Install the bazelisk

    brew install bazelisk
  2. Remove the auto-configuration in the oss_scripts/run_build.sh file. Currently, it updates the version to the latest version of TensorFlow, which causes the major incompatibility.

    # Remove or Disable those lines below
    # Set tensorflow version
    if [[ $osname != "Darwin" ]] || [[ ! $(sysctl -n machdep.cpu.brand_string) =~ "Apple" ]]; then
    source oss_scripts/prepare_tf_dep.sh
    fi
  3. Create a virtual env with the anaconda ARM version. For example, env name is tf2.11

    conda create --name tf2.11 python=3.10
    conda activate tf2.11
  4. Install tensorflow-macos and tensorflow-metal

    pip install tensorflow-macos==2.11.0
    pip install tensorflow-metal==0.7.1
  5. Run the build

    ./oss_scripts/run_build.sh
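Before starting a long build, the "versions must match" rule from the steps above can be checked mechanically. This is a minimal sketch under one assumption that the thread does not state outright: that "matching" means the same major.minor release (for example 2.11.x with 2.11.x). The function names are hypothetical.

```python
def major_minor(version: str) -> tuple:
    """Return (major, minor) from a version string such as '2.11.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def versions_match(tf_version: str, text_version: str) -> bool:
    """True when both packages come from the same major.minor release line."""
    return major_minor(tf_version) == major_minor(text_version)


if __name__ == "__main__":
    # Pairs mentioned in this thread, plus one deliberate mismatch.
    for tf_v, text_v in (("2.11.0", "2.11.0"), ("2.10.1", "2.10.0"), ("2.11.0", "2.10.0")):
        status = "ok" if versions_match(tf_v, text_v) else "MISMATCH"
        print(f"tensorflow-macos {tf_v} / tensorflow-text {text_v}: {status}")
```

The second pair mirrors the 2.10.1 and 2.10.0 wheels that were reported to install together earlier in the thread.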
ethiel commented 1 year ago

Thanks for the guide, @DuongTSon. Where did you find tensorflow-deps==2.11.0? I tried, but the most recent version in my channels is 2.9.0.

broken commented 1 year ago

This is great @DuongTSon!

With step 2, are you saying that the conditional is not failing, so it's executing prepare_tf_dep.sh improperly, or that you need it to execute that line, so we need to comment it out? I'm trying to figure out what we need to do to fix it so you no longer need to worry about it.

DuongTSon commented 1 year ago

@ethiel Sorry, my mistake. Actually we do not need to install tensorflow-deps to build the tensorflow-text on M1 MacOS. Updated my answer!

DuongTSon commented 1 year ago

@broken Just try fixing that line: the string compared against $osname should be lowercase.

# Set tensorflow version
if [[ $osname != "darwin" ]] || [[ ! $(sysctl -n machdep.cpu.brand_string) =~ "Apple" ]]; then
  source oss_scripts/prepare_tf_dep.sh
fi

I have tested this version. It works well on an M1 Mac now.

ethiel commented 1 year ago

@DuongTSon thanks, I didn't install it anyway because I couldn't find it. I'm able to build from source, but I can't use the result. This is the error when I try to use it in a simple project:

tensorflow.python.framework.errors_impl.NotFoundError: dlopen(/Users/ethiel/miniconda3/envs/python310-tensorflow/lib/python3.10/site-packages/tensorflow_text/python/ops/_regex_split_ops.dylib, 0x0006): malformed trie child, cycle to nodeOffset=0x2
weak-def symbol not found (__ZN10tensorflow11register_op19OpDefBuilderWrapper10SetShapeFnENSt3__18functionIFN3tsl6StatusEPNS_15shape_inference16InferenceContextEEEE)

I guess the issue is python 3.10.

Python 3.9 with tensorflow-macos 2.9.0 works fine. The issue seems to be in the mix of tensorflow-macos 2.11.0 and the Python version.

mridulrao commented 1 year ago

Hey! I was able to create a wheel for tensorflow-text==2.10.0 and install tensorflow-macos==2.10.0. But when I try to import them in a Jupyter notebook, it throws an error:

dlopen(/Users/kawaii/opt/miniconda3/envs/transformers_2/lib/python3.10/site-packages/tensorflow_text/python/ops/_regex_split_ops.dylib, 0x0006): malformed trie child, cycle to nodeOffset=0x2
weak-def symbol not found (__ZN10tensorflow11register_op19OpDefBuilderWrapper10SetShapeFnENSt3__18functionIFNS_6StatusEPNS_15shape_inference16InferenceContextEEEE)

Any idea how to solve it?

DuongTSon commented 1 year ago

@mridulrao it's a tensorflow-metal version issue. You can use tensorflow-metal==0.6.0; I have tested it with tensorflow-text 2.10, and it's working well.

ethiel commented 1 year ago

@DuongTSon Did you build 2.10 from source? I can't; the build fails with this error:

Compiling tensorflow_text/core/ops/constrained_sequence_op.cc failed: undeclared inclusion(s) in rule '//tensorflow_text:constrained_sequence_op_cc': this rule is missing dependency declarations for the following files included by 'tensorflow_text/core/ops/constrained_sequence_op.cc': 'external/com_google_absl/absl/status/status.h' 'external/com_google_absl/absl/status/internal/status_internal.h'

mridulrao commented 1 year ago

@ethiel I was getting the same issue but was able to solve it by using tensorflow==2.10.0 and bazel==5.1.1

mridulrao commented 1 year ago

@DuongTSon I am still getting the same error after downgrading the metal version. I removed tensorflow-metal from the environment but still got the error. The error is coming from the import tensorflow_text line.

DuongTSon commented 1 year ago

@mridulrao I have built 2.10 from source and tested the Transformer without any issue. I guess it's the Python version; miniconda might lack some functions.

My build environment is like the following

mridulrao commented 1 year ago

@DuongTSon Okay! I will try your system configuration. Moreover, if possible can you help me understand Transformers architecture and solve a few bugs? I have been following the transformer tutorial and used MuRIL encoder.

alanlomeli commented 1 year ago

Could I use a precompiled wheel for macOS 13, and if so, could someone provide me one? I followed @DuongTSon's instructions, but when I build, it gets stuck :(

DuongTSon commented 1 year ago

> Could I use a precompiled wheel for macOS 13, and if so, could someone provide me one? I followed @DuongTSon's instructions, but when I build, it gets stuck :(

I am not sure whether my builds on macOS Monterey will work on your computer, but you can try. There are 3 versions of tensorflow-text (2.9, 2.10, 2.11) at the link below. https://1drv.ms/u/s!AmRjIZct7QZDg9xM-rf_kE_svoEcJA?e=IvpE7R

DuongTSon commented 1 year ago

> @DuongTSon Okay! I will try your system configuration. Moreover, if possible can you help me understand Transformers architecture and solve a few bugs? I have been following the transformer tutorial and used MuRIL encoder.

I just tried the Transformer with tensorflow-text in this tutorial: https://www.tensorflow.org/text/tutorials/transformer. I haven't used the MuRIL encoder!

ethiel commented 1 year ago

I don't understand... Maybe the difference is the clang version or Xcode version. I'm able to build every version, but I could only use the version you uploaded, @DuongTSon: tensorflow_text-2.11.0-cp310-cp310-macosx_11_0_arm64.whl. I tried different Python versions, and your wheel works with Python 3.10.0. I tried to build again in an empty conda environment with Python 3.10.0. I was able to build the wheel, but I can't use it: _regex_split_ops.dylib, 0x0006): malformed trie child, cycle to nodeOffset=0x

So, just to be sure, what's your version of Python? I mean the version you used to build the code.

This is my environment:

clang 14.0.0
Xcode 14.2
tensorflow-macos 2.11.0
tensorflow-metal 0.7.0
python 3.10.0

Can you share your pip list output? I guess there is an issue with some Python library, clang, or something else.

DuongTSon commented 1 year ago

@ethiel I managed to reproduce your error with version 2.10 by using Xcode 14.2. I guess there is an issue with Xcode 14.2 or clang 14.0. The environment that has worked for me is:

ethiel commented 1 year ago

Perfect, @DuongTSon, thank you. I guess this issue can be closed by updating the README to highlight the versions and environment currently known to work.

vedmant commented 1 year ago

No instruction from this thread worked for me. Are there any clear instructions for how to build and run it on arm64?

For example, when I try to build, I get the following error:

# NOTE: Update Bazel version in tensorflow/tools/ci_build/release/common.sh.oss'
+ '[' 5.3.0 '!=' '5.3.0
# NOTE: Update Bazel version in tensorflow/tools/ci_build/release/common.sh.oss' ']'
+ echo 'Incorrect version of Bazel installed.'

Which says that 5.3.0 '!=' '5.3.0
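The trace suggests that the right-hand side of the comparison is not just `5.3.0` but `5.3.0` followed by a newline and a `# NOTE: ...` comment, so a plain string comparison fails even though the versions agree. Below is a minimal sketch of that effect and one possible normalization. Where the build script reads the expected version from is not shown in this thread, so the `raw` value is reconstructed from the trace and the helper name is hypothetical.

```python
def expected_bazel_version(raw: str) -> str:
    """Return the first non-empty, non-comment line of the version text."""
    for line in raw.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line
    return ""


# Reconstructed from the trace above: version, newline, then a trailing note.
raw = "5.3.0\n# NOTE: Update Bazel version in tensorflow/tools/ci_build/release/common.sh.oss"
installed = "5.3.0"

print(installed != raw)                          # True: the raw comparison reports a mismatch
print(installed == expected_bazel_version(raw))  # True: the versions agree once the note is dropped
```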

DuongTSon commented 1 year ago

> No instruction from this thread worked for me. Are there any clear instructions for how to build and run it on arm64?
>
> For example, when I try to build, I get the following error:
>
> # NOTE: Update Bazel version in tensorflow/tools/ci_build/release/common.sh.oss'
> + '[' 5.3.0 '!=' '5.3.0
> # NOTE: Update Bazel version in tensorflow/tools/ci_build/release/common.sh.oss' ']'
> + echo 'Incorrect version of Bazel installed.'
>
> Which says that 5.3.0 '!=' '5.3.0

I think you are using the master branch to build. I had this issue before; it was caused by an error in the script oss_scripts/run_build.sh. You can use the stable versions by checking out a stable branch such as 2.11, 2.10, etc.

ethiel commented 1 year ago

I'm able to build every branch (2.9, 2.10 and 2.11). My issue is related to the clang version, which is why I can't use them afterwards, but compilation and build work like a charm. @vedmant, are you sure you are following the instructions step by step?

broken commented 1 year ago

> @broken Just try to fix that line, actually the condition $osname should be in lowercase.
>
> # Set tensorflow version
> if [[ $osname != "darwin" ]] || [[ ! $(sysctl -n machdep.cpu.brand_string) =~ "Apple" ]]; then
>   source oss_scripts/prepare_tf_dep.sh
> fi
>
> I have tested this version. It work well on Mac M1 now.

@DuongTSon I had somebody with an M1 test this, and uname -s returns "Darwin" (with a capital D). By lower-casing it, the prepare_tf_dep.sh script is being run. However, the original steps you provided said to comment all of it out. This seems contradictory. Do you know if this should be run or not?
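The effect of the capitalization can be traced by mirroring the shell condition in a few lines. This is only a sketch under an assumption: that `$osname` holds the raw output of `uname -s` (the thread does not show how the script assigns it). The function name and the sample CPU brand strings are illustrative.

```python
def runs_prepare_tf_dep(osname: str, cpu_brand: str, literal: str) -> bool:
    """Mirror: [[ $osname != LITERAL ]] || [[ ! $cpu_brand =~ "Apple" ]]."""
    return osname != literal or "Apple" not in cpu_brand


# Apple silicon, comparing against "Darwin": both tests are false, script is skipped.
print(runs_prepare_tf_dep("Darwin", "Apple M1", "Darwin"))  # False

# Apple silicon, comparing against "darwin": the first test is true, script runs.
print(runs_prepare_tf_dep("Darwin", "Apple M1", "darwin"))  # True

# Intel Mac: the CPU brand lacks "Apple", so the script runs with either literal.
print(runs_prepare_tf_dep("Darwin", "Intel(R) Core(TM) i7", "Darwin"))  # True
```

Under that assumption, lower-casing the literal makes the first test always true on macOS, which is consistent with the observation that prepare_tf_dep.sh then runs.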

vedmant commented 1 year ago

@ethiel For me compilation doesn't work with version 2.10.0. I get this error:

++ sed -E -i '""' 's/strip_prefix = "tensorflow-2.+",/strip_prefix = "tensorflow-      <span class="sha-block m-0">commit <span class="sha user-select-contain">53ce211</span></span>",/' WORKSPACE
sed: 1: "s/strip_prefix = "tenso ...": bad flag in substitute command: 's'

Looks like the problem starts in oss_scripts/prepare_tf_dep.sh. This command returns unknown, and I think the script is supposed to get a git version from it:

python3 -c 'print(__import__("tensorflow").__git_version__)'

Then it tries to get commit_sha from GitHub:

commit_sha=$(curl -SsL https://github.com/tensorflow/tensorflow/commit/${short_commit_sha} | grep sha-block | grep commit | sed -e 's/.*\([a-f0-9]\{40\}\).*/\1/')

which returns HTML code instead:

<span class="sha-block m-0">commit <span class="sha user-select-contain">b5c7f1f</span></span>

So the question is why python3 -c 'print(__import__("tensorflow").__git_version__)' returns unknown?
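Independently of why `__git_version__` is unknown, the HTML leaking into `commit_sha` can be explained by how `sed` substitution behaves: a line that does not match the pattern is printed unchanged. The sketch below emulates the quoted `sed` expression with Python's `re` module and contrasts it with a stricter extraction that returns nothing when no 40-character hash is present. The function names are hypothetical, and the 40-character hash is a made-up sample.

```python
import re
from typing import Optional

# Same shape as the sed expression: s/.*\([a-f0-9]\{40\}\).*/\1/
SED_LIKE = re.compile(r".*([a-f0-9]{40}).*")


def sed_like_extract(line: str) -> str:
    """Substitute like sed does: lines without a match pass through untouched."""
    return SED_LIKE.sub(r"\1", line, count=1)


def strict_extract(line: str) -> Optional[str]:
    """Return the 40-character hash, or None when the line has none."""
    match = re.search(r"[a-f0-9]{40}", line)
    return match.group(0) if match else None


html = '<span class="sha-block m-0">commit <span class="sha user-select-contain">b5c7f1f</span></span>'
sample_sha = "0123456789abcdef0123456789abcdef01234567"  # made-up sample hash

print(sed_like_extract(html) == html)                       # True: HTML passes through
print(strict_extract(html))                                 # None
print(sed_like_extract(f'<a href="/commit/{sample_sha}">')) # the 40-character hash
```

A stricter check like `strict_extract` would let a script fail early instead of writing HTML into the WORKSPACE file.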

DuongTSon commented 1 year ago

@broken Sorry for the confusion. In the tutorial, I meant that we need to disable the prepare_tf_dep.sh script when compiling on an M1 Mac, because it updates the tensorflow version in the WORKSPACE file to the latest one, which causes the incompatibility.

About the lowercase darwin: I thought you had also tried to add a rule to avoid updating the tensorflow version with the $osname condition check, but somehow the condition did not capture the correct osname on macOS Monterey. So I tried to fix that with the lowercase darwin.

In summary, to make the build work on an M1 Mac, we need to disable prepare_tf_dep.sh one way or another. Either of the ways I have listed above works for me.

vedmant commented 1 year ago

@DuongTSon OK, I was able to build it, but now there is an error when I try to import:

 from tensorflow_text.core.pybinds import tflite_registrar
ImportError: dlopen(/Users/vedmant/miniconda3/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so, 0x0002): tried: '/Users/vedmant/miniconda3/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))

However, the build output clearly mentions arm64:

copying tensorflow_text/core/pybinds/tflite_registrar.so -> build/lib.macosx-11.1-arm64-cpython-310/tensorflow_text/core/pybinds
copying build/lib.macosx-11.1-arm64-cpython-310/tensorflow_text/core/pybinds/tflite_registrar.so -> build/bdist.macosx-11.1-arm64/wheel/tensorflow_text/core/pybinds

broken commented 1 year ago

@DuongTSon Got it; thanks.

@vedmant The package you are trying to import was built for the wrong architecture. Some ideas:

vedmant commented 1 year ago

@broken Yes to all questions: installed by pip, same computer, same environment, and same date. Even if I try to install just tensorflow_text==2.10.0, I get the error ERROR: Could not find a version that satisfies the requirement tensorflow_text==2.10.0 (from versions: none), so I could not install the x64 version in any case.

broken commented 1 year ago

That's very strange. If you are positive that it's the lib you built, then you must be building it for the wrong architecture somehow. I think your best bet is to search for how to ensure you are building for the right architecture. For example, this page suggests you may have multiple architecture implementations of LLVM installed; check your clang target (clang -v), and possibly set ARCHPREFERENCE (if it is used).

But first, to double verify it is the build, I would unzip your package (just rename with a .zip extension and unzip), and then use the file command (file tflite_registrar.so) to verify they were built for the right architecture. Then you will know for certain whether your created package was built incorrectly or there was an installation issue.
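The same check can be scripted without unzipping by hand, since a wheel is a zip archive and the architecture sits in the first bytes of each Mach-O file. The following is a rough sketch, not a replacement for `file`: it only recognizes thin 64-bit little-endian binaries and universal (fat) binaries, it reports arm64e as arm64 because it ignores the CPU subtype, and the function names are made up. The constants are the usual values from the Mach-O headers.

```python
import struct
import zipfile

MH_MAGIC_64 = 0xFEEDFACF   # thin 64-bit Mach-O
FAT_MAGIC = 0xCAFEBABE     # universal binary (header is big-endian)
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def macho_archs(data: bytes) -> list:
    """Return the architecture names found in the leading bytes of a Mach-O file."""
    if len(data) < 8:
        return []
    magic_le, cputype_le = struct.unpack("<II", data[:8])
    if magic_le == MH_MAGIC_64:
        return [CPU_TYPES.get(cputype_le, hex(cputype_le))]
    magic_be, nfat = struct.unpack(">II", data[:8])
    if magic_be == FAT_MAGIC:
        archs = []
        for i in range(nfat):
            # Each fat_arch entry is 20 bytes; its first field is the CPU type.
            entry = data[8 + 20 * i : 12 + 20 * i]
            if len(entry) < 4:
                break
            (cputype,) = struct.unpack(">I", entry)
            archs.append(CPU_TYPES.get(cputype, hex(cputype)))
        return archs
    return []


def wheel_archs(wheel_path: str) -> dict:
    """Map every .so/.dylib inside a wheel to the architectures it contains."""
    result = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if name.endswith((".so", ".dylib")):
                with wheel.open(name) as member:
                    result[name] = macho_archs(member.read(4096))
    return result
```

Calling `wheel_archs` on the freshly built wheel and on the installed copy of `tflite_registrar.so` would show whether the x86_64 binary comes from the build or from the installation.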

tsdeng commented 1 year ago

For whoever is hitting this problem, here's the solution for everybody using macOS 13 who wants to build tensorflow-text. This is the kind of thing that should not be this hard. Thanks to @DuongTSon and https://medium.com/@murphy.crosby/building-tensorflow-and-tensorflow-text-on-a-m1-mac-9b90d55e92df

Please make sure you:

  1. Use conda, because you will need tensorflow-metal, tensorflow-deps and tensorflow-macos.
  2. Download Xcode 13.1 from here. Even though you won't be able to install Xcode 13.1, you still need its older version of ld to work around the issue mentioned here.

Following are the steps.

Create a conda environment:

conda create -p ./venv python=3.10
conda activate ./venv

Install tensorflow macOS dependencies:

conda install -c apple tensorflow-deps=2.10.0
python -m pip install tensorflow-macos==2.10.0
python -m pip install 'tensorflow-metal==0.6.0'

Clone the tensorflow-text repo and check out the 2.10 branch:

git clone https://github.com/tensorflow/text.git
cd text
git checkout 2.10

Comment out following lines in oss_scripts/run_build.sh:

# if [[ $osname != "darwin" ]] || [[ ! $(sysctl -n machdep.cpu.brand_string) =~ "Apple" ]]; then
#  source oss_scripts/prepare_tf_dep.sh
#fi

Back up your ld and replace it with the older version of ld from Xcode 13.1. Here I assume you downloaded Xcode 13.1 to the ~/Downloads folder.

sudo mv /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld ./ld.backup
sudo cp ~/Downloads/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/

Run ld -v to make sure your ld version is ld64-711
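If you would rather script this check than read the output by eye, something like the following could work. It is a sketch under assumptions: it simply searches the combined output of `ld -v` for an `ld64-<number>` token, the helper names are hypothetical, and the sample strings in the comments are illustrative rather than copied from a real run.

```python
import re
import subprocess
from typing import Optional


def ld64_version(ld_output: str) -> Optional[str]:
    """Extract the number after 'ld64-' from linker version output, if present."""
    match = re.search(r"ld64-(\d+(?:\.\d+)*)", ld_output)
    return match.group(1) if match else None


def current_ld_output() -> str:
    """Run `ld -v` and return stdout plus stderr, or '' if ld cannot be run."""
    try:
        proc = subprocess.run(["ld", "-v"], capture_output=True, text=True)
    except OSError:
        return ""
    return proc.stdout + proc.stderr


if __name__ == "__main__":
    version = ld64_version(current_ld_output())
    # The steps above expect 711 after swapping in the older linker.
    print("ld64 version:", version or "not detected")
```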

Now run ./oss_scripts/run_build.sh and the wheel will be produced.

I really really hope the tensorflow team can improve the dev experience on Apple Silicon given the huge amount of devs using Mac and all new Macs are on Apple Silicon.

vedmant commented 1 year ago

@tsdeng Thanks, I'll try this. Can you upload a built whl file? Maybe I can just install it.

Actually, I was able to build this way as well, using all the steps in your example. However, I get the same error when I try to run it:

ImportError: dlopen(/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so, 0x0002): tried: '/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow_text/core/pybinds/tflite_registrar.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))

tsdeng commented 1 year ago

@vedmant I will upload a wheel tonight. Your environment seems to have some issues. In no way should you get x86_64. Are you unintentionally running Rosetta? Did you install the Apple silicon version of conda?

vedmant commented 1 year ago

@tsdeng Not running rosetta, installed environment conda create -p ./venv python=3.10 using miniconda3. If I run python -c "import platform; print(platform.machine());" it returns arm64 in this environment.
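A slightly broader check than `platform.machine()` is to ask macOS whether the current process is being translated. The sketch below relies on the `sysctl.proc_translated` key, which reports 1 for a process running under Rosetta and 0 for a native one on Apple silicon; on other systems the key is missing, so the helper (a hypothetical name) returns None instead of guessing.

```python
import platform
import subprocess
from typing import Optional


def running_under_rosetta() -> Optional[bool]:
    """True or False on Apple silicon macOS, None when it cannot be determined."""
    if platform.system() != "Darwin":
        return None
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None
    return out == "1"


if __name__ == "__main__":
    print("machine:", platform.machine())
    print("translated by Rosetta:", running_under_rosetta())
```

Running it with the same interpreter that runs the build (for example the Python inside ./venv) matters, because a different Python on the PATH may have a different architecture.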

tsdeng commented 1 year ago

@vedmant I am using miniforge3, which has packages that better support Apple silicon. I suggest giving miniforge a try.

Here is the wheel I built.

vedmant commented 1 year ago

@tsdeng Thanks, this worked, but I have a different error now:

```
Metal device set to: Apple M1 Pro
systemMemory: 16.00 GB
maxCacheSize: 5.33 GB
2023-03-05 20:20:07.106507: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-03-05 20:20:07.106915: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
2023-03-05 20:20:20.261844: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
2023-03-05 20:20:22.996269: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
2023-03-05 20:20:23.221834: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
Traceback (most recent call last):
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1613, in _call_impl
    return self._call_with_structured_signature(args, kwargs,
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1691, in _call_with_structured_signature
    self._structured_signature_check_missing_args(args, kwargs)
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1710, in _structured_signature_check_missing_args
    raise TypeError(f"{self._structured_signature_summary()} missing "
TypeError: signature_wrapper(*, text) missing required arguments: text.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/vedmant/Projects/_Project/WF-01-project/project-backend/ML/predicting.py", line 77, in
    output = predicting(title=args.title, model_path=args.model_path, label_dict=args.label_dict)
  File "/Users/vedmant/Projects/_Project/WF-01-project/project-backend/ML/predicting.py", line 61, in predicting
    pred = predicts(str(title), model, decode=dec)
  File "/Users/vedmant/Projects/_Project/WF-01-project/project-backend/ML/util.py", line 61, in predicts
    return decode[np.argmax(infer(tf.constant(x))['classifier'].numpy()[0])]
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1604, in __call__
    return self._call_impl(args, kwargs)
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1617, in _call_impl
    return self._call_with_flat_signature(args, kwargs,
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1671, in _call_with_flat_signature
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/saved_model/load.py", line 138, in _call_flat
    return super(_WrapperFunction, self)._call_flat(args, captured_inputs,
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 1862, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/function.py", line 499, in call
    outputs = execute.execute(
  File "/Users/vedmant/Downloads/text-2.10.0/venv/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.NotFoundError: Graph execution error:

No registered 'AddN' OpKernel for 'GPU' devices compatible with node {{node StatefulPartitionedCall/model/preprocessing/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/bert_pack_inputs/PartitionedCall/RaggedConcat/ArithmeticOptimizer/AddOpsRewrite_Leaf_0_add_2}}
(OpKernel was found, but attributes didn't match) Requested Attributes: N=2, T=DT_INT64, _XlaHasReferenceVars=false, _grappler_ArithmeticOptimizer_AddOpsRewriteStage=true, _device="/job:localhost/replica:0/task:0/device:GPU:0" .
Registered:
  device='XLA_CPU_JIT'; T in [DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, 16534343205130372495, DT_COMPLEX128, DT_HALF, DT_UINT32, DT_UINT64, DT_VARIANT]
  device='GPU'; T in [DT_FLOAT]
  device='DEFAULT'; T in [DT_INT32]
  device='CPU'; T in [DT_UINT64]
  device='CPU'; T in [DT_INT64]
  device='CPU'; T in [DT_UINT32]
  device='CPU'; T in [DT_UINT16]
  device='CPU'; T in [DT_INT16]
  device='CPU'; T in [DT_UINT8]
  device='CPU'; T in [DT_INT8]
  device='CPU'; T in [DT_INT32]
  device='CPU'; T in [DT_HALF]
  device='CPU'; T in [DT_BFLOAT16]
  device='CPU'; T in [DT_FLOAT]
  device='CPU'; T in [DT_DOUBLE]
  device='CPU'; T in [DT_COMPLEX64]
  device='CPU'; T in [DT_COMPLEX128]
  device='CPU'; T in [DT_VARIANT]
[[StatefulPartitionedCall/model/preprocessing/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/bert_pack_inputs/PartitionedCall/RaggedConcat/ArithmeticOptimizer/AddOpsRewrite_Leaf_0_add_2]] [Op:__inference_signature_wrapper_34568]
2023-03-05 20:20:23.893251: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
```

It works however if I run on CPU only.
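For anyone who needs the CPU-only route as a stopgap, one way to get it is to hide the GPU from TensorFlow before any op runs. This is a minimal sketch, assuming TensorFlow is importable in the environment and that no devices have been initialized yet. The wrapper name `hide_gpus` is hypothetical, while `tf.config.set_visible_devices` is the standard TensorFlow call.

```python
def hide_gpus() -> bool:
    """Hide all GPUs from TensorFlow; return False if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return False
    # Must be called before TensorFlow initializes its devices,
    # otherwise TensorFlow raises a RuntimeError.
    tf.config.set_visible_devices([], "GPU")
    return True


if __name__ == "__main__":
    if hide_gpus():
        print("GPU hidden; ops will be placed on the CPU.")
    else:
        print("TensorFlow is not installed in this environment.")
```

Call it at the very top of the script, before loading the saved model, so that the int64 AddN op from the preprocessing graph is placed on the CPU.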

tsdeng commented 1 year ago

This is a known issue: https://developer.apple.com/forums/thread/711402