Closed xwli-chelsea closed 3 years ago
Adding more details:
I also tested this example here: https://github.com/tensorflow/text/blob/master/examples/keras_example_174.ipynb
Similar to before, I'm getting a segmentation fault from the source build but no error when installing the published wheel. I'm using a Red Hat 7.6 (Maipo) machine, with bazel==3.5.0.
Are you building with the same system that you are running the package on? Building on one system and running on another could be the issue.
Segfaults like this are commonly caused by ABI mismatches, which could result from the above. I believe TF provides a docker image that you can try building on if you want a more universal binary. Though from issues I've seen others have, I don't know how up to date they keep this image, and it may have changed since they built the 2.2 branch. It could be worth trying though.
Another thing to try is using the manylinux2020 toolchain when building: `bazel build --config=manylinux2020 oss_scripts/pip_package:build_pip_package`
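For reference, the full build-and-install sequence might look like the sketch below. This is only an outline: the `manylinux2020` config comes from the suggestion above, and the output path `/tmp/tensorflow_text_pkg` is a hypothetical example (the exact scripts and targets may differ across branches). `DRY_RUN=1` just prints each command instead of running it.

```shell
#!/bin/sh
# Sketch of building the tensorflow-text pip package from source and
# installing the resulting wheel. Set DRY_RUN=0 to actually run the commands.
DRY_RUN=1

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"        # dry run: print the command instead of executing it
  else
    "$@"
  fi
}

# Build the pip-package target with the manylinux2020 toolchain config.
run bazel build --config=manylinux2020 oss_scripts/pip_package:build_pip_package
# Produce the wheel into a scratch directory (hypothetical path).
run bazel-bin/oss_scripts/pip_package/build_pip_package /tmp/tensorflow_text_pkg
# Install the wheel that was just built.
run pip install /tmp/tensorflow_text_pkg/tensorflow_text-*.whl
```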
P.S. Apologies for the delayed reply; the holidays and the ongoing 2.4 release have diverted a lot of our attention.
Thanks @broken for the detailed response. Yes, after building tensorflow-text in the same environment as our TF build, the segfault went away. So it did come from differences between the Docker images we used. I'll close the issue. Happy holidays!
Hi,
We've built tensorflow-text (2.2.0) from source using the scripts in `./oss_scripts`. Our goal is to use the BertTokenizer as a Keras layer. There were issues related to saving such models, as discussed here: https://github.com/tensorflow/text/issues/224. We followed one of the workarounds provided in that thread (with code in the notebook: https://github.com/tensorflow/text/issues/224#issuecomment-644631076), and it works fine if we install tensorflow-text through `pip install tensorflow-text==2.2.0`.
However, when building from source, I'm getting a segmentation fault with the exact same code at `tokens = self.bert_tokenizer.tokenize(text)`. I also tested using the tokenizer directly, without a custom layer; it works fine and I can get output tokens.
I'm not sure what I did wrong here. Do I also need to build tensorflow from source instead of using the pip-installed version? Could you please help with this issue? Any insight is appreciated!