coppock opened this issue 1 year ago
@arjunsuresh Any idea?
I'm not using the Docker container. Running the reference implementation with the BERT TF model on Ubuntu 22.04, Python 3.10, and TensorFlow 2.10, I'm getting the error below.
Use tf.gfile.GFile.
Traceback (most recent call last):
File "/home/ubuntu/CM/repos/local/cache/e3b48cdce19e4f4a/inference/language/bert/run.py", line 120, in <module>
main()
File "/home/ubuntu/CM/repos/local/cache/e3b48cdce19e4f4a/inference/language/bert/run.py", line 68, in main
sut = get_tf_sut(args)
File "/home/ubuntu/CM/repos/local/cache/e3b48cdce19e4f4a/inference/language/bert/tf_SUT.py", line 79, in get_tf_sut
return BERT_TF_SUT(args)
File "/home/ubuntu/CM/repos/local/cache/e3b48cdce19e4f4a/inference/language/bert/tf_SUT.py", line 43, in __init__
graph_def.ParseFromString(f.read())
google.protobuf.message.DecodeError: Error parsing message with type 'tensorflow.GraphDef'
Finished destroying SUT.
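For context on that DecodeError: ParseFromString raises it whenever the protobuf runtime cannot interpret the bytes it is handed. A minimal sketch of the failure mode, using protobuf's well-known Any type instead of tensorflow.GraphDef so it runs without TensorFlow (the GraphDef case in the traceback above fails the same way):

```python
from google.protobuf.any_pb2 import Any
from google.protobuf.message import DecodeError

# Round-trip: bytes serialized by a compatible runtime parse cleanly.
msg = Any(type_url="type.googleapis.com/demo", value=b"payload")
data = msg.SerializeToString()

restored = Any()
restored.ParseFromString(data)
assert restored.value == b"payload"

# Bytes the runtime cannot interpret raise the same DecodeError class
# seen in the traceback ("Error parsing message ...").
try:
    Any().ParseFromString(b"\xff\xff\xff\xff")  # truncated/invalid wire data
    raised = False
except DecodeError:
    raised = True
assert raised
```

The model file on disk plays the role of `data` here; if the installed protobuf runtime can't decode it, loading fails before TensorFlow ever sees the graph.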
Ubuntu 20.04 with Python 3.8, TensorFlow 2.10, and google.protobuf 3.19 works fine for running BERT. Like @arjunsuresh, I'm running without Docker. I'll draft a Dockerfile.
Thank you @coppock. protobuf is the culprit. Using protobuf version 3.19,
I'm able to run on Ubuntu 22.04, Python 3.10, and TensorFlow 2.10. Maybe we should recreate this file using a newer protobuf.
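Until the file is regenerated, an environment can fail fast instead of dying mid-load. A hypothetical helper (not part of the reference implementation) that checks whether the installed protobuf matches the 3.19 pin this thread found to work:

```python
def protobuf_ok(version: str, max_minor: int = 19) -> bool:
    """Return True for a 3.x protobuf at or below 3.<max_minor> (3.19 by default)."""
    parts = version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    return major == 3 and minor <= max_minor


# Report the runtime actually installed, if any (protobuf may be absent).
try:
    import google.protobuf

    print(google.protobuf.__version__, protobuf_ok(google.protobuf.__version__))
except ImportError:
    pass
```

Running something like `pip install "protobuf==3.19.*"` before launching the reference app is the workaround the comments above converge on.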
@pgmpablo157321 to look at this issue
On r2.1, the Docker container run fails as shown:
Looking into this, I suspected an out-of-memory condition on my GPU, but I'm using an NVIDIA A30 with 24GB of memory, which I would think is plenty. In case it's helpful, I'm running on Ubuntu 20.04 with NVIDIA driver version 520.61.05 and CUDA version 11.8.