dial tcp: lookup t2v-transformers on 127.0.0.11:53: server misbehaving

Valkea commented 7 months ago

When the pre-processing pipeline is run through the Prefect agent hosted on the AWS EC2 instance, Weaviate fails to resolve the t2v-transformers hostname: the lookup against Docker's embedded DNS server (127.0.0.11, port 53) returns "server misbehaving".

agent-1 | {'error': [{'message': 'update vector: send POST request: Post "http://t2v-transformers:8080/vectors": dial tcp: lookup t2v-transformers on 127.0.0.11:53: server misbehaving'}]}

However, the problem doesn't occur when the same docker-compose setup is run locally.

Here are two samples of the same step, first on the remote instance and then on the local one.

REMOTE LOG

agent-1             | 14:27:03.717 | INFO    | Task run 'Populate VectorDatabase-0' - There are 0 entries from PowerBI_638184922783297887
agent-1             | 14:27:03.719 | INFO    | Task run 'Populate VectorDatabase-0' - Adding the 1 entries
agent-1             | 14:27:03.735 | INFO    | Task run 'Populate VectorDatabase-0' - 2153/2155 | Dealing with PowerBI_638184627267335769
agent-1             | 14:27:03.741 | INFO    | Task run 'Populate VectorDatabase-0' - There are 0 entries from PowerBI_638184627267335769
agent-1             | 14:27:03.743 | INFO    | Task run 'Populate VectorDatabase-0' - Adding the 0 entries
agent-1             | 14:27:03.745 | INFO    | Task run 'Populate VectorDatabase-0' - 2154/2155 | Dealing with PowerBI_638185018001242261
agent-1             | 14:27:03.750 | INFO    | Task run 'Populate VectorDatabase-0' - There are 0 entries from PowerBI_638185018001242261
agent-1             | 14:27:03.752 | INFO    | Task run 'Populate VectorDatabase-0' - Adding the 16 entries
agent-1             | {'error': [{'message': 'update vector: send POST request: Post "http://t2v-transformers:8080/vectors": dial tcp: lookup t2v-transformers on 127.0.0.11:53: server misbehaving'}]}
agent-1             | {'error': [{'message': 'update vector: send POST request: Post "http://t2v-transformers:8080/vectors": dial tcp: lookup t2v-transformers on 127.0.0.11:53: server misbehaving'}]}
agent-1             | {'error': [{'message': 'update vector: send POST request: Post "http://t2v-transformers:8080/vectors": dial tcp: lookup t2v-transformers on 127.0.0.11:53: server misbehaving'}]}
...
agent-1             | {'error': [{'message': 'update vector: send POST request: Post "http://t2v-transformers:8080/vectors": dial tcp: lookup t2v-transformers on 127.0.0.11:53: server misbehaving'}]}
agent-1             | {'error': [{'message': 'update vector: send POST request: Post "http://t2v-transformers:8080/vectors": dial tcp: lookup t2v-transformers on 127.0.0.11:53: server misbehaving'}]}
agent-1             | {'error': [{'message': 'update vector: send POST request: Post "http://t2v-transformers:8080/vectors": dial tcp: lookup t2v-transformers on 127.0.0.11:53: server misbehaving'}]}
agent-1             | 14:27:03.959 | INFO    | Task run 'Populate VectorDatabase-0' - Num elements in [OmdenaUngdcDocs]: {'Aggregate': {'OmdenaUngdcDocs': [{'meta': {'count': 2203}}]}}
agent-1             | 14:27:04.167 | INFO    | Task run 'Populate VectorDatabase-0' - Finished in state Completed()

LOCAL LOG

agent-1             | 15:29:37.537 | INFO    | Task run 'Populate VectorDatabase-0' - The last version of PowerBI_638184922783297887 has already been embedded and indexed
agent-1             | 15:29:37.538 | INFO    | Task run 'Populate VectorDatabase-0' - 2153/2155 | Dealing with PowerBI_638184627267335769
agent-1             | 15:29:37.541 | INFO    | Task run 'Populate VectorDatabase-0' - There are 0 entries from PowerBI_638184627267335769
agent-1             | 15:29:37.542 | INFO    | Task run 'Populate VectorDatabase-0' - Adding the 0 entries
agent-1             | 15:29:37.543 | INFO    | Task run 'Populate VectorDatabase-0' - 2154/2155 | Dealing with PowerBI_638185018001242261
agent-1             | 15:29:37.546 | INFO    | Task run 'Populate VectorDatabase-0' - There are 16 entries from PowerBI_638185018001242261
agent-1             | 15:29:37.547 | INFO    | Task run 'Populate VectorDatabase-0' - The last version of PowerBI_638185018001242261 has already been embedded and indexed
agent-1             | 15:29:37.551 | INFO    | Task run 'Populate VectorDatabase-0' - Num elements in [OmdenaUngdcDocs]: {'Aggregate': {'OmdenaUngdcDocs': [{'meta': {'count': 2812}}]}}
agent-1             | 15:29:37.863 | INFO    | Task run 'Populate VectorDatabase-0' - Finished in state Completed()

Valkea commented 7 months ago

Running the inference demo script on the remote server returns the following:

ubuntu@ip-172-x-x-x:~$ python3 inference_demo.py query="MY QUERY TEXT"
Traceback (most recent call last):
  File "/home/ubuntu/inference_demo.py", line 77, in <module>
    client = initialize_vectordb(collection_name)
  File "/home/ubuntu/inference_demo.py", line 19, in initialize_vectordb
    client = weaviate.Client(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/weaviate/client.py", line 150, in __init__
    self._connection = Connection(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/weaviate/connect/connection.py", line 171, in __init__
    self._server_version = self.get_meta()["version"]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/weaviate/connect/connection.py", line 677, in get_meta
    res = _decode_json_response_dict(response, "Meta endpoint")
  File "/home/ubuntu/.local/lib/python3.10/site-packages/weaviate/util.py", line 798, in _decode_json_response_dict
    raise UnexpectedStatusCodeException(location, response)
weaviate.exceptions.UnexpectedStatusCodeException: Meta endpoint! Unexpected status code: 500, with response body: {'error': [{'message': 'send GET meta request: Get "http://t2v-transformers:8080/meta": dial tcp: lookup t2v-transformers on 127.0.0.11:53: server misbehaving'}]}.
sys:1: ResourceWarning: unclosed <socket.socket fd=3, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 35972), raddr=('127.0.0.1', 8080)>

Valkea commented 7 months ago

I reran the whole pipeline and it worked without any error... I also reran the inference demo script and it worked without any error too.

The only change is that I opened a shell in the Weaviate container with docker exec -ti ID /bin/sh, installed cURL, and checked that I could reach the t2v-transformers container with curl t2v-transformers:8080/meta (which I could).
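For reference, the same check can be sketched in Python, which separates the DNS lookup from the HTTP request so it is clearer which of the two fails. This is only a minimal sketch: resolve and http_status are hypothetical helper names, and it assumes the script runs in a container that has Python and is attached to the same Compose network (for example the agent container, since the Weaviate image needed cURL installed by hand).

```python
import socket
import urllib.error
import urllib.request


def resolve(hostname):
    """Return the sorted IP addresses the hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # This is the Python-side equivalent of the failed "lookup ..." step.
        return []
    return sorted({info[4][0] for info in infos})


def http_status(url, timeout=5.0):
    """Return the HTTP status code of a GET on url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 500).
        return exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, ...
        return None


if __name__ == "__main__":
    # Hostname and port are taken from the error message in this issue.
    host = "t2v-transformers"
    print("resolved:", resolve(host))
    print("status:", http_status(f"http://{host}:8080/meta"))
```

An empty list from resolve points at the Docker DNS lookup itself, while a resolved address with a None status would point at the t2v-transformers service not listening yet.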

I need to terraform destroy and then terraform apply to see if this occurs again.

Valkea commented 7 months ago

Rerunning the whole process didn't trigger the same problem. It looks like a one-time problem that might be due to the limited capacity of the instance (currently a t2.small).
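If the failure really is transient, one possible mitigation would be to poll Weaviate's meta endpoint before starting the pipeline or the demo script, since that endpoint returned a 500 while t2v-transformers could not be resolved. A minimal sketch, not something the repository currently does: wait_until and weaviate_meta_ok are hypothetical helper names, and the http://localhost:8080 base URL is an assumption based on the socket address shown in the traceback above.

```python
import time
import urllib.error
import urllib.request


def wait_until(check, attempts=10, delay=3.0, sleep=time.sleep):
    """Call check() until it returns True, pausing `delay` seconds between tries.

    Returns the number of attempts used, or raises TimeoutError if every
    attempt fails.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            sleep(delay)
    raise TimeoutError(f"not ready after {attempts} attempts")


def weaviate_meta_ok(base_url="http://localhost:8080"):
    """Return True if GET {base_url}/v1/meta answers with status 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/meta", timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses (such as the 500 seen above) as well as
        # refused connections and timeouts.
        return False


# Example usage before creating the Weaviate client:
# wait_until(weaviate_meta_ok, attempts=20, delay=5.0)
```

This would only hide a slow start-up; if the lookup keeps failing after the retries, the instance size or the Compose network setup would still need a closer look.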

Valkea / Omdena_UN_GDC