Open Matthieu-Tinycoaching opened 1 year ago
HI @Matthieu-Tinycoaching –
This seems to be the same issue as semi-technologies/weaviate#8
Can you reach a single module directly? From what I can see it looks like your GPU isn't loading the model properly.
What does this result in (make sure to set the correct port)?
$ curl -XPOST -H 'Content-Type: application/json' http://localhost:8084/vectors/ -d'{"text": "Can I have a vector?"}'
PS: If you join our Slack channel, more people might be able to help
Hi @bobvanluijt,
It seems to give the same error, but not at the same step: for semi-technologies/weaviate#8 it happens during import, while for semi-technologies/weaviate#9 it happens during inference.
If I try the curl command you gave me with the correct port:
curl -XPOST -H 'Content-Type: application/json' http://localhost:8080/vectors/ -d'{"text": "Can I have a vector?"}'
It gives me the following message:
{"code":404,"message":"path /vectors/ was not found"}
What is strange is that with weaviate console on localhost it works...
I will send this question to the Slack channel too.
So @Matthieu-Tinycoaching – can you try one more thing?
$ docker ps
$ docker exec -it ID_OF_CONTAINER /bin/bash
$ curl -XPOST -H 'Content-Type: application/json' http://t2v-transformers-01-001/vectors/ -d'{"text": "Can I have a vector?"}'
(you might need to install curl). What this should do is get some info from the containers running the ML models. I'm still thinking something is going wrong there.
Hi @bobvanluijt, I followed your test procedure and got the following error message:
curl: (7) Failed to connect to t2v-transformers port 80: Connection refused
It seems that on Ubuntu 18.04 LTS something is listening on port 80, blocking communication on this port (output of netstat -tlpn):
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:39445 0.0.0.0:* LISTEN 17320/code
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:5433 0.0.0.0:* LISTEN -
tcp6 0 0 :::80 :::* LISTEN -
tcp6 0 0 ::1:631 :::* LISTEN -
Is there a way to change this port to another number?
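(Side note: the refused connection above happens because curl defaults to port 80 when no port is given, while the transformers container listens on 8080 inside the Docker network; the host's port-80 listener isn't actually involved. If you do want to expose the module on a different host port, a docker-compose override along these lines should work — the service name below is assumed from the tutorial's docker-compose.yml:)

```yaml
services:
  t2v-transformers:
    # The container keeps serving on 8080; only the host-side binding changes.
    ports:
      - "9090:8080"   # host port 9090 -> container port 8080
```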
Hi @bobvanluijt,
I tried with the appropriate port 8080: curl -XPOST -H 'Content-Type: application/json' http://t2v-transformers-01-001:8080/vectors/ -d'{"text": "Can I have a vector?"}'
and got the following error: {"error":"CUDA error: no kernel image is available for execution on the device\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1."}
Running without discrete mode gives the following warning:
t2v-transformers_1 | /usr/local/lib/python3.9/site-packages/torch/cuda/__init__.py:146: UserWarning:
t2v-transformers_1 | NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
t2v-transformers_1 | The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
t2v-transformers_1 | If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
t2v-transformers_1 |
t2v-transformers_1 | warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
t2v-transformers_1 | INFO: Application startup complete.
t2v-transformers_1 | INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
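The warning above is the root cause of the CUDA error that follows: the PyTorch wheel baked into the image ships compiled kernels only for sm_37 through sm_70, while the RTX 3090 is compute capability sm_86. A minimal sketch of the mismatch (not PyTorch's actual dispatch logic, and ignoring PTX forward-compatibility):

```python
# Arch list copied from the warning above; note that sm_86 is absent.
SUPPORTED_ARCHS = {"sm_37", "sm_50", "sm_60", "sm_70"}

def has_kernel_image(device_arch: str, supported=SUPPORTED_ARCHS) -> bool:
    """A CUDA kernel can only run if the wheel ships a binary for the GPU's arch."""
    return device_arch in supported

print(has_kernel_image("sm_70"))  # e.g. a V100: True
print(has_kernel_image("sm_86"))  # RTX 3090: False -> "no kernel image" error
```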
When running the following request: curl -XPOST -H 'Content-Type: application/json' http://localhost:9000/vectors/ -d'{"text": "Can I have a vector?"}'
This gave the following error:
t2v-transformers_1 | ERROR: Something went wrong while vectorizing data.
t2v-transformers_1 | Traceback (most recent call last):
t2v-transformers_1 | File "/app/./app.py", line 51, in read_item
t2v-transformers_1 | vector = await vec.vectorize(item.text, item.config)
t2v-transformers_1 | File "/app/./vectorizer.py", line 70, in vectorize
t2v-transformers_1 | batch_results = self.get_batch_results(tokens, sentences[start_index:end_index])
t2v-transformers_1 | File "/app/./vectorizer.py", line 52, in get_batch_results
t2v-transformers_1 | return self.model_delegate.get_batch_results(tokens, text)
t2v-transformers_1 | File "/app/./vectorizer.py", line 94, in get_batch_results
t2v-transformers_1 | return self.model(**tokens)
t2v-transformers_1 | File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
t2v-transformers_1 | return forward_call(*input, **kwargs)
t2v-transformers_1 | File "/usr/local/lib/python3.9/site-packages/transformers/models/bert/modeling_bert.py", line 991, in forward
t2v-transformers_1 | extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)
t2v-transformers_1 | File "/usr/local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 839, in get_extended_attention_mask
t2v-transformers_1 | extended_attention_mask = extended_attention_mask.to(dtype=self.dtype) # fp16 compatibility
t2v-transformers_1 | RuntimeError: CUDA error: no kernel image is available for execution on the device
t2v-transformers_1 | CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
t2v-transformers_1 | For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
t2v-transformers_1 | INFO: 192.168.96.1:34934 - "POST /vectors/ HTTP/1.1" 500 Internal Server Error
This is linked to this issue: https://github.com/semi-technologies/t2v-transformers-models/issues/35
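(A common fix for that class of issue is to rebuild the inference image with a PyTorch build whose kernel images include sm_86, i.e. any CUDA 11.x wheel. A sketch of a derived image — the base-image tag below is just one example from the transformers-inference registry, not necessarily the one in use here:)

```dockerfile
# Swap the bundled PyTorch for a CUDA 11.x build that includes sm_86 (Ampere).
FROM semitechnologies/transformers-inference:sentence-transformers-msmarco-distilbert-base-v3
RUN pip install --upgrade torch --extra-index-url https://download.pytorch.org/whl/cu113
```

After rebuilding, `python -c "import torch; print(torch.cuda.get_arch_list())"` inside the container should list sm_86 before retrying the request.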
Hi,
I followed the online tutorial: https://weaviate.io/developers/weaviate/current/tutorials/semantic-search-through-wikipedia.html#3-step-tutorial
When searching within the Weaviate console, the query returned an answer as expected. But when trying to run the same query through the Weaviate Python client, I got the following error message:
{'data': {'Get': {'Paragraph': None}}, 'errors': [{'locations': [{'column': 6, 'line': 1}], 'message': 'explorer: get class: vectorize params: vectorize params: vectorize params: vectorize keywords: remote client vectorize: fail with status 500: CUDA error: no kernel image is available for execution on the device\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.', 'path': ['Get', 'Paragraph']}]}
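(For context, the console and the Python client end up sending the same GraphQL to Weaviate, so a 500 here points at the vectorizer module rather than at the client. A rough sketch of the query both produce — the class name comes from the error above, while the field names and concepts are placeholders:)

```python
def near_text_query(class_name, fields, concepts, limit=5):
    """Build the GraphQL Get query that a nearText search sends to Weaviate."""
    concept_list = ", ".join(f'"{c}"' for c in concepts)
    field_list = " ".join(fields)
    return (
        f"{{ Get {{ {class_name}("
        f"nearText: {{concepts: [{concept_list}]}}, limit: {limit}"
        f") {{ {field_list} }} }} }}"
    )

print(near_text_query("Paragraph", ["content", "title"], ["housing prices"]))
```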
Would you have any idea?