Open Matthieu-Tinycoaching opened 1 year ago
HI @Matthieu-Tinycoaching –
This seems to be the same issue as semi-technologies/weaviate#8
Can you reach a single module directly? From what I can see it looks like your GPU isn't loading the model properly.
What does this result in (make sure to set the correct port)?
$ curl -XPOST -H 'Content-Type: application/json' http://localhost:8084/vectors/ -d'{"text": "Can I have a vector?"}'
PS: If you join our Slack channel, more people might be able to help
Hi @bobvanluijt,
It seems to give the same error, but not at the same step: for semi-technologies/weaviate#8 it happens during import, while for semi-technologies/weaviate#9 it happens during inference.
If I try the curl command you gave me with the correct port:
curl -XPOST -H 'Content-Type: application/json' http://localhost:8080/vectors/ -d'{"text": "Can I have a vector?"}'
It gives me the following message:
{"code":404,"message":"path /vectors/ was not found"}
What is strange is that with weaviate console on localhost it works...
I will send this question to the Slack channel too.
So @Matthieu-Tinycoaching – can you try one more thing?
$ docker ps
$ docker exec -it ID_OF_CONTAINER /bin/bash
$ curl -XPOST -H 'Content-Type: application/json' http://t2v-transformers-01-001/vectors/ -d'{"text": "Can I have a vector?"}'
(you might need to install curl). What this should do is get some info from the containers running the ML models. I'm still thinking something is going wrong there.
Hi @bobvanluijt, I followed your test procedure and got the following error message:
curl: (7) Failed to connect to t2v-transformers port 80: Connection refused
It seems that on Ubuntu 18.04 LTS something is listening on port 80, blocking communication on this port (output of netstat -tlpn):
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:39445 0.0.0.0:* LISTEN 17320/code
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:5433 0.0.0.0:* LISTEN -
tcp6 0 0 :::80 :::* LISTEN -
tcp6 0 0 ::1:631 :::* LISTEN -
Is there a way to change this port to another number?
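(Side note: the refused connection above happens because curl defaults to port 80 when no port is given, while the transformers container listens on 8080 inside the Docker network; the host's port-80 listener isn't actually involved. If you do want to expose the module on a different host port, a docker-compose override along these lines should work — the service name below is assumed from the tutorial's docker-compose.yml:)

```yaml
services:
  t2v-transformers:
    # The container keeps serving on 8080; only the host-side binding changes.
    ports:
      - "9090:8080"   # host port 9090 -> container port 8080
```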
Hi @bobvanluijt,
I tried with the appropriate port 8080: curl -XPOST -H 'Content-Type: application/json' http://t2v-transformers-01-001:8080/vectors/ -d'{"text": "Can I have a vector?"}'
and got the following error: {"error":"CUDA error: no kernel image is available for execution on the device\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1."}
Running without discrete mode gives the following warning:
t2v-transformers_1 | /usr/local/lib/python3.9/site-packages/torch/cuda/__init__.py:146: UserWarning:
t2v-transformers_1 | NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
t2v-transformers_1 | The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
t2v-transformers_1 | If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
t2v-transformers_1 |
t2v-transformers_1 | warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
t2v-transformers_1 | INFO: Application startup complete.
t2v-transformers_1 | INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
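The warning above is the root cause of the CUDA error that follows: the PyTorch wheel baked into the image ships compiled kernels only for sm_37 through sm_70, while the RTX 3090 is compute capability sm_86. A minimal sketch of the mismatch (not PyTorch's actual dispatch logic, and ignoring PTX forward-compatibility):

```python
# Arch list copied from the warning above; note that sm_86 is absent.
SUPPORTED_ARCHS = {"sm_37", "sm_50", "sm_60", "sm_70"}

def has_kernel_image(device_arch: str, supported=SUPPORTED_ARCHS) -> bool:
    """A CUDA kernel can only run if the wheel ships a binary for the GPU's arch."""
    return device_arch in supported

print(has_kernel_image("sm_70"))  # e.g. a V100: True
print(has_kernel_image("sm_86"))  # RTX 3090: False -> "no kernel image" error
```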
When running the following request: curl -XPOST -H 'Content-Type: application/json' http://localhost:9000/vectors/ -d'{"text": "Can I have a vector?"}'
This gave the following error:
t2v-transformers_1 | ERROR: Something went wrong while vectorizing data.
t2v-transformers_1 | Traceback (most recent call last):
t2v-transformers_1 | File "/app/./app.py", line 51, in read_item
t2v-transformers_1 | vector = await vec.vectorize(item.text, item.config)
t2v-transformers_1 | File "/app/./vectorizer.py", line 70, in vectorize
t2v-transformers_1 | batch_results = self.get_batch_results(tokens, sentences[start_index:end_index])
t2v-transformers_1 | File "/app/./vectorizer.py", line 52, in get_batch_results
t2v-transformers_1 | return self.model_delegate.get_batch_results(tokens, text)
t2v-transformers_1 | File "/app/./vectorizer.py", line 94, in get_batch_results
t2v-transformers_1 | return self.model(**tokens)
t2v-transformers_1 | File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
t2v-transformers_1 | return forward_call(*input, **kwargs)
t2v-transformers_1 | File "/usr/local/lib/python3.9/site-packages/transformers/models/bert/modeling_bert.py", line 991, in forward
t2v-transformers_1 | extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)
t2v-transformers_1 | File "/usr/local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 839, in get_extended_attention_mask
t2v-transformers_1 | extended_attention_mask = extended_attention_mask.to(dtype=self.dtype) # fp16 compatibility
t2v-transformers_1 | RuntimeError: CUDA error: no kernel image is available for execution on the device
t2v-transformers_1 | CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
t2v-transformers_1 | For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
t2v-transformers_1 | INFO: 192.168.96.1:34934 - "POST /vectors/ HTTP/1.1" 500 Internal Server Error
This is linked to this issue: https://github.com/semi-technologies/t2v-transformers-models/issues/35
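(A common fix for that class of issue is to rebuild the inference image with a PyTorch build whose kernel images include sm_86, i.e. any CUDA 11.x wheel. A sketch of a derived image — the base-image tag below is just one example from the transformers-inference registry, not necessarily the one in use here:)

```dockerfile
# Swap the bundled PyTorch for a CUDA 11.x build that includes sm_86 (Ampere).
FROM semitechnologies/transformers-inference:sentence-transformers-msmarco-distilbert-base-v3
RUN pip install --upgrade torch --extra-index-url https://download.pytorch.org/whl/cu113
```

After rebuilding, `python -c "import torch; print(torch.cuda.get_arch_list())"` inside the container should list sm_86 before retrying the request.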
Hi,
I followed the online tutorial: https://weaviate.io/developers/weaviate/current/tutorials/semantic-search-through-wikipedia.html#3-step-tutorial
When searching within the Weaviate console, the query returned an answer as expected. But when trying to run the same query through the Weaviate Python client, I got the following error message:
{'data': {'Get': {'Paragraph': None}}, 'errors': [{'locations': [{'column': 6, 'line': 1}], 'message': 'explorer: get class: vectorize params: vectorize params: vectorize params: vectorize keywords: remote client vectorize: fail with status 500: CUDA error: no kernel image is available for execution on the device\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.', 'path': ['Get', 'Paragraph']}]}
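(For context, the console and the Python client end up sending the same GraphQL to Weaviate, so a 500 here points at the vectorizer module rather than at the client. A rough sketch of the query both produce — the class name comes from the error above, while the field names and concepts are placeholders:)

```python
def near_text_query(class_name, fields, concepts, limit=5):
    """Build the GraphQL Get query that a nearText search sends to Weaviate."""
    concept_list = ", ".join(f'"{c}"' for c in concepts)
    field_list = " ".join(fields)
    return (
        f"{{ Get {{ {class_name}("
        f"nearText: {{concepts: [{concept_list}]}}, limit: {limit}"
        f") {{ {field_list} }} }} }}"
    )

print(near_text_query("Paragraph", ["content", "title"], ["housing prices"]))
```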
Would you have any idea?