databricks-industry-solutions / mfg-llm-qa-bot

RuntimeError on 04_Assemble_App (Expected all tensors to be on the same device) #16

Closed agmpt closed 11 months ago

agmpt commented 11 months ago

Running 02_Define_Basic_Search on Azure, when calling the model to predict on any search I get a RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
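For reference, this error typically appears when accelerate shards the model across both GPUs of a multi-GPU node (e.g. device_map="auto") while the rest of the chain assumes a single device. A minimal sketch of pinning the whole model to cuda:0, assuming the notebook loads a Hugging Face model through transformers (the model name below is only a placeholder, not necessarily what 02_Define_Basic_Search uses):

```python
# Sketch (not from the repo): force every model layer onto one GPU so all
# tensors share a device. The model name is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "mosaicml/mpt-7b-instruct"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},  # place everything on cuda:0 instead of sharding across GPUs
    trust_remote_code=True,
)
qa_pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)
```

Alternatively, hiding the second GPU with os.environ["CUDA_VISIBLE_DEVICES"] = "0" before torch is first imported should have the same effect without switching to a single-GPU node.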

agmpt commented 11 months ago

Problem solved by selecting the Standard_NC6e_v3 compute, which has a single GPU. I can now run successfully through notebook 04_Assemble_App, but on notebook 05_Deploy_Model I get the error:

{"name": "mfg-llm-qabot-serving-endpoint", "config": {"served_models": [{"model_name": "mfg-llm-qabot", "model_version": "4", "workload_type": "GPU_MEDIUM", "workload_size": "Small", "scale_to_zero_enabled": "false"}]}} {'error_code': 'INVALID_PARAMETER_VALUE', 'message': "Workload size 'Small' is not supported. Please choose a node type from "}

I think the problem is that GPU model serving is not yet available on Azure Databricks; I'll check with my Databricks account team.
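For context, the payload above is what gets sent to the Databricks serving-endpoints REST API. A rough sketch of the call, assuming workspace_url and token are defined earlier in the notebook, and noting that the accepted workload_type/workload_size combinations depend on whether GPU serving is enabled for the workspace and region:

```python
# Sketch: create the model serving endpoint via the Databricks REST API.
# workspace_url and token are assumed to be defined elsewhere.
import requests

endpoint_config = {
    "name": "mfg-llm-qabot-serving-endpoint",
    "config": {
        "served_models": [
            {
                "model_name": "mfg-llm-qabot",
                "model_version": "4",
                # GPU workload types require GPU serving to be available in the
                # workspace; workload_size must be one the workspace actually offers.
                "workload_type": "GPU_MEDIUM",
                "workload_size": "Small",
                "scale_to_zero_enabled": False,  # boolean in the API, not the string "false"
            }
        ]
    },
}

resp = requests.post(
    f"{workspace_url}/api/2.0/serving-endpoints",
    headers={"Authorization": f"Bearer {token}"},
    json=endpoint_config,
)
print(resp.status_code, resp.json())
```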