zama-ai / concrete-ml

Concrete ML: Privacy Preserving ML framework built on top of Concrete, with bindings to traditional ML frameworks.

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #760

Open nazarpysko opened 1 week ago

nazarpysko commented 1 week ago

Summary

I'm new to FHE and was trying out the different tutorials listed on the zama.ai website. More specifically, while executing the Quantization Aware Training notebook, I got this error in the eleventh code cell:

# Compile the model using a representative input-set
quantized_numpy_module = compile_brevitas_qat_model(torch_model, X_train)

The full error traceback looks like this:

RuntimeError                              Traceback (most recent call last)
Cell In[17], line 2
      # Compile the model using a representative input-set
----> quantized_numpy_module = compile_brevitas_qat_model(torch_model, X_train)

File ~/anaconda3/envs/test-concreteml/lib/python3.8/site-packages/concrete/ml/torch/compile.py:492, in compile_brevitas_qat_model(torch_model, torch_inputset, n_bits, configuration, artifacts, show_mlir, rounding_threshold_bits, p_error, global_p_error, output_onnx_file, verbose, inputs_encryption_status, reduce_sum_copy)
    # Here we add a "eliminate_nop_pad" optimization step for onnxoptimizer
    # https://github.com/onnx/optimizer/blob/master/onnxoptimizer/passes/eliminate_nop_pad.h#L5
    # It deletes 0-values padding.
    # In the export function, the `args` parameter is used instead of the `input_shape` one in
    # order to be able to handle multi-inputs models
    exporter.onnx_passes += [
        "eliminate_nop_pad",
        "fuse_pad_into_conv",
        "fuse_matmul_add_bias_into_gemm",
    ]
--> [492] onnx_model = exporter.export(
    [493]     torch_model,
    [494]     args=dummy_input_for_tracing,
    [495]     export_path=str(output_onnx_file_path),
    [496]     keep_initializers_as_inputs=False,
    [497]     opset_version=OPSET_VERSION_FOR_ONNX_EXPORT,
    [498] )
...
---> [70] y = x / scale
    [71] y = y + zero_point
    [72] min_int_val = self.min_int(bit_width)

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
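
For context, here is a minimal sketch of how the notebook ends up in this state on my machine. TinyQATModel is a hypothetical stand-in for the tutorial's network, not the actual notebook code:

import numpy as np
import torch
import brevitas.nn as qnn
from concrete.ml.torch.compile import compile_brevitas_qat_model

# Hypothetical minimal Brevitas QAT model standing in for the tutorial's network
class TinyQATModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.quant_inp = qnn.QuantIdentity(bit_width=3, return_quant_tensor=True)
        self.fc = qnn.QuantLinear(2, 2, bias=True, weight_bit_width=3)

    def forward(self, x):
        return self.fc(self.quant_inp(x))

# As in the tutorial, the device is picked automatically, so on a machine with a
# GPU the model parameters end up on cuda:0
device = "cuda" if torch.cuda.is_available() else "cpu"
torch_model = TinyQATModel().to(device)

# The representative input-set is a NumPy array, i.e. CPU-side data
X_train = np.random.rand(100, 2).astype(np.float32)

# On a CUDA machine this raises the RuntimeError above (the dummy tracing input is
# built from the NumPy input-set on the CPU while the model parameters are on cuda:0)
quantized_numpy_module = compile_brevitas_qat_model(torch_model, X_train)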

Environment description

OS: Ubuntu (WSL 2)
Python version: 3.8.19
concrete-ml version: 1.5.0

RomanBredehoft commented 1 week ago

Hello @nazarpysko, this looks more like a PyTorch issue 🤔 Could you confirm:

By the way, I invite you to check out and use our latest version, Concrete ML 1.6.0, released just a few days ago 😉

nazarpysko commented 1 week ago

I can confirm that:

Sure! I will try the latest version of concrete-ml.

nazarpysko commented 1 week ago

I also tried moving from a conda env to a Python venv, but got the same error. I used Python 3.10, torch 1.13.1, and concrete-ml 1.6.0.

RomanBredehoft commented 1 week ago

Hello again @nazarpysko, looks like there is something going on with the CPU/GPU then. Could you try removing the `device = "cuda" if torch.cuda.is_available() else "cpu"` line (cell 7 I think) and writing `device = "cpu"` instead? Or at least send your model to the CPU with `torch_model = torch_model.to("cpu")` after training!
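
To make the two options concrete, a quick sketch using the notebook's variable names (exact cell contents may differ on your side):

# Option 1: skip the GPU entirely (replace the device selection line, cell 7)
device = "cpu"

# Option 2: keep training on the GPU, then move the model back to the CPU before compiling
torch_model = torch_model.to("cpu")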

jfrery commented 1 week ago

Hi @nazarpysko,

You are right: there is a problem when the machine has a GPU available, since the data will be on the CPU and the model on the GPU.

You can add a

torch_model = torch_model.cpu()

right before calling `compile_brevitas_qat_model` and it should work fine. We are fixing the notebook in main (https://github.com/zama-ai/concrete-ml/pull/767).
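
In context, the compile cell would then look like this (a sketch using the notebook's existing names):

# Move all parameters and buffers back to the CPU, matching the NumPy input-set
torch_model = torch_model.cpu()

# Compile the model using a representative input-set
quantized_numpy_module = compile_brevitas_qat_model(torch_model, X_train)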

Thanks for the issue!