mithril-security / blindai

Confidential AI deployment with secure enclaves :lock:
https://www.mithrilsecurity.io/
Apache License 2.0

[Question] How to use multiple inputs for model? #71

Closed failable closed 2 years ago

failable commented 2 years ago

Hi,

How can I upload a model with multiple inputs? The DistilBERT example uses a single input, but pre-trained models commonly take several. What should I pass to dtype and shape in this case?

Thanks.

JoFrost commented 2 years ago

Hi there,

Unfortunately, models with multiple inputs are not supported yet. However, we plan to implement this feature soon.

dhuynh95 commented 2 years ago

Hi @liebkne !

As @JoFrost mentioned, we do not support that yet. Could you give us more details about the model and the workflow you want to cover with BlindAI so we can help you better?

Best :)

failable commented 2 years ago

Hi, thanks for the quick response.

Models from the transformers package generally take multiple inputs, like this one:

import os

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "albert-base-v2"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

text = "Paris is the [MASK] of France."
tokenizer_output = tokenizer(text, return_tensors="pt")

input_ids = tokenizer_output["input_ids"]
attention_mask = tokenizer_output["attention_mask"]
token_type_ids = tokenizer_output["token_type_ids"]

dynamic_axes = {
    0: "batch",
    1: "seq",
}

output_dir = "./albert"
os.makedirs(output_dir, exist_ok=True)
torch.onnx.export(
    model,
    (input_ids, attention_mask, token_type_ids),
    os.path.join(output_dir, "model.onnx"),
    input_names=["input_ids", "attention_mask", "token_type_ids"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": dynamic_axes,
        "attention_mask": dynamic_axes,
        "token_type_ids": dynamic_axes,
        "logits": dynamic_axes,
    },
    opset_version=13,
)

tokenizer.save_pretrained(output_dir)

In this case, all three inputs (input_ids, attention_mask and token_type_ids) have the same dtype and shape. But of course, some models take inputs with different dtypes and shapes.

dhuynh95 commented 2 years ago

OK, I see. token_type_ids mostly carry useful information when you train on tasks such as question answering. This information is not needed for inference.

Regarding attention_mask, if you want to predict on a single sentence, this information is not needed either: the mask is all ones, so the model produces the same output without it.

>>> from transformers import AlbertTokenizer, AlbertModel
>>> tokenizer = AlbertTokenizer.from_pretrained('albert-base-v2')
>>> model = AlbertModel.from_pretrained("albert-base-v2")
>>> text = "Replace me by any text you'd like."
>>> encoded_input = tokenizer(text, return_tensors='pt')

# Compute the output with both input ids and attention mask
>>> output_with_attention = model(**encoded_input)

# Compute with only input ids
>>> output_without_attention = model(input_ids=encoded_input["input_ids"])

>>> ((output_with_attention[0] - output_without_attention[0]) ** 2).sum()
tensor(0., grad_fn=<SumBackward0>)

This code can help you test it.
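
To see why an all-ones mask changes nothing, here is a minimal standard-library sketch of additive attention masking, where masked positions get a large negative bias before the softmax. The function names and the scores are made up for illustration, and this is a simplification, not the actual ALBERT implementation.

```python
import math

def softmax(scores):
    """Plain softmax over a list of floats."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def masked_softmax(scores, mask):
    """Softmax where positions with mask == 0 receive (almost) zero weight."""
    # Additive masking: a large negative bias pushes masked scores out of the softmax.
    biased = [s if m == 1 else s - 1e9 for s, m in zip(scores, mask)]
    return softmax(biased)

scores = [2.0, 1.0, 0.5, -1.0]  # hypothetical attention scores for one query

all_ones = masked_softmax(scores, [1, 1, 1, 1])  # single sentence, no padding
padded = masked_softmax(scores, [1, 1, 0, 0])    # last two positions are padding
```

With the all-ones mask, the scores are left untouched, so the result is identical to the unmasked softmax. With padding, the masked positions drop out and the remaining weights are renormalized.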

I do agree, though: if we want to batch sentences before sending them to BlindAI, this information is required. In that case, we will have to see how we can let users provide more complex input formats.

Could you tell me more about the setup you have in mind? Do you need to batch sentences before sending them?
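
As a rough illustration of why batching needs the mask, the sketch below right-pads token-id lists of different lengths and builds the matching attention masks. The helper `pad_batch`, the padding id of 0 and the token ids are assumptions for the example; in practice the tokenizer can do this itself when called with `padding=True`.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad token-id lists to a common length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask

# Made-up token ids, for illustration only.
batch = [
    [2, 1162, 25, 3],
    [2, 48, 51, 2001, 20, 3],
]
ids, mask = pad_batch(batch)
```

Here the shorter sentence gets two padding tokens, and its mask ends in two zeros. Without the mask, the model would attend to those padding tokens as if they were real input.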

kbamponsem commented 2 years ago

Hi @liebkne, following your request, the current release of BlindAI supports multiple inputs.

You can now upload transformer models with multiple inputs, then pass a list of lists as the input data to run inference.

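# `client` is assumed to be an already connected BlindAI client.
client.upload_model(
    "your_model.onnx",
    tensor_inputs=[
        ([1, 9], ModelDatumType.I64),  # input_ids
        ([1, 9], ModelDatumType.I64),  # attention_mask
        ([1, 9], ModelDatumType.I64),  # token_type_ids
    ],
    tensor_outputs=ModelDatumType.F32,
)

# One inner list per input, in the same order as tensor_inputs.
client.run_model(
    [
        [20, 48, 51, 2001, 20, 5, 41, 45, 920],  # input_ids
        [1, 1, 1, 1, 1, 1, 1, 1, 1],             # attention_mask
        [0, 0, 0, 0, 0, 0, 0, 0, 0],             # token_type_ids
    ]
)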
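
Since each inner list has to match the shape declared at upload time, a small client-side check before calling run_model can catch mismatches early. The sketch below is a hypothetical helper, not part of the BlindAI client, and assumes each input is passed as a flat list.

```python
from math import prod

def check_inputs(inputs, declared_shapes):
    """Raise ValueError if a flat input does not match its declared shape."""
    if len(inputs) != len(declared_shapes):
        raise ValueError(
            f"expected {len(declared_shapes)} inputs, got {len(inputs)}"
        )
    for index, (values, shape) in enumerate(zip(inputs, declared_shapes)):
        expected = prod(shape)  # number of elements the shape implies
        if len(values) != expected:
            raise ValueError(
                f"input {index}: expected {expected} values for shape {shape}, "
                f"got {len(values)}"
            )

declared_shapes = [[1, 9], [1, 9], [1, 9]]
inputs = [
    [20, 48, 51, 2001, 20, 5, 41, 45, 920],
    [1] * 9,
    [0] * 9,
]
check_inputs(inputs, declared_shapes)  # passes silently when everything matches
```

If an input is too short or an input is missing, the helper raises a ValueError naming the offending position, which is easier to act on than a failure reported after the request is sent.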

Do inform us if you still have questions about the added support.