microsoft / onnxruntime-extensions

onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime
MIT License

Poor C# documentation, need help implementing some Python examples #740

Open · MithrilMan opened this issue 3 weeks ago

MithrilMan commented 3 weeks ago

Hello, I'm looking for a C# example of how to use SentencePieceTokenizer.

I've seen there is a Python example here: https://github.com/microsoft/onnxruntime-extensions/blob/main/docs/custom_ops.md#sentencepiecetokenizer

But I haven't managed to find a proper way to convert that code into a C# implementation (I'm not even sure that Python code works, to be honest).

The example Python code is this:

url = "https://github.com/microsoft/ort-customops/raw/main/test/data/test_sentencepiece_ops_model__6.txt"
with urllib.request.urlopen(url) as f:
    content = f.read()
model = np.array(list(base64.decodebytes(content.encode())), dtype=np.uint8)

node = onnx.helper.make_node(
    'SentencepieceTokenizer',
    inputs=['inputs', 'nbest_size', 'alpha', 'add_bos', 'add_eos', 'reverse'],
    outputs=['indices', 'output'],
    mapping_file_name='vocabulary.txt',
    unmapping_value="unknown_word",
    model=model
)

inputs = np.array(["Hello world", "Hello world louder"], dtype=object),
nbest_size = np.array([0], dtype=np.float32),
alpha = np.array([0], dtype=np.float32),
add_bos = np.array([0], dtype=np.bool_),
add_eos = np.array([0], dtype=np.bool_),
reverse = np.array([0], dtype=np.bool_)

tokens = array([17486,  1017, 17486,  1017,   155, 21869], dtype=int32)
indices = array([0, 2, 6], dtype=int64)

expect(node, inputs=[inputs, nbest_size, alpha, add_bos, add_eos, reverse],
       outputs=[tokens, indices], name='sp')

How can this code be translated into C#?

Besides fetching the model, what does the onnx.helper.make_node call translate to?

I tried something like this:

string url = "https://github.com/microsoft/ort-customops/raw/main/test/data/test_sentencepiece_ops_model__6.txt";
byte[] model;

using (HttpClient client = new HttpClient())
{
   var response = await client.GetAsync(url);
   response.EnsureSuccessStatusCode();
   var content = await response.Content.ReadAsByteArrayAsync();
   model = Convert.FromBase64String(System.Text.Encoding.UTF8.GetString(content));
}

// Create the inputs
var inputs = new DenseTensor<string>(new[] { "Hello world", "Hello world louder" }, [2]);
var nbest_size = new DenseTensor<float>(new[] { 0.0f }, [1]);
var alpha = new DenseTensor<float>(new[] { 0.0f }, [1]);
var add_bos = new DenseTensor<bool>(new[] { false }, [1]);
var add_eos = new DenseTensor<bool>(new[] { false }, [1]);
var reverse = new DenseTensor<bool>(new[] { false }, [1]);

// Create the named inputs
var namedInputs = new NamedOnnxValue[]
{
            NamedOnnxValue.CreateFromTensor("inputs", inputs),
            NamedOnnxValue.CreateFromTensor("nbest_size", nbest_size),
            NamedOnnxValue.CreateFromTensor("alpha", alpha),
            NamedOnnxValue.CreateFromTensor("add_bos", add_bos),
            NamedOnnxValue.CreateFromTensor("add_eos", add_eos),
            NamedOnnxValue.CreateFromTensor("reverse", reverse)
};

using SessionOptions sessionOptions = new();
sessionOptions.RegisterOrtExtensions(); // or RegisterCustomOpLibraryV2(extensionsDllName, out handle);
sessionOptions.AppendExecutionProvider_CPU();

// Load the model
using var session = new InferenceSession(model, sessionOptions);
// Run inference
using var results = session.Run(namedInputs);
// Extract the results
var tokens = results.First(r => r.Name == "tokens").AsTensor<int>();
var indices = results.First(r => r.Name == "indices").AsTensor<long>();

Console.WriteLine("Tokens: " + string.Join(", ", tokens.ToArray()));
Console.WriteLine("Indices: " + string.Join(", ", indices.ToArray()));

but it fails with Microsoft.ML.OnnxRuntime.OnnxRuntimeException: '[ErrorCode:InvalidArgument] No graph was found in the protobuf.'

How am I supposed to load the SentencepieceTokenizer in C#? How am I supposed to create that node?

Thanks

P.S. If you help me sort it out, I can open a PR with working C# code for some of the extensions.

Craigacp commented 3 weeks ago

The Python code creates the ONNX node proto and then tests it by wrapping it in a graph proto and a model proto before finally passing it into ORT. You'd need to build the node proto in C#, write it out into an ONNX file, and then load that in. Or you can write the tokenizer model out from Python and use it from C#. ML.NET has examples of writing ONNX files from C#.
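
For the second route, a minimal sketch of building and saving such a model in Python might look like the code below. It is not a definitive implementation: it reuses the node variable from the doc example above, renames its outputs to tokens/indices so they match the C# snippet, and assumes the ai.onnx.contrib custom-op domain and the element types shown in the doc example; both should be checked against the op's registered schema.

import onnx
from onnx import helper, TensorProto

# assumption: onnxruntime-extensions custom ops are registered under the 'ai.onnx.contrib' domain
node.domain = 'ai.onnx.contrib'
# rename the node outputs so they match the names the C# code looks up
node.output[:] = ['tokens', 'indices']

# wrap the node in a graph proto; the input element types mirror the doc example
# and may need adjusting to whatever the op schema actually declares
graph = helper.make_graph(
    [node],
    'sentencepiece_tokenizer',
    inputs=[
        helper.make_tensor_value_info('inputs', TensorProto.STRING, [None]),
        helper.make_tensor_value_info('nbest_size', TensorProto.FLOAT, [None]),
        helper.make_tensor_value_info('alpha', TensorProto.FLOAT, [None]),
        helper.make_tensor_value_info('add_bos', TensorProto.BOOL, [None]),
        helper.make_tensor_value_info('add_eos', TensorProto.BOOL, [None]),
        helper.make_tensor_value_info('reverse', TensorProto.BOOL, [None]),
    ],
    outputs=[
        helper.make_tensor_value_info('tokens', TensorProto.INT32, [None]),
        helper.make_tensor_value_info('indices', TensorProto.INT64, [None]),
    ],
)

# wrap the graph in a model proto, declare the custom-op domain, and save it to disk
model_proto = helper.make_model(
    graph,
    opset_imports=[helper.make_opsetid('', 14),
                   helper.make_opsetid('ai.onnx.contrib', 1)],
)
onnx.save(model_proto, 'sentencepiece_tokenizer.onnx')

With that file saved, the C# snippet above should be able to point InferenceSession at sentencepiece_tokenizer.onnx (keeping RegisterOrtExtensions so the custom op resolves) rather than at the raw decoded SentencePiece bytes, which is presumably what produces the "No graph was found in the protobuf" error.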