jakubedzior opened this issue 1 month ago
I'm curious, is this for debugging or profiling purposes? That's not normally something one would want to disable. I know next to nothing about GenAI, but spelunking the ORT code, it appears there's a configuration option via ORT: https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/dml/dml_provider_factory.cc#L437
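For reference, at the plain ORT level (outside GenAI) the option in that file is passed as a per-provider option when creating a session. A minimal sketch, assuming the onnxruntime Python API; the key name "disable_metacommands" follows the parsing in the linked dml_provider_factory.cc, but the value format ("true") and the model path are illustrative assumptions:

```python
# Per-provider options for the DML execution provider. The key name mirrors
# the option parsed in ORT's dml_provider_factory.cc; the string value "true"
# is an assumption about the accepted format.
providers = [
    ("DmlExecutionProvider", {"disable_metacommands": "true"}),
]

# With onnxruntime (built with the DML EP) installed, a session would then
# be created along these lines ("model.onnx" is a placeholder path):
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx", providers=providers)
```

This only covers plain ORT sessions, not models run through GenAI.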
Hi, I know it can be done in ORT, but I need it in GenAI, let's say for debugging purposes 😊 I wish there were an option to set in code (or in genai_config.json), or even a workaround of some kind, that would give access to the DML provider's ORT object so metacommands could be disabled. Perhaps there is another way; unfortunately I don't know much about GenAI either, and the docs seem to be silent about it.
EDIT: Perhaps there is a way to access the disable metacommands option via genai_config.json provider options. I've tried to modify the following part:
```json
..."provider_options": [
    {
        "dml": {}
    }
]...
```
to:
```json
..."provider_options": [
    {
        "dml": {"disable_metacommands": true}
    }
]...
```
Unfortunately, no luck there. But perhaps that's just incorrect syntax! I'll try looking into the code snippet you provided; it looks like those ORT provider options may in fact be what we can specify here in GenAI, in genai_config.json.
EDIT2: Having looked into it, I have no clue 😿 If that's indeed the solution, could someone provide the proper syntax?
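One untested possibility, in case the JSON boolean is the problem: ORT provider options are maps of strings to strings, so a quoted "1" or "true" might parse where a bare `true` does not. A sketch that patches an in-memory copy of the config; the nesting path (`model` → `decoder` → `session_options` → `provider_options`), the key name, and the string value are all assumptions, not confirmed GenAI behavior:

```python
import json

def set_dml_option(config, key, value):
    """Set key/value on every 'dml' entry found under provider_options.

    The config layout assumed here mirrors the genai_config.json snippets
    above; whether GenAI forwards the option to the ORT DML EP is exactly
    the open question in this thread.
    """
    opts = (config.get("model", {})
                  .get("decoder", {})
                  .get("session_options", {})
                  .get("provider_options", []))
    for entry in opts:
        if "dml" in entry:
            entry["dml"][key] = value
    return config

# Example against a config shaped like the snippet above; a string "1" is
# used instead of a JSON boolean, since ORT provider options are string maps.
cfg = {"model": {"decoder": {"session_options": {
    "provider_options": [{"dml": {}}]}}}}
cfg = set_dml_option(cfg, "disable_metacommands", "1")
print(json.dumps(cfg["model"]["decoder"]["session_options"]["provider_options"]))
```

The same dictionary could be loaded from and written back to genai_config.json with `json.load`/`json.dump`.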
Describe the documentation issue
Is there currently any way to run models quantized by GenAI on DirectML with metacommands turned off?
Page / URL
No response