microsoft / DirectML

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
MIT License

Operator GRU does not accept input buffer at index 2 #81

Open Ike-Dubaku opened 3 years ago

Ike-Dubaku commented 3 years ago

Here is the error I get when binding input buffers for the GRU operator:

D3D12 ERROR: : the dispatchable object expects nothing to be bound at index 2, but a binding of type DML_BINDING_TYPE_BUFFER was provided. Use binding type NONE to bind nothing to this slot. [ UNKNOWN ERROR #1: STRING_FROM_APPLICATION]

According to the GRU description linked below, the operator needs at least three inputs: the data input, the weight, and the recurrence. The BindInputs function seems to accept a tensor at index 0 (data input) and index 1 (weight), but not at index 2 (recurrence). Am I misunderstanding something? Is there any sample code for GRU usage?

https://docs.microsoft.com/en-us/windows/win32/api/directml/ns-directml-dml_gru_operator_desc
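For reference, the tensor shapes that the three required GRU inputs are expected to have can be written out as a small helper. This is only a sketch based on my reading of the DML_GRU_OPERATOR_DESC documentation linked above; `Dims`, `GruShapes`, and `MakeGruShapes` are hypothetical names for illustration and are not part of DirectML or DirectMLX.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper types, not part of DirectML or DirectMLX.
using Dims = std::array<uint32_t, 4>;

struct GruShapes
{
    Dims input;           // { 1, seqLength, batchSize, inputSize }
    Dims weight;          // { 1, numDirections, 3 * hiddenSize, inputSize }
    Dims recurrence;      // { 1, numDirections, 3 * hiddenSize, hiddenSize }
    Dims outputSequence;  // { seqLength, numDirections, batchSize, hiddenSize }
    Dims outputSingle;    // { 1, numDirections, batchSize, hiddenSize }
};

// Computes the shapes of the required GRU tensors.
// Assumption: the layouts follow the DML_GRU_OPERATOR_DESC documentation.
inline GruShapes MakeGruShapes(
    uint32_t seqLength,
    uint32_t batchSize,
    uint32_t inputSize,
    uint32_t hiddenSize,
    bool bidirectional = false)
{
    assert(seqLength > 0 && batchSize > 0 && inputSize > 0 && hiddenSize > 0);

    const uint32_t numDirections = bidirectional ? 2u : 1u;

    GruShapes shapes{};
    shapes.input          = { 1u, seqLength, batchSize, inputSize };
    shapes.weight         = { 1u, numDirections, 3u * hiddenSize, inputSize };
    shapes.recurrence     = { 1u, numDirections, 3u * hiddenSize, hiddenSize };
    shapes.outputSequence = { seqLength, numDirections, batchSize, hiddenSize };
    shapes.outputSingle   = { 1u, numDirections, batchSize, hiddenSize };
    return shapes;
}
```

For a forward GRU with a sequence length and batch size of 1, `MakeGruShapes(1, 1, inputSize, hiddenSize)` gives a weight of `{ 1, 1, 3 * hiddenSize, inputSize }` and a recurrence of `{ 1, 1, 3 * hiddenSize, hiddenSize }`.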

adtsai commented 3 years ago

Hi, the documentation is correct here. The first three inputs of DML_OPERATOR_GRU are always required. We're not aware of any issues with this - could you provide some more details about how you're calling the API? What are the exact contents of the DML_GRU_OPERATOR_DESC struct that you're sending down to IDMLDevice::CreateOperator? And are you sure your binding table is referencing the correct compiled operator?

Ike-Dubaku commented 3 years ago

Below is partial code from my project. Is there a code sample for GRU? If one exists, I can study it as well.

Here is the code that creates the GRU operator using the DirectMLX API dml::GRU:

```cpp
{
    Dimensions modelInputSizes = { 1, 1, 1, GRU_IN };
    auto modelInput = dml::InputTensor(graph, 0, dml::TensorDesc(dataType, modelInputSizes, policy));

    auto gruWeight = dml::InputTensor(graph, 1, dml::TensorDesc(dataType, DML_TENSOR_FLAG_NONE, { 1, 1, 3 * GRU_OUT, GRU_IN }, policy));
    auto gruRecurrence = dml::InputTensor(graph, 2, dml::TensorDesc(dataType, DML_TENSOR_FLAG_NONE, { 1, 1, 3 * GRU_OUT, GRU_OUT }, policy));
    auto gru1 = dml::GRU(
        modelInput,
        gruWeight,
        gruRecurrence,
        std::nullopt,
        std::nullopt,
        std::nullopt,
        { dml::FusedActivation::Sigmoid(), dml::FusedActivation::Tanh() },
        DML_RECURRENT_NETWORK_DIRECTION_FORWARD,
        false,
        dml::GRUOutputOptions::Both);

    DML_EXECUTION_FLAGS executionFlags = DML_EXECUTION_FLAG_ALLOW_HALF_PRECISION_COMPUTATION;
    m_dmlGraph = graph.Compile(executionFlags, std::array<dml::Expression, 2>{ gru1.sequence, gru1.single });
}
```

Here is the code that creates the binding table:

```cpp
DML_BINDING_PROPERTIES executeBindingProps = m_dmlGraph->GetBindingProperties();
tableDesc.Dispatchable = m_dmlGraph.Get();
tableDesc.SizeInDescriptors = executeBindingProps.RequiredDescriptorCount;
DX::ThrowIfFailed(m_dmlDevice->CreateBindingTable(&tableDesc, IID_PPV_ARGS(&m_dmlBindingTable)));
```

Here is the code that calls BindInputs. The error occurs if I change inputBindings[2] to a non-empty binding.

```cpp
DML_BUFFER_BINDING bufferBindings[] =
{
    {}, // model input
    { m_modelGruWeight.Get(), 0, m_modelGruWeight->GetDesc().Width },
    { m_modelGruRecurrence.Get(), 0, m_modelGruRecurrence->GetDesc().Width },
};
bufferBindings[0] = DML_BUFFER_BINDING{ m_modelInput.Get() };
DML_BINDING_DESC inputBindings[] =
{
    { DML_BINDING_TYPE_BUFFER, &bufferBindings[0] }, // model input
    { DML_BINDING_TYPE_BUFFER, &bufferBindings[1] },
    { DML_BINDING_TYPE_NONE, nullptr },
};
m_dmlBindingTable->BindInputs(ARRAYSIZE(inputBindings), inputBindings);
```

Ike-Dubaku commented 3 years ago

By the way, in the current implementation of the GRU expression in DirectMLX.h, if the 'hiddenInit' parameter is std::nullopt, 'hiddenInitTensor.sizes[3]' cannot be validly accessed inside the function. Is this a bug?

inline GRUOutputs GRU( Expression input, Expression weight, Expression recurrence, Optional<Expression> bias, Optional<Expression> hiddenInit, Optional<Expression> sequenceLengths, Span<const FusedActivation> activationDescs, DML_RECURRENT_NETWORK_DIRECTION direction, bool linearBeforeReset, GRUOutputOptions outputOptions)
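To illustrate the concern, here is a minimal sketch of one way to obtain the hidden size without reading from an empty optional. It is not the actual DirectMLX code or its eventual fix: `TensorInfo` and `HiddenSizeFrom` are hypothetical stand-ins, and the fallback assumes the recurrence tensor is laid out as `{ 1, numDirections, 3 * hiddenSize, hiddenSize }`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the size information of a tensor description.
struct TensorInfo
{
    std::array<uint32_t, 4> sizes;
};

// Returns the hidden size for a GRU.
// If hiddenInit is present, its last dimension is used; otherwise the value
// is derived from the recurrence tensor, which is always required.
// Assumption: recurrence is { 1, numDirections, 3 * hiddenSize, hiddenSize }.
inline uint32_t HiddenSizeFrom(
    const TensorInfo& recurrence,
    const std::optional<TensorInfo>& hiddenInit)
{
    // Sanity check on the assumed recurrence layout.
    assert(recurrence.sizes[2] == 3u * recurrence.sizes[3]);

    if (hiddenInit.has_value())
    {
        return hiddenInit->sizes[3];
    }

    // No hiddenInit was supplied, so do not dereference the optional.
    return recurrence.sizes[3];
}
```

With this shape of check, passing std::nullopt for hiddenInit never touches the optional's contents.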

adtsai commented 3 years ago

Hi, we've looked into this a bit deeper and you're right - this appears to be a bug. Specifically, DML may incorrectly reject valid bindings in some cases when the RNN, LSTM, or GRU operators are used inside a DirectML graph (which is what DirectMLX is based on). We'll try to fix this for the next release of DirectML. In the meantime, one possible way to work around this issue is to call the GRU operator directly, i.e. as a standalone operator using DML_GRU_OPERATOR_DESC without going through DMLX.

And the uninitialized access of hiddenInit in DirectMLX.h does appear to be wrong - we'll need to fix that. Thanks for the bug report!
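For anyone trying the standalone workaround, the input binding layout differs from the graph case: as I understand IDMLBindingTable::BindInputs, a standalone operator expects one binding per input tensor slot in its desc, with omitted optional tensors bound as NONE. The sketch below models that layout with stand-in types so it compiles without the Windows SDK. `BindingType`, `GruOptionalInputs`, and `StandaloneGruInputBindingTypes` are hypothetical names, and the slot order is assumed to follow the tensor fields of DML_GRU_OPERATOR_DESC.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for DML_BINDING_TYPE_NONE / DML_BINDING_TYPE_BUFFER (hypothetical).
enum class BindingType { None, Buffer };

// Which optional GRU inputs the caller actually provides (hypothetical).
struct GruOptionalInputs
{
    bool bias = false;
    bool hiddenInit = false;
    bool sequenceLengths = false;
};

// Number of input tensor slots in a standalone GRU operator:
// Input, Weight, Recurrence, Bias, HiddenInit, SequenceLengths.
constexpr std::size_t kGruInputSlotCount = 6;

// Returns the binding type for every input slot, in desc order.
// The first three slots are always required; unused optional slots are None.
inline std::array<BindingType, kGruInputSlotCount> StandaloneGruInputBindingTypes(
    const GruOptionalInputs& optional)
{
    const auto pick = [](bool present) {
        return present ? BindingType::Buffer : BindingType::None;
    };

    return {{
        BindingType::Buffer,             // 0: input
        BindingType::Buffer,             // 1: weight
        BindingType::Buffer,             // 2: recurrence
        pick(optional.bias),             // 3: bias
        pick(optional.hiddenInit),       // 4: hiddenInit
        pick(optional.sequenceLengths),  // 5: sequenceLengths
    }};
}
```

In real code each entry would become a DML_BINDING_DESC, with a DML_BUFFER_BINDING behind every Buffer slot and a null pointer behind every None slot.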