XiaoMi / nnlib

Fork of https://source.codeaurora.org/quic/hexagon_nn/nnlib
BSD 3-Clause Clear License
54 stars 11 forks source link

How to use hexagon_nn with unsigned PD? #9

Closed apivovarov closed 4 years ago

apivovarov commented 4 years ago

I apologies for posting my question as an issue.

I'm trying to run Qualcomm/Hexagon_SDK/3.4.3/examples/hexagon_nn/tutorials on Xiaomi Mi9 phone SM8150 (SDM855).

Example: 001-nop.c (using libs/hexagon_nn/2.6)

I added #pragma weak remote_session_control and hexnn_controller_request_unsigned_pd() to 001-nop.c.

Tutorial execution in examples/hexagon_nn/tutorials shows:

python tutorials_walkthrough.py -T sm8150 -N
...
---- Run Examples on cDSP ----
---- Runing 001-nop     ----
adb wait-for-device push /root/3.4.3/libs/hexagon_nn/2.6/hexagon_Debug_dynamic_toolv83_v66/ship/libhexagon_nn_skel.so /data/local/tmp/vendor/lib/rfsa/adsp/
/root/3.4.3/libs/hexagon_nn/2.6/hexagon_Debug_dynamic_toolv83_v66/ship/libhexagon_nn_skel.so: 1 file pushed. 1.4 MB/s (5007704 bytes in 3.443s)
adb wait-for-device push /root/3.4.3/examples/hexagon_nn/tutorials/android_Debug_aarch64/ship/001-nop /data/local/tmp/vendor/bin
/root/3.4.3/examples/hexagon_nn/tutorials/android_Debug_aarch64/ship/001-nop: 1 file pushed. 0.6 MB/s (118624 bytes in 0.176s)
adb wait-for-device shell chmod 777 /data/local/tmp/vendor/bin/001-nop
adb wait-for-device shell ADSP_LIBRARY_PATH="/data/local/tmp/vendor/lib/rfsa/adsp" /data/local/tmp/vendor/bin/001-nop
***************** remote_session_control is TRUE ****************
***************** remote_session_control returned 0 ****************
fastrpc_setup Done
Trying to hexagon_nn_config
hexagon_nn_config Done
hexagon_nn_init Done
hexagon_nn_append_node Done
hexagon_nn_prepare Done
Trying hexagon_nn_execute
Whoops... run failed: -1
Test Failed, err=-1

logcat shows that remote_handle_open for libhexagon_nn_skel.so was successfull, but remote_handle_invoke failed

12-21 23:50:18.501  3565  3565 V /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:1724: Successfully opened fastrpc_shell_unsigned_3
12-21 23:50:18.521  3565  3565 V /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:1870: Successfully created user PD on domain 3 (attrs 0x8)
12-21 23:50:18.545  3565  3565 V /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:1006: remote_handle_open: Successfully opened handle 0xed2620 for hexagon_nn on domain 3
12-21 23:50:18.557  3565  3565 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0xffffffff: remote_handle_invoke failed for handle 0xed2620, method 12 on domain 3 (sc 0xc020200)
12-21 23:50:18.557  3565  3566 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0xffffffff: remote_handle_invoke failed for handle 0x3, method 4 on domain 3 (sc 0x4020200)
12-21 23:50:18.558  3565  3566 E /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:244:listener protocol failure ffffffff
12-21 23:50:18.558  3565  3565 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0x27: remote_handle_invoke failed for handle 0xed2620, method 13 on domain 3 (sc 0xd010000)
12-21 23:50:18.558  3565  3566 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0x27: remote_handle_invoke failed for handle 0x3, method 4 on domain 3 (sc 0x4020200)
12-21 23:50:18.558  3565  3566 E /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:251::error: 39: 0 == (nErr = __QAIC_HEADER(adsp_listener_next2)( ctx, nErr, 0, 0, &ctx, &handle, &sc, inBufs, inBufsLen, &inBufsLenReq))
12-21 23:50:18.558  3565  3566 E /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:333:Error 0x27: listener2 thread exited

Full example code 001-nop.c:

#include "remote.h"

#pragma weak remote_session_control

// hexagon_nn.h includes most of the things you'll need to create and run graphs.
// Its most important includes are nn_graph.h, which includes nn_graph_if.h.
// Together, these provide the data-types for input/output tensors,
//   and the API you'll use for initializing, building, preparing and running
//   your graphs.
// NOTE: hexagon_nn.h redefines malloc(), alloc(), etc. so
//   they become a compile-error "OOPS MALLOC".  This is because you should
//   use rpcmem_alloc() instead.
#include <hexagon_nn.h>

// hexagon_nn_ops.h defines the various graph operations (e.g. "MatMul", "NOP"
//   and "Relu") which you can do.  Internally, it just expands
//   interface/ops.def into a usable format.  ops.def contains the list of
//   all implemented ops.
#include "hexagon_nn_ops.h"

// For printf, etc.
#include <stdio.h>

// If you're already familiar with SDK programming for the DSP,
//   you've probably used fastRPC.  There's already lots of examples
//   documenting its use, and the purpose of this tutorial is to
//   expose the hexagon_nn_* API, so we'll ignore the fastRPC details.
// For these tutorials, we create a couple functions
//   fastrpc_setup() and fastrpc_teardown(), and some required includes.
// FastRPC allows our code running on the ARM to call functions located
//   on the DSP, quite seamlessly.
// To enable this ARM/DSP communication, we need to open a channel.
//   We'll also need to be careful later how we call functions that cross
//   the ARM/DSP partition, e.g. sending pointers, to ensure the ARM and
//   DSP see the same data.
#include "sdk_fastrpc.h"

// The structure of our NOP network looks like this.
//   It's really just a NOP floating in space, with no inputs or outputs.
//
//
//                   ==============
//    ?????????      ||    NOP   ||      ?????????
//   ??nothing??     || id=0x1000||     ??nothing??
//    ?????????      ==============      ?????????
//

int hexnn_controller_request_unsigned_pd() {
  int ret = -1;
  if (remote_session_control) {
    printf("***************** remote_session_control is TRUE ****************\n");
    struct remote_rpc_control_unsigned_module data;
    data.enable = 1;
    data.domain = CDSP_DOMAIN_ID;
    ret = remote_session_control(DSPRPC_CONTROL_UNSIGNED_MODULE, (void *) &data, sizeof(data));
    printf("***************** remote_session_control returned %d ****************\n", ret);
  } else {
    printf("***************** remote_session_control is FALSE ****************\n");
  }
  return ret;
}

int main(int argc, char **argv) {
        int err = 0;
        hexnn_controller_request_unsigned_pd();
        // Start the ARM/DSP communications channel so we can call
        //   library functions that execute on the dsp.
        if (fastrpc_setup() != 0) return 1;
        printf("fastrpc_setup Done\n");
        // The nnlib API consists of functions that begin "hexagon_nn_*"
        // This prefix indicates that the function will actually run on the DSP.
        // To run a neural network we'll use this basic API:
        //   1) hexagon_nn_config            - Start nnlib, preparing globals
        //   2) hexagon_nn_init              - Initialize a new graph
        //   3) hexagon_nn_set_debug_level   - Enable debug
        //   4) hexagon_nn_append_node       - Add nodes to the graph
        //   5) hexagon_nn_append_const_node - Add constants (pure data, not ops)
        //                                     (we won't need any for now)
        //   6) hexagon_nn_prepare           - Allocate memory, strategize,
        //                                     and optimize the graph for speed
        //   7) hexagon_nn_execute           - Run an inference
        //   8) hexagon_nn_teardown          - Destroy the graph, free resources

        // Ensures that nnlib is ready to start working.
        printf("Trying to hexagon_nn_config\n");
        hexagon_nn_config();
        printf("hexagon_nn_config Done\n");

        // Initialize a fresh, empty graph.  Return a graph-handle by reference.
        hexagon_nn_nn_id graph_id;
        if (hexagon_nn_init(&graph_id)) {
                printf("Whoops... Cannot init\n");
                return 2;
        }
        printf("hexagon_nn_init Done\n");

        // Set power level (to max/turbo)
        if ((err = hexagon_nn_set_powersave_level(0)) != 0) {
                printf("Whoops... Cannot set power level: %d\n", err);
                goto TEARDOWN;
        }

        // Select our debug level.  0=none, >4=max
        // When creating new graphs, it's nice to have max debug
        //   even if you don't think you need it.
        //hexagon_nn_set_debug_level(graph_id, 100);

        // Append a node to the graph.
        // We need to provide a unique-id so other nodes can connect.
        // The operation can be any of the ops found in interface/ops.def,
        //   prefixed with "OP_" (e.g. OP_MatMul_f, OP_Relu_f, OP_MaxPool_f)
        // Our NOP node doesn't need any padding, because it won't do anything.
        // Our input/output lists will be NULL in this example,
        //   but for real graphs we'll need to connect nodes using these lists.
        hexagon_nn_append_node(
                graph_id,           // Graph handle we're appending into
                0x1000,             // Node identifier (any unique uint32)
                OP_Nop,             // Operation of this node (e.g. Concat, Relu)
                NN_PAD_NA,          // Padding type for this node
                NULL,               // The list of inputs to this node
                0,                  //   How many elements in input list?
                NULL,               // The list of outputs from this node
                0                   //   How many elements in output list?
                );
        printf("hexagon_nn_append_node Done\n");
        // Prepare the graph for execution by optimizing it, allocating storage,
        //   connecting all the input/output pointers between nodes, and
        //   doing some basic checks, like number of input/output tensors and
        //   sizing for each node.
        if (hexagon_nn_prepare(graph_id)) {
                printf("Whoops... Cannot prepare\n");
        }
        printf("hexagon_nn_prepare Done\n");

        // Execute an inference on our input data.
        // Real graphs require input and output buffers, but we'll
        //   just use zero-size NULL pointers for this NOP example.
        uint32_t out_batches, out_height, out_width, out_depth, out_data_size;
        printf("Trying hexagon_nn_execute\n");
        if ((err = hexagon_nn_execute(
                     graph_id,
                     0, 0, 0, 0,             // Our input has 0-dimension
                     NULL,                   // Pointer to input data
                     0,                      // How many total bytes of input?
                     (unsigned int *) &out_batches,
                     (unsigned int *) &out_height,
                     (unsigned int *) &out_width,
                     (unsigned int *) &out_depth,
                     (uint8_t *)NULL,        // Pointer to output buffer
                     0,                      // Max size of output buffer
                     (unsigned int*) &out_data_size)         // Actual size used for output
                    ) != 0) {

                printf("Whoops... run failed: %d\n",err);
        }

TEARDOWN:
    // Free the memory, especially if we want to build subsequent graphs
    hexagon_nn_teardown(graph_id);

    // Stop fastRPC
    fastrpc_teardown();

    if (!err) printf("Test Passed!\n");
    else printf ("Test Failed, err=%d\n", err);

    return err;
}
lee-bin commented 4 years ago

Please see https://github.com/XiaoMi/mace/issues/581#issuecomment-568331538