triton-inference-server / fastertransformer_backend

Can you share data.json to run perf_analyzer? #39

Closed: daemyung closed this issue 2 years ago

daemyung commented 2 years ago

Description

I wrote the following data.json:


{
    "data":
    [
        {
            "input_ids" : [9915, 27221, 59, 77, 383, 1853, 3327, 1462],
            "input_lengths" : [8],
            "request_output_len" : [128],
            "beam_search_diversity_rate" : [0],
            "temperature" : [1.0],
            "len_penalty": [1.0],
            "repetition_penalty": [1.0],
            "random_seed": [0],
            "is_return_log_probs": [true],
            "beam_width": [1],
            "runtime_top_k": [1.0],
            "runtime_top_p": [1.0],
            "start_id": [0],
            "end_id": [1],
            "bad_words_list": [[0], [-1]],
            "stop_words_list": [[32094], [0]]
        }
    ]
}

However, perf_analyzer fails with this error: failed to create concurrency manager: unable to find int32_t data in json. If you could share a working data.json, it would be very helpful.

byshiue commented 2 years ago

You need to set the shape of each input explicitly, like this:

  {
    "data" :
     [
        {
          "INPUT" :
                {
                    "content": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                    "shape": [2,8]
                }
        },
        {
          "INPUT" :
                {
                    "content": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                    "shape": [8,2]
                }
        },
        {
          "INPUT" :
                {
                    "content": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
                }
        },
        {
          "INPUT" :
                {
                    "content": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                    "shape": [4,4]
                }
        }
        ...
      ]
  }
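
One way to sanity-check a file in this format before running perf_analyzer is to confirm that each "content" list holds as many elements as its "shape" implies. The following is only a minimal sketch: the helper name check_entry is hypothetical, and it checks element counts only, not data types.

```python
from math import prod


def check_entry(entry):
    """Return the input names whose content length disagrees with their shape."""
    bad = []
    for name, value in entry.items():
        # Only the {"content": ..., "shape": ...} form carries an explicit shape.
        if isinstance(value, dict) and "shape" in value:
            if len(value["content"]) != prod(value["shape"]):
                bad.append(name)
    return bad


sample = {
    "data": [
        {"INPUT": {"content": [1] * 16, "shape": [2, 8]}},
        {"INPUT": {"content": [1] * 16, "shape": [4, 4]}},
        {"INPUT": {"content": [1] * 16, "shape": [3, 4]}},  # deliberately wrong
    ]
}

# One list of mismatching input names per entry in "data".
mismatches = [check_entry(entry) for entry in sample["data"]]
print(mismatches)
```

Entries that omit "shape" are skipped here, since there is nothing to compare the content length against.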

Also, perf_analyzer does not support optional inputs, so you must pass every input. Be careful with prompt_learning_task_ids, though: the program will crash if you pass it but the prompt-learning weights cannot be read.
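
As a rough sketch of how the data.json from the question could be converted to the content/shape form, the Python below flattens each value and records its shape. The helper names (flatten, infer_shape, to_entry) are hypothetical, only a few inputs from the question are shown, and the inferred shapes are an assumption: they may not match what the model declares, so compare them with the model's config.pbtxt.

```python
import json


def flatten(values):
    """Flatten arbitrarily nested lists into one flat list."""
    flat = []
    for v in values:
        if isinstance(v, list):
            flat.extend(flatten(v))
        else:
            flat.append(v)
    return flat


def infer_shape(values):
    """Infer the shape of a regular (non-ragged) nested list."""
    shape = []
    while isinstance(values, list):
        shape.append(len(values))
        values = values[0] if values else None
    return shape


def to_entry(raw):
    """Turn {name: nested list} into {name: {"content": ..., "shape": ...}}."""
    return {
        name: {"content": flatten(v), "shape": infer_shape(v)}
        for name, v in raw.items()
    }


# A subset of the inputs from the question; every remaining input
# would be added the same way, since none can be left out.
raw = {
    "input_ids": [9915, 27221, 59, 77, 383, 1853, 3327, 1462],
    "input_lengths": [8],
    "request_output_len": [128],
    "bad_words_list": [[0], [-1]],
    "stop_words_list": [[32094], [0]],
}

data = {"data": [to_entry(raw)]}
text = json.dumps(data, indent=2)
print(text)
```

The printed JSON can be saved as data.json and passed to perf_analyzer with its --input-data option.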

daemyung commented 2 years ago

Thanks for your help!