quic / ai-hub-models

The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
https://aihub.qualcomm.com
BSD 3-Clause "New" or "Revised" License

qai-hub SDV2_1 demo running question #120

Open 1826133674 opened 6 days ago

1826133674 commented 6 days ago

Hello, I am trying to infer the quantized SD2_1 model using QNN 2.28 on a Samsung S24 phone. I encountered an issue: the same inference process that works for SDV1-5 gets stuck when inferring the UNet model released on qai-hub for SDV2_1.

The command I used is as follows:

    cmd_exec_on_device = [PLATFORM_TOOLS_BIN_PATH + f'/adb', '-H', rh, '-s', device_id, 'shell',
                          f'cd {target_device_dir} && ',
                          f'export LD_LIBRARY_PATH={target_device_dir} &&',
                          f' export ADSP_LIBRARY_PATH={target_device_dir} &&',
                          f' {target_device_dir}/qnn-net-run ',
                          f'--retrieve_context {model_context}',
                          f' --backend {target_device_dir}/libQnnHtp.so',
                          f' --input_list {target_device_dir}/input_list.txt',
                          f' --output_dir {target_device_dir} ',
                          f' --config_file {target_device_dir}/htp_backend_extensions.json ',
                          f' > {target_device_dir}/log.log'
                          ]
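
Presumably this list is handed to subprocess on the host; a minimal sketch of that step (assuming Python 3.7+ and that adb joins everything after `shell` into one remote command line):

    import subprocess

    # Execute the adb command assembled above. In list form, subprocess
    # passes each element as a separate argument to adb, and adb joins the
    # arguments after 'shell' into a single command line run on the device.
    proc = subprocess.run(cmd_exec_on_device, capture_output=True, text=True)
    print(proc.returncode, proc.stderr)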

If I comment out the log-redirect element (the second-to-last line of the list) as shown below, inference runs, but the performance data obtained from this run is not the best, and a 'Context Free failure' error message is shown.

    cmd_exec_on_device = [PLATFORM_TOOLS_BIN_PATH + f'/adb', '-H', rh, '-s', device_id, 'shell',
                          f'cd {target_device_dir} && ',
                          f'export LD_LIBRARY_PATH={target_device_dir} &&',
                          f' export ADSP_LIBRARY_PATH={target_device_dir} &&',
                          f' {target_device_dir}/qnn-net-run ',
                          f'--retrieve_context {model_context}',
                          f' --backend {target_device_dir}/libQnnHtp.so',
                          f' --input_list {target_device_dir}/input_list.txt',
                          f' --output_dir {target_device_dir} ',
                          f' --config_file {target_device_dir}/htp_backend_extensions.json ',
                          # f' > {target_device_dir}/log.log'
                          ]
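
A possible alternative (untested sketch, reusing `proc` from the snippet above): drop the on-device `>` redirect element entirely and persist qnn-net-run's console output on the host:

    # Untested alternative to the on-device '> log.log' redirect: save
    # qnn-net-run's stdout to a local file on the host instead.
    with open('log.log', 'w') as f:
        f.write(proc.stdout)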

What is causing this? Is there any solution to this problem?

The htp_backend_extensions.json is as follows:

    {
        "backend_extensions": {
            "shared_library_path": "libQnnHtpNetRunExtensions.so",
            "config_file_path": "htp_config.json"
        }
    }

The htp_config.json is as follows:

    {
        "devices": [
            {
                "soc_id": 43,
                "dsp_arch": "v75",
                "cores": [{
                    "core_id": 0,
                    "perf_profile": "burst",
                    "rpc_control_latency": 100
                }]
            }
        ]
    }
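
Since soc_id and dsp_arch are device-specific, a small illustrative helper (write_htp_config is a hypothetical name, not part of the QNN SDK) can emit this file programmatically:

    import json

    # Illustrative helper: write the HTP config shown above for a given
    # SoC. soc_id 43 with dsp_arch v75 matches the Samsung S24 target
    # used in this issue.
    def write_htp_config(path, soc_id=43, dsp_arch='v75',
                         perf_profile='burst', rpc_control_latency=100):
        config = {
            "devices": [{
                "soc_id": soc_id,
                "dsp_arch": dsp_arch,
                "cores": [{
                    "core_id": 0,
                    "perf_profile": perf_profile,
                    "rpc_control_latency": rpc_control_latency
                }]
            }]
        }
        with open(path, 'w') as f:
            json.dump(config, f, indent=4)

    write_htp_config('htp_config.json')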

By the way, I can successfully infer the text_encoder model with the above configuration.

heydavid525 commented 5 days ago

Can you share the input_list.txt file? I can give it a try

1826133674 commented 4 days ago

> Can you share the input_list.txt file? I can give it a try

The input_list.txt is as follows:

    /data/local/tmp/qnn_assets/QNN_binaries/inputs/input_0.raw /data/local/tmp/qnn_assets/QNN_binaries/inputs/input_1.raw /data/local/tmp/qnn_assets/QNN_binaries/inputs/input_2.raw

For convenience, I created random data of the corresponding shapes using numpy as input. The code to produce them is as follows:


    import os
    import numpy as np

    # target_device_dir is defined earlier in the script (the on-device
    # working directory used in the adb commands above).
    text_embedding = np.random.rand(1, 77, 1024).astype(np.float32)
    time_embedding = np.random.rand(1, 1280).astype(np.float32)
    latent_in = np.random.rand(1, 64, 64, 4).astype(np.float32)
    input_data_list = [latent_in, time_embedding, text_embedding]
    tmp_dirpath = os.path.abspath('tmp_aarch64/inputs')
    os.makedirs(tmp_dirpath, exist_ok=True)
    # Dump each input from input_data_list as a raw file and build the
    # input_list.txt contents for qnn-net-run
    input_list_text = ''
    for index, input_data in enumerate(input_data_list):
        raw_file_path = f'{tmp_dirpath}/input_{index}.raw'
        input_data.tofile(raw_file_path)
        input_list_text += target_device_dir + '/inputs/' + os.path.basename(raw_file_path) + ' '

    input_list_filepath = f'{tmp_dirpath}/../input_list.txt'
    with open(input_list_filepath, 'w') as f:
        f.write(input_list_text)
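
As a quick illustrative sanity check before pushing the files to the device: .raw files store flat binary data with no shape or dtype metadata, so they can be read back with numpy and reshaped to the expected dimensions:

    # Illustrative check: read one raw file back as flat float32 data and
    # reshape it to the expected (1, 64, 64, 4) latent shape.
    check = np.fromfile(f'{tmp_dirpath}/input_0.raw', dtype=np.float32)
    assert check.size == 1 * 64 * 64 * 4
    latent_back = check.reshape(1, 64, 64, 4)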