JoAnn0812 opened 3 months ago
How can I resolve this issue? Thanks.
You should look carefully here:
File "builder.py", line 102, in attention_layer_opt
Wall = init_dict[prefix + WQKV]
KeyError: 'l1_attention_self_qkv_kernel'
Thanks for the reply. I am new to TensorRT. Can you guide me on how to modify the builder.py script?
You can single-step through it with a debugger.
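For example, you can run the builder under pdb and break at the failing lookup (a generic sketch; substitute your actual builder.py arguments):

```python
# Debugger session sketch -- inspect which keys really exist in the
# weight map and compare them with the key being requested:
#
#   python -m pdb builder.py <your usual builder.py arguments>
#   (Pdb) break builder.py:102
#   (Pdb) continue
#   (Pdb) p prefix + WQKV
#   (Pdb) p sorted(init_dict.keys())[:20]
```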
@JoAnn0812 where did you download the checkpoint? If you are using a custom checkpoint instead of the official one in this repo's README, you need to change the weights mapping function load_xxx
in https://github.com/NVIDIA/TensorRT/blob/release/10.2/demo/BERT/builder_utils.py
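As a rough illustration, the loader ultimately has to produce keys like 'l1_attention_self_qkv_kernel', so a custom checkpoint's tensor names would need translating into that scheme. A minimal sketch (the source naming pattern below is an assumption about a hypothetical custom checkpoint, not the official demo code):

```python
import numpy as np

def remap_custom_names(raw_weights):
    """Translate custom checkpoint tensor names into the
    'l{N}_attention_self_qkv_kernel'-style keys that builder.py looks up."""
    weights_dict = {}
    for name, tensor in raw_weights.items():
        # Hypothetical source pattern:
        #   "encoder.layer.1.attention.self.qkv.kernel"
        #   -> "l1_attention_self_qkv_kernel"
        new_name = name.replace("encoder.layer.", "l").replace(".", "_")
        # TensorRT rejects non-C-contiguous arrays when creating Weights.
        weights_dict[new_name] = np.ascontiguousarray(tensor)
    return weights_dict
```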
I am using the official one from the README ('bash scripts/download_model.sh'). I even downloaded it with 'ngc registry model download-version "nvidia/bert_tf_ckpt_large_qa_squad2_amp_128:19.03.1"', but it still hits the same issue. I just want to reproduce the BERT benchmark without any changes to the script. Can you share another way to download a working checkpoint? Thank you
@JoAnn0812 Hi, were you able to run the demo on your Jetson device?
I tried to run the BERT model on a Jetson (Ampere GPU) to evaluate PTQ (post-training quantization) INT8 accuracy on the SQuAD dataset, but it fails with the error below while building the engine:
WARNING:tensorflow:From /home/ecnd/TensorRT/demo/BERT/bert_test_env/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
[08/05/2024-00:36:17] [TRT] [I] Using configuration file: models/fine-tuned/bert_tf_ckpt_large_qa_squad2_amp_128_v19.03.1/bert_config.json
[08/05/2024-00:36:17] [TRT] [I] Found 394 entries in weight map
[08/05/2024-00:36:22] [TRT] [E] Could not convert non-contiguous NumPy array to Weights. Please use numpy.ascontiguousarray() to fix this.
[08/05/2024-00:36:23] [TRT] [I] [MemUsageChange] Init CUDA: CPU +215, GPU +0, now: CPU 2813, GPU 10432 (MiB)
[08/05/2024-00:36:26] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +303, GPU +285, now: CPU 3138, GPU 10740 (MiB)
builder.py:401: DeprecationWarning: Use set_memory_pool_limit instead.
  builder_config.max_workspace_size = workspace_size * (1024 * 1024)
builder.py:109: DeprecationWarning: Use add_matrix_multiply instead.
  mult_all = network.add_fully_connected(input_tensor, 3 * hidden_size, Wall, Ball)
builder.py:232: DeprecationWarning: Use add_matrix_multiply instead.
  attention_out_fc = network.add_fully_connected(attention_heads, hidden_size, W_aout, B_aout)
builder.py:247: DeprecationWarning: Use add_matrix_multiply instead.
  mid_dense = network.add_fully_connected(attention_ln, config.intermediate_size, W_mid, B_mid)
builder.py:292: DeprecationWarning: Use add_matrix_multiply instead.
  out_dense = network.add_fully_connected(intermediate_act, hidden_size, W_lout, B_lout)
Traceback (most recent call last):
  File "builder.py", line 553, in <module>
    main()
  File "builder.py", line 544, in main
    with build_engine(args.batch_size, args.workspace_size, args.sequence_length, config, weights_dict, args.squad_json, args.vocab_file, calib_cache, args.calib_num) as engine:
  File "builder.py", line 441, in build_engine
    bert_out = bert_model(config, weights_dict, network, embeddings, mask_idx)
  File "builder.py", line 312, in bert_model
    out_layer = transformer_layer_opt(ss, config, init_dict, network, prev_input, input_mask)
  File "builder.py", line 211, in transformer_layer_opt
    context_transposed = attention_layer_opt(prefix + "attention", config, init_dict, network, input_tensor, imask)
  File "builder.py", line 102, in attention_layer_opt
    Wall = init_dict[prefix + WQKV]
KeyError: 'l1_attention_self_qkv_kernel'
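For what it's worth, the '[TRT] [E] Could not convert non-contiguous NumPy array to Weights' line earlier in the log is the likely root cause: if a tensor fails that conversion it never makes it into the weight map, and the later lookup raises the KeyError. A minimal sketch of the workaround the message itself suggests, to be applied wherever the loader in builder_utils.py hands arrays to trt.Weights (the helper name here is illustrative, not the demo's actual code):

```python
import numpy as np

def make_trt_safe(tensor):
    # Views produced by np.transpose() or slicing are not C-contiguous,
    # and TensorRT refuses to wrap them in trt.Weights.
    # np.ascontiguousarray() copies only when a copy is actually needed.
    return np.ascontiguousarray(tensor)
```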