Closed minjea1588 closed 11 months ago
Hi minjea1588:
I'm gonna need more information from you to debug this.
OS: Ubuntu 18.04
Triton docker: nvcr.io/nvidia/tritonserver:21.10-py3
DeepStream docker: nvcr.io/nvidia/deepstream:6.0-triton
triton config.pbtxt config.txt
deepstream config dstest_yolo_nopostprocess_v8_pose_triton.txt
`def pose_src_pad_buffer_probe(pad, info, u_data):
    t = time.time()
    frame_number = 0
    num_rects = 0
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer ")
        return
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break
        frame_number = frame_meta.frame_num
        num_rects = frame_meta.num_obj_meta
        pad_index = frame_meta.pad_index
        l_usr = frame_meta.frame_user_meta_list
        while l_usr is not None:
            try:
                # Casting l_usr.data to pyds.NvDsUserMeta
                user_meta = pyds.NvDsUserMeta.cast(l_usr.data)
            except StopIteration:
                break
            # get tensor output
            if (user_meta.base_meta.meta_type !=
                    pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META):  # NVDSINFER_TENSOR_OUTPUT_META
                try:
                    l_usr = l_usr.next
                except StopIteration:
                    break
                continue
            try:
                tensor_meta = pyds.NvDsInferTensorMeta.cast(
                    user_meta.user_meta_data)
                assert tensor_meta.num_output_layers == 1, f'Check number of model output layer : {tensor_meta.num_output_layers}'
                # layer_output_info = layers_info[0]
                layer_output_info = pyds.get_nvds_LayerInfo(tensor_meta, 0)  # as num_output_layers == 1
                network_info = tensor_meta.network_info
                input_shape = (network_info.width, network_info.height)
                if frame_number == 0:
                    print(f'\tmodel input_shape : {input_shape}')
                # remove zeros from both ends of the array. 'b' : 'both'
                dims = np.trim_zeros(layer_output_info.inferDims.d, 'b')
                if frame_number == 0:
                    print(f'\tModel output dimension from LayerInfo: {dims}')
                output_message = f'\tCheck model output shape: {layer_output_info.inferDims.numElements}, '
                output_message += f'given OUT_SHAPE : {dims}'
                assert layer_output_info.inferDims.numElements == np.prod(dims), output_message
                # load float* buffer to python
                cdata_type = data_type_map[layer_output_info.dataType]
                ptr = ctypes.cast(pyds.get_ptr(layer_output_info.buffer),
                                  ctypes.POINTER(cdata_type))
                # Determine the size of the array
                out = np.ctypeslib.as_array(ptr, shape=dims)
                if frame_number == 0:
                    print(f'\tLoad Model Output From LayerInfo. Output Shape : {out.shape}')
                # [Optional] Postprocess for YOLOv7-pose(with YoloLayer_TRT_v7.0 Layer) prediction tensor
                # (https://github.com/nanmi/yolov7-pose/)
                # (57001, 1, 1) > (57000, 1, 1) > (1000, 57)
                # out = out[1:, ...].reshape(-1 , 57)  # or out.squeeze()[1:].reshape(-1 , 57)
                # ----------------------------------------------------------------------------------------------------------------------
                # Explicitly specify batch dimensions
                if np.ndim(out) < 3:
                    out = out[np.newaxis, :]
                    # print(f'add axis 0 for model output : {out.shape}')
                # [Optional] Postprocess for yolov8-pose prediction tensor
                # (https://github.com/triple-Mu/YOLOv8-TensorRT/tree/triplemu/pose-infer)
                # (batch, 56, 8400) >(batch, 8400, 56) for yolov8
                out = out.transpose((0, 2, 1))
                # out = map_to_zero_one_copy(out)
                # make pseudo class prob
                cls_prob = np.ones((out.shape[0], out.shape[1], 1), dtype=np.uint8)
                out[..., :4] = map_to_zero_one(out[..., :4])  # scalar prob to [0, 1]
                # insert pseudo class prob into predictions
                out = np.concatenate((out[..., :5], cls_prob, out[..., 5:]), axis=-1)
                out[..., [0, 2]] = out[..., [0, 2]] * network_info.width  # scale to screen width
                out[..., [1, 3]] = out[..., [1, 3]] * network_info.height  # scale to screen height
                # ----------------------------------------------------------------------------------------------------------------------
                output_shape = (MUXER_OUTPUT_HEIGHT, MUXER_OUTPUT_WIDTH)
                if frame_number == 0:
                    print(f'\tModel output : {out.shape}, The coordinates of the keypoint are rescaled to (h, w) : {output_shape}')
                print("out : ", out.shape)
                pred = postprocess(out, output_shape, input_shape,
                                   conf_thres=conf_thres, iou_thres=iou_thres)
                boxes, confs, kpts = pred
                # print("boxes, confs ", boxes, confs)
                if len(boxes) > 0 and len(confs) > 0 and len(kpts) > 0:
                    add_obj_meta(frame_meta, batch_meta, boxes[0], confs[0])
                    dispaly_frame_pose(frame_meta, batch_meta,
                                       boxes[0], confs[0], kpts[0])
            except StopIteration:
                break
            try:
                l_usr = l_usr.next
            except StopIteration:
                break
        # update frame rate through this probe
        stream_index = "stream{0}".format(frame_meta.pad_index)
        global perf_data
        perf_data.update_fps(stream_index)
        try:
            # indicate inference is performed on the frame
            frame_meta.bInferDone = True
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK`
With nvinfer it works normally, but with Triton server the error occurs and the pipeline does not proceed to the next frame.
I guess something is wrong in dstest_yolo_nopostprocess_v8_pose_triton.txt.
Maybe you should not use 'custom_lib':
postprocess {
  labelfile_path: "/opt/nvidia/deepstream/deepstream-6.0/sources/deepstream_python_apps/apps/deepstream-imagedata-multistream/labels.txt"
  other {}
}
extra {
  copy_input_to_host_buffers: false
}
custom_lib {
  path: "/opt/nvidia/deepstream/deepstream-6.0/sources/libs/nvdsinfer_customparser/libnvds_infercustomparser.so"
}
Also check the shape of the output array like this. It should be "(batch, 56, 8400)":
# (batch, 56, 8400) >(batch, 8400, 56) for yolov8
print(f'out.shape : {out.shape}')
out = out.transpose((0, 2, 1))
Also check scalar in map_to_zero_one:
def map_to_zero_one(scalar):
    # f-strings are needed here, otherwise the braces are printed literally
    print(f'[map_to_zero_one] scalar\n {scalar}')
    print(f'[map_to_zero_one] scalar.shape : {scalar.shape}')
    scalar_min = np.min(scalar)
    scalar_max = np.max(scalar)
    mapped = (scalar - scalar_min) / (scalar_max - scalar_min)
    return mapped
I deleted custom_lib in dstest_yolo_nopostprocess_v8_pose_triton.txt, but the same problem still appears. The shape before transposing is (1, 56, 8400). [map_to_zero_one] scalar.shape: (1, 8400, 4). But still the same problem is appearing.
The shapes of your output and scalar seem correct.
The error message "overflow encountered in subtract" indicates that the operation scalar - scalar_min is causing an overflow, likely because the values in scalar or scalar_min are too large or too small to fit within the data type.
Check Data Types: Ensure that the data type of scalar can handle the range of values you're dealing with. If you're using NumPy arrays, you can check the data type with scalar.dtype.
Check Values: Print or log the minimum and maximum values within scalar. This will help you understand if there are any extremely large or small values that could be causing the overflow.
import numpy as np
def map_to_zero_one(scalar):
    print(f'[map_to_zero_one] scalar.shape : {scalar.shape}')
    print(f'[map_to_zero_one] scalar.dtype : {scalar.dtype}')
    scalar_min = np.min(scalar)
    scalar_max = np.max(scalar)
    print(f'[map_to_zero_one] scalar_min : {scalar_min}')
    print(f'[map_to_zero_one] scalar_max : {scalar_max}')
    mapped = (scalar - scalar_min) / (scalar_max - scalar_min)
    return mapped
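If the printed min and max are hard to interpret, it may help to count the non-finite entries directly. The following is only a sketch: `summarize_tensor` is a hypothetical helper name, not part of pyds or the sample app, and the sample array is made up for illustration.

```python
import numpy as np


def summarize_tensor(arr):
    """Report dtype, shape, NaN/inf counts and the finite min/max of an array.

    Hypothetical debugging helper, not part of pyds or the sample app.
    """
    arr = np.asarray(arr)
    finite = np.isfinite(arr)
    has_finite = bool(finite.any())
    return {
        "dtype": str(arr.dtype),
        "shape": arr.shape,
        "nan_count": int(np.isnan(arr).sum()),
        "inf_count": int(np.isinf(arr).sum()),
        # min/max over finite entries only, so NaN/inf cannot hide the real range
        "finite_min": float(arr[finite].min()) if has_finite else None,
        "finite_max": float(arr[finite].max()) if has_finite else None,
    }


# Made-up sample standing in for out[..., :4]
sample = np.array([[0.5, np.nan], [np.inf, 2.0]], dtype=np.float32)
report = summarize_tensor(sample)
print(report)
```

Calling this on `out[..., :4]` just before `map_to_zero_one` should show whether the tensor already contains NaN or inf when it arrives from the inference plugin.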
It seems the problem occurs because the scalar contains NaN values.
out[..., :4] holds the bounding-box coordinates.
I didn't encounter this problem when I used NVInfer instead of NVInferserver.
Try using np.nanmin and np.nanmax in place of np.min and np.max in map_to_zero_one(scalar).
This may solve the problems encountered during the normalization stage of the bbox coordinates, but the underlying reason is that the bbox coordinates should not theoretically have NaN, and perhaps other problems will be encountered in the post-processing stage.
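A minimal sketch of that suggestion follows. The name `map_to_zero_one_safe`, the cast to float64, and the zero-range guard are my own assumptions rather than code from the repository. NaN entries are deliberately left as NaN so that a later stage can still detect and drop them.

```python
import numpy as np


def map_to_zero_one_safe(scalar):
    """Min-max normalize to [0, 1], ignoring NaN when computing the range.

    Hypothetical variant of map_to_zero_one; NaN inputs stay NaN in the output.
    """
    # float64 gives more headroom than float16/float32 for the subtraction
    scalar = np.asarray(scalar, dtype=np.float64)
    if np.isnan(scalar).all():
        # nothing usable to normalize
        return np.zeros_like(scalar)
    scalar_min = np.nanmin(scalar)
    scalar_max = np.nanmax(scalar)
    span = scalar_max - scalar_min
    if not np.isfinite(span) or span == 0:
        # constant or infinite range: avoid dividing by zero or inf
        return np.zeros_like(scalar)
    return (scalar - scalar_min) / span


values = np.array([0.0, 5.0, np.nan, 10.0], dtype=np.float32)
mapped = map_to_zero_one_safe(values)
print(mapped)
```

This only hides the symptom during normalization. If NaN shows up in the bounding-box coordinates at all, the input to this function is still worth investigating.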
Thank you for the reply. It seems that nvinfer, an external tritonserver, and DeepStream-Triton all have different data structures. I'll have to find another way to preprocess it.
Good luck in finding a solution. If it's convenient for you, please let me know if you figure out the correct way to set up NVInferserver. Thank you!
It's been a while. I found it. When running inference with Triton, the network_info.width and network_info.height values are both 0. If you change this part to 640 and 640, it works!
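One way to keep the dynamic lookup while still guarding against the zero values is a small fallback. This is a sketch only: `resolve_input_shape` is a hypothetical helper, and the 640x640 default is simply the model size mentioned in this thread, so adjust it to your own model.

```python
def resolve_input_shape(width, height, fallback=(640, 640)):
    """Return (width, height), or the fallback when either value is not positive.

    Hypothetical helper; the fallback must match the model's real input size.
    """
    if width > 0 and height > 0:
        return (width, height)
    return fallback


# With nvinferserver (Triton) the thread reports both values as 0
triton_shape = resolve_input_shape(0, 0)
# With valid values the reported size is used unchanged
nvinfer_shape = resolve_input_shape(640, 640)
print(triton_shape, nvinfer_shape)
```

Inside the probe this would be called as `resolve_input_shape(network_info.width, network_info.height)`, with the two returned values then used in place of `network_info.width` and `network_info.height` in the scaling lines.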
That's great. Congratulations.
I'm keeping this code specifically for debugging information and dynamic scaling.
There is one more thing: when the 640x640 model output is displayed on a 1920x1080 screen, it is displayed as such. At 640x640 it works fine. What processing is needed when running at 1920x1080?
Because the model runs inference at 640x640, the results have to be scaled to the size you want to display.
Here is the display and rescale related code. You might consider setting input_shape to a constant.
MUXER_OUTPUT_WIDTH = 640  # stream input
MUXER_OUTPUT_HEIGHT = 360  # stream input
TILED_OUTPUT_WIDTH = 1280  # stream output
TILED_OUTPUT_HEIGHT = 720  # stream output

def pose_src_pad_buffer_probe(pad, info, u_data):
    ...
    network_info = tensor_meta.network_info
    input_shape = (network_info.width, network_info.height)
    ...
    out[..., [0, 2]] = out[..., [0, 2]] * network_info.width  # scale to screen width
    out[..., [1, 3]] = out[..., [1, 3]] * network_info.height  # scale to screen height
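To illustrate the rescaling itself, here is a rough sketch that maps boxes from model-input pixels to a target resolution. `rescale_xyxy` is a hypothetical function, not the repository's `postprocess`. It assumes boxes are in (x1, y1, x2, y2) order and that both shapes are given as (width, height). Whether the frame was stretched or letterboxed before inference depends on your preprocessing config, so both cases are shown.

```python
import numpy as np


def rescale_xyxy(boxes, input_shape, output_shape, letterbox=False):
    """Map (N, 4) xyxy boxes from model-input pixels to output pixels.

    Hypothetical sketch; input_shape and output_shape are (width, height).
    """
    boxes = np.asarray(boxes, dtype=np.float64).copy()
    in_w, in_h = input_shape
    out_w, out_h = output_shape
    if letterbox:
        # aspect ratio preserved: one gain for both axes plus symmetric padding
        gain = min(in_w / out_w, in_h / out_h)
        pad_x = (in_w - out_w * gain) / 2
        pad_y = (in_h - out_h * gain) / 2
        boxes[:, [0, 2]] = (boxes[:, [0, 2]] - pad_x) / gain
        boxes[:, [1, 3]] = (boxes[:, [1, 3]] - pad_y) / gain
    else:
        # plain stretch: independent scale factor per axis
        boxes[:, [0, 2]] *= out_w / in_w
        boxes[:, [1, 3]] *= out_h / in_h
    # keep boxes inside the output frame
    boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, out_w)
    boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, out_h)
    return boxes


stretched = rescale_xyxy([[320, 320, 640, 640]], (640, 640), (1920, 1080))
boxed = rescale_xyxy([[320, 320, 640, 500]], (640, 640), (1920, 1080), letterbox=True)
print(stretched, boxed)
```

Keypoint x and y values can be mapped the same way. In this app the target would presumably be the streammux resolution (MUXER_OUTPUT_WIDTH, MUXER_OUTPUT_HEIGHT), since that is what `postprocess` receives as output_shape. That is my reading of the snippet above, not something confirmed in the thread.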
Hello! Thank you for making a good source.
I am running yolov8-pose with tritonserver and the pose_src_pad_buffer_probe function, and I confirmed that an error occurs in this part: out[..., :4] = map_to_zero_one(out[..., :4]). Is there any solution?