psiydown opened this issue 2 years ago
Hi @psiydown, since I am working on this right now, I made some changes to the code, so minor modifications are needed to use the scripts, especially extract_onnxs_from_metrabs.py, because I extracted the weight and bias of the final convolution. If you need to run everything without modification, I need about 1-2 more weeks; otherwise you can try to follow what is going on in the code (sometimes you just need to change a file path). I am using an RTX 2080 Ti.
Hi @psiydown, I finally wrote a clear guide to get everything working! I tested it: just by cloning the repo, running the scripts in modules/hpe/setup in order, and then running modules/hpe/hpe.py, I managed to run inference.
Hi @StefanoBerti , good job, it's great work! I switched to a new RTX 3060 graphics card and tested it. It runs well and reaches 16 FPS! I have several questions:
With the new RTX 3060, your old version of the program reaches 21 FPS. Its recognition accuracy is not as high as the current version's, but the current version only reaches 16 FPS. Do you know whether this is related to the following errors, or to something else?
Warning when building Yolo engine:
[!] Input tensor: input | Received unexpected dtype: int32.
Note: Expected type: float32
Warning printed when running "main.py" (it does not affect execution):
QWindowsContext: OleInitialize() failed: "COM error 0xffffffff80010106 RPC_E_CHANGED_MODE (Unknown error 0x080010106)"
and
[W] trt-runner-N0-03/10/22-22:56:18 | Was activated but never deactivated. This could cause a memory leak!
Running requires less memory than the original model, but loading is still very slow every time I start it. Do you have any way to improve the loading speed?
In addition, I am also interested in the action recognition module. How is the "trx.onnx" model generated or trained? Can it compare and score continuous action segments?
You did a very good job, thank you very much!
hpe.py). It seems like you are doing something wrong here.

Hi @StefanoBerti ,
Can your engine use batch prediction? The MetrABS model uses batch prediction, and video processing with it is relatively fast. If your accelerated engine used batch prediction, it should be faster, maybe even real-time! I tried to implement it with reference to MetrABS, but failed. Can you help me implement it? Thank you!
Your TRX guide says "Download weight from Google drive", but I can't find the download link in the file.
I hope you can send me "fix_223.pth", "yolo.engine", "bbone.engine", and "image_transformation.engine" for a comparison test. Thank you very much!
1) By using a batch size greater than one you can surely increase the throughput a lot; anyway, my target is robotic applications, so I need a batch size equal to one and I am interested in latency. I can't understand what you mean when you say that batch prediction should make inference real-time.
2) The guide was just a reminder for me, since this is still a work in progress.
3) You can try this checkpoint, which is my latest one and should perform very well.
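The trade-off in point 1) can be made concrete with rough arithmetic (the timings below are hypothetical examples, not measurements from this repo):

```python
# Rough latency vs. throughput arithmetic (hypothetical timings).
per_image_latency_s = 0.0625   # ~16 FPS at batch size 1
batched_latency_s = 0.200      # assumed time to process one batch of 8 frames

throughput_b1 = 1 / per_image_latency_s   # images per second at batch size 1
throughput_b8 = 8 / batched_latency_s     # images per second at batch size 8

print(round(throughput_b1), round(throughput_b8))  # 16 40
# Batching raises throughput (16 -> 40 img/s here), but every frame in the
# batch waits for the whole batch to finish, so per-frame latency gets worse.
# That is why a robotics pipeline that reacts to single frames keeps batch size 1.
```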
Hi @StefanoBerti , you are right, batch prediction can only improve throughput, not latency. I repeatedly predicted the human pose on the same video file and found that the prediction accuracy of the created engine is lower than that of the original MetrABS model. I want to know whether converting to ONNX reduces the accuracy, or whether building the engine does. Do you have code I could use to test the prediction accuracy of the ONNX model converted from MetrABS? Thank you!
@psiydown Really? That's interesting!
Weeks ago I ran some tests comparing the model in TensorFlow, ONNX, and TensorRT, but I didn't find any significant difference.
I think the difference in accuracy comes from the preprocessing and postprocessing steps. In particular, I didn't add the implausible-pose suppression because I don't have the file used during inference in MetrABS (https://github.com/isarandi/metrabs/issues/34), I don't know the real values used in is_within_fov, and I don't know which test-time augmentation factor MetrABS uses, etc.
Do you have big differences between the original model and the engine?
Hi @StefanoBerti , download this video and use the engine to predict it. You will find that the left hand swings erratically at frame 57, the left foot jumps at frame 791, and throughout the video the predicted feet slide so the figure often stands unstably. The original MetrABS model shows none of these errors and the result is very good, but it loads slowly, occupies a lot of memory, and predicts more slowly than your engine. If your engine could reach the accuracy of the original model, it would be perfect!
I tested the original MetrABS model with the number of test-time augmentations reduced to num_aug=1 and bone-length suppression turned off with suppress_implausible_poses=False, but it did not make the same mistakes as the engine, so postprocessing is not the main factor affecting accuracy. If I had code to test prediction with the converted ONNX model, it might be possible to find the reason.
@psiydown It could be that you are making the same mistake I made some time ago. The video has resolution 1920x1080, which means an aspect ratio of 1920/1080 = 16:9. If you resize such an image directly to 640x480, you change the aspect ratio from 16:9 to 640/480 = 4:3. The correct way to preprocess this video is to crop first and then resize:
img = img[:, 240:-240, :]; img = cv2.resize(img, (640, 480))
Could this be your problem?
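The crop-then-resize idea can be wrapped in a small helper that preserves a target aspect ratio for any input size (a hypothetical utility, not code from the repo); cv2.resize would then be applied to the returned crop:

```python
import numpy as np

def center_crop_to_aspect(img, target_w=640, target_h=480):
    """Center-crop an HxWxC image so its aspect ratio matches target_w:target_h."""
    h, w = img.shape[:2]
    target_ratio = target_w / target_h
    if w / h > target_ratio:
        # Image is too wide: crop columns symmetrically.
        new_w = int(round(h * target_ratio))
        off = (w - new_w) // 2
        img = img[:, off:off + new_w, :]
    else:
        # Image is too tall: crop rows symmetrically.
        new_h = int(round(w / target_ratio))
        off = (h - new_h) // 2
        img = img[off:off + new_h, :, :]
    return img

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)   # a 16:9 frame
cropped = center_crop_to_aspect(frame)
print(cropped.shape)  # (1080, 1440, 3): 1440/1080 == 640/480 == 4:3
```

After this crop, `cv2.resize(cropped, (640, 480))` no longer distorts the aspect ratio.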
@StefanoBerti No, I didn't make the same mistake. I pay great attention to precision and detail: I scaled the video keeping the original aspect ratio and filled the extra border with black.
@psiydown maybe you are calling reconstruct_absolute? Anyway, to test the ONNX model you just need to replace the TensorRT runner with an ONNX runner such as ONNX Runtime or the one from Polygraphy.
@StefanoBerti Following your tip, I successfully used Polygraphy's OnnxrtRunner to predict with the ONNX model. However, the result is the same as the engine's, which shows that building the engine does not reduce the accuracy. Either the accuracy is already lost when converting to ONNX or, as you said, it is caused by preprocessing or postprocessing. Can you test the video I sent to find the reason for the accuracy drop and a solution? Thank you!
Because I need to get the absolute pose, I use reconstruct_absolute.
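One way to narrow down where accuracy is lost is to compare the two runners' outputs joint by joint; the arrays below are made-up placeholders standing in for real per-joint predictions:

```python
import numpy as np

# Hypothetical per-joint 3D outputs (joints x 3, millimetres) from the
# TensorRT engine and the ONNX model for the same frame; replace with real outputs.
poses_trt = np.array([[10.0, 20.0, 1000.0],
                      [15.0, 25.0, 1010.0]])
poses_onnx = np.array([[10.5, 20.2, 1001.0],
                       [15.1, 24.8, 1009.0]])

# Euclidean error per joint between the two runners.
per_joint_err = np.linalg.norm(poses_trt - poses_onnx, axis=-1)
print(per_joint_err.max() < 2.0)  # True: sub-2 mm differences point away from the conversion
```

If the engine and the ONNX model agree to within a millimetre or two but both disagree with TensorFlow, the loss is in the ONNX export or in pre/postprocessing, not in TensorRT.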
@psiydown What do you mean by "testing the video"? I tried it and it works quite well imo. Anyway, I don't use the reconstruct_absolute function.
I think that the fact that you use reconstruct_absolute is the problem: I was using it previously, but now I don't, and I didn't upload it anymore. You can try to comment out the line
pred3d = reconstruct_absolute(pred2d, pred3d, new_K, is_predicted_to_be_in_fov, weak_perspective=False)
which should be somewhere near line 152 in hpe.py, and see if that is the problem. The values in is_within_fov may not be correct.
@StefanoBerti That is not the reason. I tested deleting the reconstruct_absolute line, but it has no impact on the accuracy: deleting it only fixes the body in the middle of the scene so that it no longer moves.
I recorded a test video to better illustrate the problem. At the beginning, the person's hands are held still behind the back, but the predicted green character swings its hands behind the back twice. Please watch the position the mouse points to at the beginning.
@psiydown ok, thank you, now I get what you mean! Well, that pose is very hard to estimate btw, I don't know how the original model handles it. Since you have already tested reconstruct_absolute and the bone-length suppression, some things still to consider are:
@StefanoBerti I read the MetrABS source code and set the parameter antialias_factor=1 to turn off antialiasing, and I tried the reference gamma encoding that increases picture brightness, but changing these does not affect the accuracy.
I have no other ideas. I care more about model accuracy and loading speed than prediction speed. Do you have any way to use the original model while reducing its memory occupation and improving its loading speed? I hope you can give me specific tips. Thank you very much!
@psiydown if these differences in accuracy are fundamental for your application, then yes, I think using the original model works better for you. I have no clue what else is missing or wrong here.
You can use TensorFlow's XLA optimization to increase the number of FPS to 4.5; in my case it worked.
Let me know if you manage to discover how to improve the accuracy!
@StefanoBerti If I find out how to improve the accuracy, I'll let you know right away. Trying XLA does indeed improve prediction speed, but the original model still loads very slowly: each load takes a few minutes of waiting and occupies 11 GB of memory. I noticed that the MetrABS model extracted by your program is relatively small. Can the extracted original model be used for prediction directly, without converting it to ONNX? How would I do that?
@psiydown sorry, but I didn't understand your question. There is a lot of preprocessing and postprocessing that I tried to replicate accurately to avoid using TensorFlow, but if you want to use the original model you have to accept using TensorFlow. Anyway, you can look up how to reduce the GPU memory used by TensorFlow.
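A common way to reduce TensorFlow's up-front GPU allocation (a general TensorFlow configuration, not code from this repo) is to enable memory growth before the model is loaded:

```python
import tensorflow as tf

# Enable memory growth so TensorFlow allocates GPU memory on demand
# instead of reserving (almost) the whole card when the model loads.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)

# Alternatively, hard-cap the allocation (4096 MB here is an arbitrary example):
# tf.config.set_logical_device_configuration(
#     tf.config.list_physical_devices("GPU")[0],
#     [tf.config.LogicalDeviceConfiguration(memory_limit=4096)])
```

This reduces memory occupation but does not speed up loading, which is dominated by building the SavedModel graph.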
Hi @StefanoBerti , I noticed that your repo was updated. I tested it, but the following errors occurred:
Run "modules/hpe/metrabs_trt/utils/extract_onnxs_from_metrabs.py"
Run "modules/hpe/metrabs_trt/utils/from_pytorch_to_onnx.py"
If I create engine files from the old ONNX files and run "main.py", it reports that these files cannot be found:
How do I get or create these files? Can you send them to me for testing?