Is it recommended to edit the dictionary videoname_to_object by adding a new key-value pair for the red_block? Note: the mentioned dictionary can be found here in the datareader.py file, inside the YcbineoatReader class.
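For clarity, this is the kind of change I had in mind (a minimal sketch; the existing entries are abbreviated from memory, and the red_block key and value names are my own assumption, where the key would be the demo video folder name and the value the object/mesh name):

```python
# Sketch of videoname_to_object inside YcbineoatReader in datareader.py.
# Existing entries abbreviated; the red_block entry is hypothetical and only
# illustrates the question above.
videoname_to_object = {
    'bleach0': '021_bleach_cleanser',   # example of an existing YCB entry (from memory)
    # ... other existing entries ...
    'red_block': 'red_block',           # proposed new key-value pair (assumed names)
}
```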
I was thinking about the error message: could it be attributed to the mask's file format? The mustard0 demo has the mask file in .png format, whereas the provided binary mask in the red_block demo is a .jpg.
This solved the error, but the tracking was not accurate; I will be sharing the sample video.
mustard0 mask
red_block mask
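In case it is useful to others, this is roughly the conversion that resolved the error (a minimal sketch; the file names are placeholders and the threshold is an assumption used to keep the mask strictly binary):

```python
import cv2

# Read the JPEG mask (JPEG compression can introduce non-binary pixel values),
# re-binarize it, and save it as a lossless PNG.
mask = cv2.imread('red_block_mask.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder path
_, mask_bin = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
cv2.imwrite('masks/1714129801937.png', mask_bin)                # name should match the first rgb frame
```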
Update: @wenbowen123 I am able to run the model for the novel object (red wooden block); however, it is not able to calculate the 6D pose as well as in the Mustard Bottle demo.
Issue #44 was really helpful as a reference.
I have attached the link to the detection and tracking video file as a Google Drive File for your kind reference.
Hi, your gdrive link is not accessible.
In args, can you set the debug level to 3 and share the logged debug folder? This will save more running details and I can help check further.
Thank you for the response. I have set the debug variable to level 3 and have also added the reference files to the Google Drive Folder Here.
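For reference, these are roughly the debug-related arguments in run_demo.py that I changed (paraphrased from the script; the exact defaults may differ from the current code):

```python
import argparse
import os

code_dir = os.path.dirname(os.path.realpath(__file__))
parser = argparse.ArgumentParser()
# Paraphrased debug arguments from run_demo.py; level 3 saves the most artifacts.
parser.add_argument('--debug', type=int, default=1)
parser.add_argument('--debug_dir', type=str, default=f'{code_dir}/debug')
args = parser.parse_args()  # e.g. run the demo script with: --debug 3
```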
From the bbox viz, it seems like your depth could be wrong. However, the zip file does not contain all the debugging info, so I cannot check further. There should be other things such as scene**.ply files, model_tf.obj, viz_refine.png, viz_score.png, etc.
The name of the mask should be consistent with the name of the first frame of your rgb image. This error is likely because the mask was not read.
__
> From the bbox viz, it seems like your depth could be wrong. However, the zip file does not contain all the debugging info, so I cannot check further. There should be other things such as scene**.ply files, model_tf.obj, viz_refine.png, viz_score.png, etc.
The current file used to call the demo is run_demo_red_block_02.py (which can be found here in the Google Drive Folder). I used the trimesh scaling as shared in the comment under Issue #44 here:

```python
import trimesh

# unit_conversion returns the millimetre-to-metre factor (0.001);
# applying it rescales a CAD model authored in millimetres to metres.
scale = trimesh.units.unit_conversion('millimeters', 'meters')
mesh = trimesh.load(args.mesh_file)
mesh.apply_scale(scale)  # equivalent to mesh.apply_scale(0.001)
mesh.export('scaled_down_file.obj')
```
I have uploaded the new demo FoundationPose_demo_02.zip directory here in the Google Drive Folder.
Remarks: est_refine_iter (default=5) and track_refine_iter (default=2) DID NOT HELP. Additionally, it would be great if you could guide me on how to get all the debugging info (including the scene**.ply files, model_tf.obj, viz_refine.png, viz_score.png, etc.).
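A quick way to check whether those artifacts were actually written (a minimal sketch; the file names are the ones listed above and the debug directory path is an assumption based on the default):

```python
import glob
import os

debug_dir = 'debug'  # assumed default location under the cloned repo
expected = ['model_tf.obj', 'viz_refine.png', 'viz_score.png']
for name in expected:
    path = os.path.join(debug_dir, name)
    print(name, '->', 'found' if os.path.exists(path) else 'MISSING')

# The scene point clouds are saved as .ply files somewhere under the debug dir.
print('ply files:', glob.glob(os.path.join(debug_dir, '**', '*.ply'), recursive=True))
```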
__
> The name of the mask should be consistent with the name of the first frame of your rgb image. This error is likely because the mask was not read.

(DONE)
@KomputerMaster64 the debugging info is saved in the debug_dir: https://github.com/NVlabs/FoundationPose/blob/fce37c86be33ee49cf0b46ee29ed26d613521d6d/run_demo.py#L23
If you didn't change it, by default it is under the repo dir (where you cloned).
Something wrong is going on, because even the first frame is not correct. So it's probably not tracking drift but a setup issue. Send the debug folder so I can know better.
__
> @KomputerMaster64 the debugging info is saved in the debug_dir. If you didn't change it, by default it is under the repo dir (where you cloned).
Thanks for the response.
__
> Something wrong is going on, because even the first frame is not correct. So it's probably not tracking drift but a setup issue. Send the debug folder so I can know better.
I have uploaded the new debug.zip file here (it corresponds to FoundationPose_demo_02.zip) in the Google Drive Folder. The FoundationPose_demo_02.zip file has the correct camera sensor intrinsics, the aligned Depth and RGB images, as well as the CAD model with the correct scaling.
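For the depth concern raised earlier, this is roughly how I sanity-checked the depth images (a sketch under the assumption that the aligned RealSense depth is saved as 16-bit PNGs in millimetres; the file name is a placeholder):

```python
import cv2
import numpy as np

depth = cv2.imread('depth/1714129801937.png', cv2.IMREAD_UNCHANGED)  # placeholder name
print('dtype:', depth.dtype, 'shape:', depth.shape)   # expect uint16, same HxW as the rgb

# For a tabletop scene stored in millimetres, valid values should sit roughly in the
# hundreds-to-a-few-thousand range; values like 0.3-1.5 would suggest metres instead.
valid = depth[depth > 0]
print('min / median / max (raw units):', valid.min(), np.median(valid), valid.max())
```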
Check your ob_mask.png in the debug dir; it seems not aligned with your rgb, and it is also skewed. You might have done some unsuitable operation.
__
> Check your ob_mask.png in the debug dir; it seems not aligned with your rgb, and it is also skewed. You might have done some unsuitable operation.
Thank you for pointing that out. I changed the script for the binary mask generation, and the mask now looks more accurate.
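To double-check the alignment, I overlaid the regenerated mask on the corresponding RGB frame, roughly as follows (a minimal sketch; the file names are placeholders):

```python
import cv2

rgb = cv2.imread('rgb/1714129801937.png')                           # placeholder names
mask = cv2.imread('masks/1714129801937.png', cv2.IMREAD_GRAYSCALE)

# The mask must have the same resolution as the RGB frame; a mismatch here
# is what makes the overlay look skewed.
assert mask.shape[:2] == rgb.shape[:2], (mask.shape, rgb.shape)

# Blend the mask in red over the RGB frame to eyeball the alignment.
overlay = rgb.copy()
overlay[mask > 127] = (0, 0, 255)
vis = cv2.addWeighted(rgb, 0.6, overlay, 0.4, 0)
cv2.imwrite('mask_alignment_check.png', vis)
```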
I have uploaded the new debug.zip file here (it corresponds to FoundationPose_demo_04.zip) in the Google Drive Folder. The FoundationPose_demo_04.zip file has the correctly aligned Depth and RGB images.
Remarks: The estimation and tracking work really well for the first few hundred frames, i.e. up to image 1714129801937.png (157 images). After that, the predicted pose starts drifting and warping.
__ The correct mask for the first image.
__ The last frame with the correct prediction, and the frame after which the prediction was not optimal.
The bottom two images show a significant jump, which is not very suitable for tracking. You can increase the camera frame rate if possible. Otherwise, you can also try a higher iter: https://github.com/NVlabs/FoundationPose/blob/fce37c86be33ee49cf0b46ee29ed26d613521d6d/run_demo.py#L21
> The bottom two images show a significant jump, which is not very suitable for tracking. You can increase the camera frame rate if possible. Otherwise, you can also try a higher iter.
Hi @wenbowen123
I am sorry for the delayed response.
I ran the experiments with values of the track_refine_iter variable ranging from 2 to 50; however, it was observed that the tracking invariably drifted towards the end samples (the last 15 to 20 percent of the video).
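For context, this is roughly where those parameters enter the pipeline (paraphrased from the frame loop in run_demo.py; the exact call signatures may differ from the current code):

```python
# Paraphrased from run_demo.py: est_refine_iter is used once for the initial
# registration on the first frame, track_refine_iter on every subsequent frame.
for i in range(len(reader.color_files)):
    color = reader.get_color(i)
    depth = reader.get_depth(i)
    if i == 0:
        mask = reader.get_mask(0).astype(bool)
        pose = est.register(K=reader.K, rgb=color, depth=depth, ob_mask=mask,
                            iteration=args.est_refine_iter)     # first frame only
    else:
        pose = est.track_one(rgb=color, depth=depth, K=reader.K,
                             iteration=args.track_refine_iter)  # every later frame
```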
I would be delighted to share the files.
Regards, Prabhav
__
Edit: I am sharing the failure case for the run with the track_refine_iter variable set to 50.
Remarks: For the shown case, the tracking failure happens towards the end of the samples. The tracking is optimal up to image 1714129813249.png, i.e. for the first 410 samples (out of a total of 619 image samples).
Additional Remarks: This is an improvement over the first case, where the track_refine_iter variable was set to 2 and the tracking was optimal only for the first 157 image samples.
I have uploaded the new FoundationPose_demo_05_eri-5_tri-50.zip file here (it corresponds to the run with the track_refine_iter variable set to 50) in the Google Drive Folder.
Question/Note: The total size of the debugging directories for the 49 experiments is 7 GB. It would be great if you could confirm whether it is required to share all of the experiments' files.
__ The last frame with the correct tracking (top), and the frame after which the tracking was not optimal (bottom).
The run_demo.py file was modified for the novel red wooden block object. I have referred to Issue #44 here, which was helpful; however, I require further guidance.
I provided the respective files in the subdirectories rgb, depth, mask, and mesh. Additionally, I provided the cam_K.txt file. I also changed the parser arguments accordingly. I have added the reference files to the Google Drive Folder Here: run_demo_red_block.py, cam_K.txt, AND the other files in rgb, depth, mask, and mesh have been added.
Note: I updated the K matrix corresponding to the Intel RealSense camera by using the command rostopic echo /camera/color/camera_info.
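For reference, this is roughly how the camera_info output maps to cam_K.txt (a minimal sketch; the K field layout follows the standard sensor_msgs/CameraInfo message, the numeric values are placeholders, and the plain-text 3x3 layout of cam_K.txt is an assumption based on the demo data):

```python
import numpy as np

# sensor_msgs/CameraInfo stores K as a row-major 9-element list:
# [fx, 0, cx, 0, fy, cy, 0, 0, 1] (placeholder values below).
K_flat = [615.0, 0.0, 320.0,
          0.0, 615.0, 240.0,
          0.0, 0.0, 1.0]

K = np.array(K_flat, dtype=np.float64).reshape(3, 3)
np.savetxt('cam_K.txt', K)  # one row of the 3x3 intrinsics matrix per line
```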
I am getting the main error as:
I am using an NVIDIA RTX A4500 and an Intel Xeon Silver CPU (details about the hardware have been mentioned at the very end)
Note: Given below are the outputs for the run_demo_red_block.py command and for the following commands: conda info, conda list, g++ --version, nvcc --version, lscpu, and nvidia-smi.
Here is the output for the command python run_demo_red_block.py (this is the full output of the run):
Here is the output for the command conda info:
Here is the output for the command conda list:
Here is the output for the command g++ --version:
Here is the output for the command nvcc --version:
Information about the CPU (lscpu): Intel CPU Information
Information about the GPU (nvidia-smi): NVIDIA GPU Information