fabiotosi92 / NeRF-Supervised-Deep-Stereo

A novel paradigm for collecting and generating stereo training data using neural rendering
https://nerfstereo.github.io/
MIT License

Is there any method to determine if the generated dataset is qualified? #25

Closed guwenxiang1 closed 1 year ago

guwenxiang1 commented 1 year ago

I have generated some datasets using a similar method, but I am unsure how to tell whether they have any issues. To check, I evaluated them with the triplet photometric loss described in the paper. Is this approach reasonable? When you generate datasets, do you use specific metrics to evaluate them, or do you only judge them by their performance as training data? I am concerned that after building the complete dataset I may find the results unsatisfactory, indicating a problem in an earlier step. How do you avoid this issue during dataset creation?

guwenxiang1 commented 1 year ago

Below is the code I used for testing; the helper functions (`image_loss`, and `depth2disparity` when converting depth maps) are the same as those in this repository.

```python
import os

import torch
from PIL import Image
from torchvision import transforms


def readImgAndDisparity(file_paths):
    """Load each file as a (1, C, H, W) CUDA tensor."""
    transform = transforms.ToTensor()
    imgs_tensor = []
    for fp in file_paths:
        img = Image.open(fp)
        tensor = transform(img).unsqueeze(0).to('cuda')
        imgs_tensor.append(tensor)
    return imgs_tensor


def Cal_Photometric_loss_in_files(fps, depth=False):
    # Sort each folder listing so the four streams stay aligned frame by frame.
    flists = [sorted(os.listdir(fp)) for fp in fps]
    flen = len(flists[0])
    print(flen)
    loss = 0
    for idx in range(flen):
        fp = [fps[jdx] + '/' + flists[jdx][idx] for jdx in range(4)]
        imgs = readImgAndDisparity(fp)  # [disparity, center, left, right]
        if depth:
            imgs[0] = depth2disparity(imgs[0])  # convert depth to disparity first
        loss = loss + image_loss(imgs[0], imgs[1], imgs[2], imgs[3])
    return loss / flen


folder_paths = [
    'stereo_dataset_v1_part1/0000/Q/baseline_0.50/disparity',
    'stereo_dataset_v1_part1/0000/Q/center',
    'stereo_dataset_v1_part1/0000/Q/baseline_0.50/left',
    'stereo_dataset_v1_part1/0000/Q/baseline_0.50/right',
]
print(folder_paths)
print(Cal_Photometric_loss_in_files(folder_paths))
```

fabiotosi92 commented 1 year ago

There are several methods you can employ (even in combination) to assess the validity of the generated data, which we have also outlined in the supplementary material:

  1. Adjust the rendered disparity maps generated by Instant-NGP by fitting a scale-shift pair of values for each triplet, optimized in a self-supervised manner (see the sketch after this list). (Supplementary, Section 1. Additional Implementation Details)
  2. Validate against any existing stereo algorithm (both traditional and deep learning-based) to ensure consistency in the disparities. (Supplementary, Section 4. Ambient Occlusion as Rendering Confidence)
  3. Exploit the uncertainty map (AO) to discard synthesized RGB images with high uncertainty levels. (Supplementary, Section 5. Rendering Failure Cases)
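
A minimal sketch of the scale-shift fitting in method 1, assuming a standard rectified setup; `warp_right_to_left`, the plain L1 photometric term, and the hyperparameters are illustrative choices, not the exact implementation from the paper:

```python
import torch
import torch.nn.functional as F

def warp_right_to_left(right, disp):
    """Backward-warp the right image to the left view using a (B,1,H,W) disparity."""
    b, _, h, w = right.shape
    xs = torch.linspace(-1, 1, w, device=right.device).view(1, 1, w).expand(b, h, w)
    ys = torch.linspace(-1, 1, h, device=right.device).view(1, h, 1).expand(b, h, w)
    # A shift of `disp` pixels equals 2*disp/(w-1) in normalized grid coordinates.
    grid = torch.stack((xs - 2.0 * disp.squeeze(1) / (w - 1), ys), dim=-1)
    return F.grid_sample(right, grid, align_corners=True)

def fit_scale_shift(disp, left, right, iters=200, lr=1e-2):
    """Optimize a per-triplet (scale, shift) so that s*disp+t best explains `left`."""
    s = torch.ones(1, device=disp.device, requires_grad=True)
    t = torch.zeros(1, device=disp.device, requires_grad=True)
    opt = torch.optim.Adam([s, t], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        warped = warp_right_to_left(right, s * disp + t)
        loss = (warped - left).abs().mean()  # L1 photometric loss
        loss.backward()
        opt.step()
    return s.detach(), t.detach()
```

Triplets whose photometric loss stays high even after this fit are good candidates for discarding.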
guwenxiang1 commented 1 year ago

Thank you for your response. For more distant scenes, say in the 0-5 meter range, how should the baseline be chosen? It seems that your dataset mainly consists of closer scenes. Do you have any experience in this regard? Is there a mapping curve or guideline for matching the baseline to the scene depth? Thank you for generously sharing your work!

fabiotosi92 commented 1 year ago

It depends significantly on the subsequent task/application that utilizes those stereo pairs. In our dataset, we actually used multiple baselines rather than a single one. We opted for closer scenes because, in general, Instant-NGP performs better in these contexts. However, other contemporary frameworks may also be applicable for larger scenes.
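
For intuition, the rectified-stereo relation `d = f * B / Z` (disparity `d` in pixels, focal length `f` in pixels, baseline `B` and depth `Z` in meters) gives a simple guideline: pick the largest baseline that keeps the nearest point's disparity within the range your downstream network handles. A rough sketch with illustrative numbers (the focal length and disparity cap below are assumptions, not values from our dataset):

```python
# Pick a baseline from the relation: disparity = focal_px * baseline / depth.
focal_px = 1000.0          # focal length in pixels (assumed)
max_disp = 192.0           # common disparity range of deep stereo networks
z_min, z_max = 1.0, 5.0    # nearest/farthest scene depth in meters (assumed)

baseline = max_disp * z_min / focal_px   # largest baseline keeping d <= max_disp
disp_far = focal_px * baseline / z_max   # disparity at the far plane
print(f"baseline = {baseline:.3f} m, "
      f"disparities span [{disp_far:.1f}, {max_disp:.0f}] px")  # 0.192 m, [38.4, 192]
```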

guwenxiang1 commented 1 year ago

> It depends significantly on the subsequent task/application that utilizes those stereo pairs. In our dataset, we actually used multiple baselines rather than a single one. We opted for closer scenes because, in general, Instant-NGP performs better in these contexts. However, other contemporary frameworks may also be applicable for larger scenes.

Thank you once again for your patient and prompt response!