Seems to be similar to #56. You might be running an older version of PyTorch. Can you upgrade to PyTorch 1.7 and try again?
Well, crap... OK, I created a virtual env with conda with Python 3.7 and installed the packages with `conda install opencv pytorch torchvision -c pytorch`, and now it seems to be working again. Thanks!
conda create --name midas python=3.7
conda activate midas
conda install opencv pytorch torchvision -c pytorch
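For anyone landing here later, a quick sanity check (not from the original thread) that the fresh env picked up a recent enough PyTorch, since the error above was tied to an old version:

```python
# Run inside the freshly created conda env to confirm the versions.
import cv2
import torch
import torchvision

print("PyTorch:", torch.__version__)          # should be >= 1.7 per the comment above
print("torchvision:", torchvision.__version__)
print("OpenCV:", cv2.__version__)
print("CUDA available:", torch.cuda.is_available())
```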
By the way, has anyone tried this on equirectangular images yet? I do a lot of 360 photography and I'm wondering if this would help with depth map creation.
Great!
With respect to equirectangular images: we haven't tried, but we would be curious to hear what you find out. The extreme distortions encountered in spherical cameras are not present in the training datasets, so I don't really have high expectations here. However, MiDaS has surprised us before; for example, it works surprisingly well when applied to cartoons and paintings.
Here is an example of a 360 pano and then the depth map created from it. Kind of a complex example...
Seems like the sharp edges of things get a bit fuzzy, not as sharp as I would expect. Are there any additional parameters I can play around with? And one last question: would it be possible to train on some sort of indoor 360 panos from tours? How many images would we need?
Thanks, works better than I would have expected. With respect to parameters, there really is not much you can play with. The only thing is the resolution at which prediction happens (the resize operation in the transform, currently at 384x384). Increasing this might give sharper results, but our experience is also that while results might look better, they are overall less accurate. You could also try to apply MiDaS to the raw images and then stitch them together, same as you do with the RGB images.
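If you want to experiment with that resolution, here is a minimal sketch of what the change could look like, assuming the `Resize`/`NormalizeImage`/`PrepareForNet` transforms from the repo's `midas/transforms.py` (exact argument names may differ between versions, so check against your checkout):

```python
import cv2
from torchvision.transforms import Compose
from midas.transforms import Resize, NormalizeImage, PrepareForNet

# Same pipeline as in run.py, but with the prediction resolution raised
# from 384x384 to 512x512. Larger values can look sharper but, as noted
# above, tend to be less accurate overall. 512 is just an example value.
net_w, net_h = 512, 512

transform = Compose([
    Resize(
        net_w,
        net_h,
        resize_target=None,
        keep_aspect_ratio=True,
        ensure_multiple_of=32,  # the network expects dimensions divisible by 32
        resize_method="upper_bound",
        image_interpolation_method=cv2.INTER_CUBIC,
    ),
    NormalizeImage(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    PrepareForNet(),
])

# The MiDaS transforms operate on dict samples, e.g.:
#   img_input = transform({"image": img})["image"]
```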
The required number of images is really hard to tell in advance, as it depends on your accuracy requirements, the diversity of the data, as well as the quality of the ground truth. Obviously, the more images the better. Maybe the ReDWeb dataset can act as guidance: it contains about 3500 images with very diverse content and reasonably good ground truth. This already gets you quite far.
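To give a feel for what such training data looks like on the PyTorch side, here is a minimal, hypothetical sketch of a dataset of RGB/depth pairs; the directory layout and file names are made up for illustration and are not the actual ReDWeb format:

```python
import os
import cv2
import numpy as np
import torch
from torch.utils.data import Dataset

class PanoDepthDataset(Dataset):
    """Minimal (image, depth) pair loader.

    Assumed (made-up) layout with matching file stems:
        root/images/0001.jpg   root/depth/0001.png
    """

    def __init__(self, root):
        self.image_dir = os.path.join(root, "images")
        self.depth_dir = os.path.join(root, "depth")
        self.names = sorted(os.listdir(self.image_dir))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        # RGB image in [0, 1], channels-first for PyTorch
        img = cv2.imread(os.path.join(self.image_dir, name))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
        img = torch.from_numpy(img.transpose(2, 0, 1))

        # Ground-truth depth assumed to be stored as a PNG with the same stem
        stem = os.path.splitext(name)[0]
        depth = cv2.imread(os.path.join(self.depth_dir, stem + ".png"),
                           cv2.IMREAD_UNCHANGED).astype(np.float32)
        return img, torch.from_numpy(depth)
```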
> You could also try to apply MiDaS to the raw images and then stitch them together, same as you do with the RGB images.
What do you mean by the raw images? They are fisheye and would probably not work as well?
> Maybe the ReDWeb dataset can act as guidance: it contains about 3500 images with very diverse content and reasonably good ground truth.
Hmm, how can I train on this dataset? I downloaded it and it has a lot of images. I have a few Titan RTXs. We in the 360 virtual tour community are looking for solutions to help us get better depth. Again, thank you for taking the time to reply! Cheers
So I just updated today on my Ubuntu 18.x box, which seems to have broken something for me, since I can no longer run it. I had just done a successful test right before doing the pull. Thoughts?