facebookresearch / AR-Depth

Fast Depth Densification for Occlusion-Aware Augmented Reality

Error (Keyframe, refframe, wrong translation info ...) #2

Closed siamiz88 closed 5 years ago

siamiz88 commented 5 years ago
  1. When I run the provided code, I get the error "key frame 7". I guess this is caused by omitted or deleted image data (the images.txt file starts from 000009.png instead of 000001.png).

  2. So, I slightly fixed my code and ran it again. -> This solved the keyframe error. However, this time the code couldn't compute the reference frame: there is no difference between kf[0].Position() and kf[1].Position(). I checked the translation info in the dictionary of views and found that the entries were all the same: [ 0.17589158 -0.09372958 0.2583495 ]

    It seems that the code is not working in its current version.

holynski commented 5 years ago

Thanks for pointing this out. I'll look into this now.

siamiz88 commented 5 years ago

Thank you for your quick response. I look forward to your update :)

wlsh24 commented 5 years ago

Hello, I am also trying to run this code and here is my status:

> So, I slightly fixed my code and ran it again. -> This solved the keyframe error. However, this time the code couldn't compute the reference frame: there is no difference between kf[0].Position() and kf[1].Position(). I checked the translation info in the dictionary of views and found that the entries were all the same: [ 0.17589158 -0.09372958 0.2583495 ]

  1. It looks like the translation and orientation are overwritten every time a new view is created. This can be avoided by moving the attributes into def __init__(self):. Furthermore, I believe the orientation is not set properly: according to the pyquaternion docs (http://kieranwynn.github.io/pyquaternion/), the w, x, y, z components should be accessed using [0], [1], [2], [3] and not .w, .x, .y, .z

  2. After correcting this problem, there is also an error when computing the optical flows:

         def GetFlow(image1, image2):
             flow = cv2.calcOpticalFlowFarneback(
                 cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY),
                 cv2.cvtColor(image2, cv2.COLOR_BGR2GRAY),
                 0.5, 3, 100, 100, 7, 1.5, 0)
             a, b, c = cv2.split(image1)
             return cv2.merge((a, b))

     This function doesn't really compute the flow, and I guess it should return flow instead?

  3. Also, the following line: flow_grad_magnitude[reliability > max_reliability] = magnitude is producing an error, since you try to index a 2-D array with a boolean.

  4. Moreover, the reliability is not computed as in the paper; it is just defined as reliability = np.zeros((flow.shape[0], flow.shape[1]))

I hope this information will help in fixing the code!

holynski commented 5 years ago

Very useful and detailed comments, thanks.

It turns out the code that was released is actually not the intended (final) version, but rather a much earlier (and incomplete) iteration. Unfortunately, I no longer have access to the original repository used for development, so I'm stuck re-implementing the missing parts.

This may take me another day or two. Apologies for the trouble.

flow-dev commented 5 years ago

How are you? The demo at SIGGRAPH Asia 2018 was fantastic. I will be waiting for your re-implementation!!!

flamehaze1115 commented 5 years ago

> How are you? The demo at SIGGRAPH Asia 2018 was fantastic. I will be waiting for your re-implementation!!!

same!

holynski commented 5 years ago

Just committed a new version reimplementing the missing components. The code should now be feature-complete and working. Would you please let me know if everything works for you?

siamiz88 commented 5 years ago

Happy new year~! Thank you for the update. I will check it and leave some comments.

Could you let me know how you obtain a future frame in a real-time application? Do you delay some frames to get 'past / current / future' frames?

holynski commented 5 years ago

> Could you let me know how to obtain a future frame in the real-time application?

So, at the moment, the slow components (in order of slowest to fastest, tested on my laptop) are the following:

Although -- if you're looking to run a real-time version of this code on a mobile phone, I would strongly suggest first porting this code to C/C++. The steps above should get you most of the way, and depending on your implementation, you may already be at real-time. To improve the speed even further, there are a number of other easy optimizations (mentioned in section 5.2 of the paper) that can be made to drastically reduce runtime. These include:

> Did you delay some frames to get 'past / current / future' frames?

In the provided code, the video is read all at once, and then frames are processed sequentially. When each frame is processed, optical flow is computed to both a past frame and a future frame, so yes, there is a slight delay from real-time. If you would like the code to instead ONLY rely on previous frames (i.e. remove the latency), you can do the following:

Remove the second loop from the function GetReferenceFrames() (remove the following code):

        for idx in range(view_id - 1, self.min_view_id, -1):
            if idx not in self.views:
                continue
            if np.linalg.norm(pos - self.views[idx].Position()) > dist:
                ref.append(idx)
                break

Please let me know if this makes sense, and if you've managed to get the code working.

siamiz88 commented 5 years ago

The code is successfully running now. Contrary to my expectation, it requires significant time to generate the depth images (one image takes approximately an hour with 500 solver iterations). Is this normal '-'?

Anyway, thank you for your dedicated reply. I'm really touched.

holynski commented 5 years ago

For me, each frame is processed in about 2 minutes. This is running the sample data and with default settings on my laptop (a Macbook Pro).

It might be the case that the first image that is saved takes longer (since the code doesn't actually save the first X frames, to ensure the depth maps have initialized to a reasonable solution). If you'd like to disable this, you can set the value of skip_frames to zero before the main loop -- but keep in mind that the first few depth maps saved might not be great.

mpottinger commented 5 years ago

Just a heads up that the latest pyquaternion version has somehow broken this code. I had to downgrade to version 0.9.2 in order for it to work.

The original code seems to take about 2-4 minutes per frame on my Core i5-9600K 3.7 GHz desktop. Getting the first saved images took around an hour.

I was very interested in getting this to run in real time, so I tried my best at optimising it. I was able to get it down to 20 seconds per frame using Numba (a very nice JIT compiler for Python), and by changing the way the sparse matrix was populated before the solver.

The solver part takes 10 seconds for me, and the rest of the code takes the other 10. I am not well versed in the math behind this algorithm to use a different solver, so I think I am going to give up on getting it to run real time. Still, it was a fun exercise trying!
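For readers curious what this kind of sparse-matrix change can look like (a hypothetical toy example, not the repository's code): assembling (row, col, value) triplets with NumPy and converting once is much faster than setting entries of a `lil_matrix` one at a time from a Python loop. Here a 1-D Laplacian stands in for the solver's smoothness term:

```python
import numpy as np
import scipy.sparse as sp

def assemble_tridiag_coo(n):
    """Build a 1-D Laplacian from (row, col, value) triplet arrays,
    converting to CSR once at the end instead of inserting entries
    one-by-one inside a Python loop."""
    rows = np.concatenate([np.arange(n), np.arange(n - 1), np.arange(1, n)])
    cols = np.concatenate([np.arange(n), np.arange(1, n), np.arange(n - 1)])
    vals = np.concatenate([np.full(n, 2.0),
                           np.full(n - 1, -1.0),
                           np.full(n - 1, -1.0)])
    return sp.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()
```

The slow pattern this replaces is `A = sp.lil_matrix((n, n)); A[i, j] = v` in nested loops; the triplet form vectorizes the whole assembly.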

Looking forward to when ARKit/ARCore have this kind of functionality built in, that will be truly amazing!

holynski commented 5 years ago

> Just a heads up that the latest pyquaternion version has somehow broken this code. I had to downgrade to version 0.9.2 in order for it to work.

Thanks for pointing this out. I went ahead and updated the code to support the newest version. In doing so, I also realized that OpenCV has moved around their implementation of DISOpticalFlow, so I changed that too. You may need to upgrade to OpenCV 4.0 for everything to work. Please let me know if this works for you, so I can close this issue (I should have done this a while ago :-) )

As for timing -- if you really wanted to get a real-time version working with the smallest amount of effort, I would suggest scaling down all of the inputs, and scaling them up at the very end using joint bilateral upsampling. I'd be glad to help if you were interested in implementing this.
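That suggestion could be sketched as follows (a hypothetical, pure-NumPy, unoptimized illustration; the function name, parameters, and loop structure are mine, not the repository's): solve for depth at low resolution, then lift it to full resolution using the high-resolution image as the range-term guide.

```python
import numpy as np

def joint_bilateral_upsample(depth_low, guide, sigma_s=1.0, sigma_r=0.5, radius=2):
    """Naive joint bilateral upsampling: lift a low-res depth map to the
    guide image's resolution, weighting low-res samples by a spatial
    Gaussian and by similarity to the high-res grayscale guide.
    Assumes the guide's size is an integer multiple of the depth map's."""
    h, w = depth_low.shape
    H, W = guide.shape
    guide_low = guide[::H // h, ::W // w][:h, :w]   # crude downsampled guide
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            cy, cx = y * h / H, x * w / W           # position in low-res grid
            y0, x0 = int(round(cy)), int(round(cx))
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y0 + dy, x0 + dx
                    if not (0 <= yy < h and 0 <= xx < w):
                        continue
                    ws = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
                    dr = guide[y, x] - guide_low[yy, xx]
                    wr = np.exp(-dr * dr / (2 * sigma_r ** 2))
                    num += ws * wr * depth_low[yy, xx]
                    den += ws * wr
            out[y, x] = num / max(den, 1e-12)
    return out
```

This is the usual joint bilateral upsampling recipe (spatial weight on the low-res grid, range weight from the guide); a real-time port would vectorize these loops or use an optimized library routine.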

roxanneluo commented 5 years ago

I tested it with pyquaternion 0.9.5 and opencv 4.0.0 and it works for me.

holynski commented 5 years ago

Great -- closing this issue.

Feel free to open another one if anything else comes up.