Realtime? - Githubissues

mikecann commented 3 years ago

Hi,

I am thinking about working on a project for tracking a squash ball on a squash court in real-time. This is a difficult problem to solve and I have been looking for existing libraries to help.

I was wondering if this library could do real-time or does it only support post-processing?

gwjensen commented 3 years ago

This library only supports post-processing, as it was developed under the constraint of using high-speed cameras (greater than 700 fps). For a camera of this speed to be real-time, one has less than 1.5 ms to process each image before a new one is received.
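To put a number on that budget, here is a minimal sketch of the arithmetic. The function name is just for illustration and is not part of SnakeStrike.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame before the next one arrives."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Compare a few frame rates; 700 fps leaves roughly 1.43 ms per frame.
for fps in (60, 240, 700):
    print(f"{fps:>4} fps -> {frame_budget_ms(fps):.2f} ms per frame")
```

Any capture, transfer, and detection work has to fit inside that window (or be pipelined) for the system to keep up with the camera.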

That being said, I can offer some suggestions of how one would do such a project as you describe, but I would need a little more information.

1. What is your definition of real-time? What speed are you looking to achieve? How much lag is tolerated? Humans can only process around 50 fps visually anyway, so going much higher than that doesn't help if the result is to be viewed live. Where the higher speeds can help, however, is in tracking the ball: the ball moves less between frames, which makes it easier for an algorithm such as a filter to follow its trajectory.
2. How many cameras are you using? Do they all see the same thing, or are they only marginally overlapping?
3. Does the system need to calculate the 3D trajectories in real-time (as defined in question 1)?
4. How much will the players be blocking the camera(s)' view of the ball?
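As a rough illustration of why higher frame rates make tracking easier, the sketch below computes how far the ball travels between consecutive frames. The 50 m/s ball speed is a placeholder assumption, not a measured value; substitute whatever speeds you actually observe.

```python
ASSUMED_BALL_SPEED_M_S = 50.0  # hypothetical ball speed, for illustration only


def displacement_per_frame_m(speed_m_s: float, fps: float) -> float:
    """Distance the ball travels between two consecutive frames."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return speed_m_s / fps


for fps in (50, 240, 700):
    d = displacement_per_frame_m(ASSUMED_BALL_SPEED_M_S, fps)
    print(f"{fps:>4} fps -> ball moves about {d * 100:.1f} cm between frames")
```

The smaller the jump between frames, the smaller the search window a tracking filter needs around its predicted position.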

mikecann commented 3 years ago

Oh nice, thank you for offering your advice. Just a little more background first before I get to your specific questions.

We want to build a system such as the one shown here: https://www.youtube.com/watch?v=z8uK_ugHzcE. That system, however, costs something like $50k to buy, and you have to fit out the whole court. We were wondering if it would be possible to build something similar using some high-speed cameras and a projector.

1. We want to be able to calculate the exact location where the ball contacts the back wall, and ideally also know the velocity it was travelling at the time. The framerate isn't set in stone, but I suspect it should be as low as possible while still getting the job done, given the hardware requirements of higher framerates.
2. If we mount the cameras in the back two corners of the court, we were hoping to get away with just two cameras. There might be some minimal obstruction when the player is near the front of the court, but it should be okay 99% of the time.
3. 3D trajectories aren't necessary, but they would be nice to have, and I think they might be needed for ball velocity.
4. I don't think they will be blocking all that often.

We have some example footage taken from an iphone at various framerates from the back of the court here: https://drive.google.com/drive/folders/1-2lkG9TtgUyUPxOxsl3YVt50m6KhRIbo?usp=sharing

We obviously would use proper high-speed cameras, but our budget is limited, so they would probably have to be fairly inexpensive ones.

Again I really appreciate you taking the time to help me think this through.

gwjensen commented 3 years ago

That YouTube video was very helpful for me to better understand the goal. There are a few things in this specific type of scenario that help make this problem quite a bit easier. The first thing is, I'm assuming, that the court is a standard size. This means that we know a priori the size of the back wall, the length of the standard markers on the court, and the size of the ball, which is also standard. From this knowledge alone, one can, in theory, use a single camera to track the ball. Several constraints would have to be put into the system when calculating the position of the ball, but it is possible (the reality might not be quite as simple).
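One way to see how a known ball size gives depth from a single camera is the pinhole relation Z = f * D / d, where D is the real diameter and d the apparent diameter in pixels. The sketch below is only an illustration under stated assumptions: an ideal, undistorted pinhole camera, a nominal ball diameter of 40 mm (check the official ball specification), and made-up intrinsics. None of the names come from SnakeStrike.

```python
from dataclasses import dataclass

BALL_DIAMETER_M = 0.040  # assumed nominal squash ball diameter


@dataclass(frozen=True)
class Intrinsics:
    """Hypothetical pinhole intrinsics, in pixels."""
    focal_px: float
    cx: float
    cy: float


def ball_position_m(u, v, diameter_px, cam, ball_diameter_m=BALL_DIAMETER_M):
    """Back-project an image detection to camera coordinates (x, y, z) in metres.

    (u, v) is the ball centre in pixels, diameter_px its apparent diameter.
    """
    if diameter_px <= 0:
        raise ValueError("diameter_px must be positive")
    z = cam.focal_px * ball_diameter_m / diameter_px  # depth from known size
    x = (u - cam.cx) * z / cam.focal_px
    y = (v - cam.cy) * z / cam.focal_px
    return x, y, z


cam = Intrinsics(focal_px=1400.0, cx=640.0, cy=360.0)  # made-up values
for d_px in (8.0, 9.0):
    x, y, z = ball_position_m(640.0, 360.0, d_px, cam)
    print(f"apparent diameter {d_px:.0f} px -> depth {z:.2f} m")
```

With these made-up numbers a one-pixel error in the measured diameter shifts the depth estimate by most of a metre, which is part of why the reality is not quite as simple and why extra constraints (court geometry, a motion model) would be needed.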

The big task for an online method is determining where the ball is (is it even in frame?). This also includes finding the actual true position of the ball. What I mean by this is that, depending on the speed of the ball and the speed of the camera, there will be ghosting effects on the ball. That said, I think ellipse fitting methods constrained to a specific size should work well here. Deformation of the ball on contact might also have to be taken into consideration, but that's also a solvable problem, in my opinion.
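A minimal sketch of one size-constrained ellipse check, assuming the ball has already been segmented into a blob of foreground pixels. It uses image moments (the ellipse with the same second moments as the blob) rather than a least-squares contour fit, which is a simplification chosen here for illustration. The function names and the 25% tolerance are hypothetical.

```python
import math


def fit_blob_ellipse(pixels):
    """Moment-based ellipse for a blob given as (x, y) pixel coordinates.

    Returns (cx, cy, major_len, minor_len, angle_rad), where the lengths are
    full axis lengths of the ellipse with the same second moments as the blob.
    """
    pts = list(pixels)
    n = len(pts)
    if n < 3:
        raise ValueError("need at least 3 pixels")
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    mxx = sum((x - cx) ** 2 for x, _ in pts) / n
    myy = sum((y - cy) ** 2 for _, y in pts) / n
    mxy = sum((x - cx) * (y - cy) for x, y in pts) / n
    # Eigenvalues of the 2x2 covariance matrix give the axis variances.
    mean = (mxx + myy) / 2.0
    spread = math.hypot((mxx - myy) / 2.0, mxy)
    # A filled ellipse with semi-axis a has variance a^2 / 4 along that axis.
    major = 4.0 * math.sqrt(mean + spread)
    minor = 4.0 * math.sqrt(max(mean - spread, 0.0))
    angle = 0.5 * math.atan2(2.0 * mxy, mxx - myy)
    return cx, cy, major, minor, angle


def looks_like_ball(minor_len, expected_diameter_px, tolerance=0.25):
    """Size constraint: the short axis should match the expected ball diameter."""
    return abs(minor_len - expected_diameter_px) <= tolerance * expected_diameter_px


# Synthetic motion-blurred ball: a horizontal streak 24 px long and 10 px thick.
streak = [(x, y) for x in range(60) for y in range(40)
          if ((x - 30) / 12.0) ** 2 + ((y - 20) / 5.0) ** 2 <= 1.0]
cx, cy, major, minor, angle = fit_blob_ellipse(streak)
print(f"centre=({cx:.1f}, {cy:.1f}) major={major:.1f} minor={minor:.1f}")
print("passes size check:", looks_like_ball(minor, expected_diameter_px=10.0))
```

Under this simplification the short axis approximates the ball diameter, the extra length of the long axis approximates the blur from motion during the exposure, and the centroid approximates the ball position at mid-exposure.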

Watching the 240 fps video, it seems that something around that range should be fine for an initial prototype. It doesn't make sense to go much higher than that unless you know you need it, and based on what I've seen so far, I'm not sure you do. The good thing is that this speed, or thereabouts, is fairly easy to find in the industrial camera market, e.g. the types of cameras used on conveyor belts to look for bad fruit or faulty items in a factory. They are usually pretty reasonably priced, around $300-500. But you'll need to do some tests to see what kind of resolution and focal length you need for this application.

Using two cameras would, obviously, allow you to do stereo vision and get depth that way, but in my experience 3 cameras is a much better minimal setup than 2. The main reason is that 3 cameras give a more stable configuration when it comes to optimizing the camera matrices in the presence of noise. Incidentally, more than 1 camera also means the cameras need to be configured any time they move. For this reason, I think an initial prototype would be easier to do with 1 camera, moving to a larger number only if the single camera isn't enough. The benefit is that a lot of the unknowns regarding the application space can be addressed and figured out using the single camera, and additional cameras would build upon that. Plus, if you can't get the real-time processing working for a single camera, how are you going to do it for 2+? :)

mikecann commented 3 years ago

Awesome, thanks for your helpful advice.

So the hardware is feasible; the question then becomes the software. Is this software capable of doing it in real-time?

Are we even looking at this the wrong way? If we are only interested in the position where the ball strikes the wall and not the velocity of impact, could we perhaps get away with a pair of ultrasonic sensors?

gwjensen commented 3 years ago

If by "this" software you mean SnakeStrike, then no; it wasn't written for real-time use because of the previously mentioned camera constraints.

If instead by "this" you mean software encompassing the methods I outlined above, then the answer is maybe, but it is highly dependent on the implementation and the heuristics/constraints. For example, one of the largest computational costs will be determining whether a ball is in frame or not. How this function is implemented could mean the difference between a very fast system and a very sluggish one. The reason is that if it can be kept computationally cheap, then roughly half of the time an image won't need to be processed further, as the ball is out of view. Regarding real-time, there are also tricks that can be played with processing and location, e.g. we don't care about the ball after it has struck the wall and is on its way back, which further reduces the required processing. However, you don't necessarily need real-time for your usage; what you really need is a response time fast enough that it doesn't interfere with play, but not really any faster. This provides another processing buffer.
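To illustrate the idea of a cheap in-frame check that gates the expensive processing, here is a minimal sketch using subsampled frame differencing. Frame differencing is an assumption made for this example, not something SnakeStrike does, and the thresholds are placeholders. Frames are plain lists of grayscale rows to keep the example dependency-free.

```python
def ball_may_be_present(prev, curr, step=4, diff_threshold=25, min_changed=3):
    """Cheap gate: look at every `step`-th pixel and stop early once enough changed."""
    changed = 0
    for r in range(0, len(curr), step):
        prev_row, curr_row = prev[r], curr[r]
        for c in range(0, len(curr_row), step):
            if abs(curr_row[c] - prev_row[c]) > diff_threshold:
                changed += 1
                if changed >= min_changed:
                    return True
    return False


def frames_needing_detection(frames, **gate_kwargs):
    """Indices of frames that should be handed to the expensive ball detector."""
    return [i for i in range(1, len(frames))
            if ball_may_be_present(frames[i - 1], frames[i], **gate_kwargs)]


def blank(height=32, width=32, value=10):
    return [[value] * width for _ in range(height)]


# Synthetic clip: the scene is static except for a bright 8x8 patch in frame 2.
empty = blank()
with_ball = blank()
for r in range(8, 16):
    for c in range(8, 16):
        with_ball[r][c] = 200

clip = [empty, empty, with_ball, with_ball, empty]
print(frames_needing_detection(clip))
```

In this synthetic clip only the frames where the patch appears or disappears pass the gate. In a real court the players also move, so a practical gate would likely need a region of interest or a size filter as well; the point is only that a cheap test can keep most frames away from the expensive path.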

In a nutshell, my gut feeling is that a system such as the one I have outlined above could run online in such a way that the player wouldn't notice any real delay, but this depends heavily on the implementation of the software and the hardware resources available for processing.

mikecann commented 3 years ago

Sorry for the long delay. Yes by "this" I meant SnakeStrike.

Okay it sounds like we would have some coding to do if we wanted to do this.

Cheers.

gwjensen / SnakeStrike

Realtime? #1