nickgillian / grt

gesture recognition toolkit

Usable or readable way to show the output of a DTW model #87

Open JairoJs opened 8 years ago

JairoJs commented 8 years ago

Hello again, can I get some help (as you always do) with outputting the predicted class label of a DTW model in a usable way? When I perform a gesture (using a Leap Motion), the class label changes over the time I am performing the gesture, and at the end of the movement I don't always get the class that corresponds to that gesture. I am somewhat confused, since I thought I could rely on the test accuracy reported by the model at the training stage, which was 93% using 10-fold cross validation.

azarus commented 8 years ago

Are you feeding the gesture data frame by frame, or do you record the entire gesture and feed it all at once? What I did was record the gesture, store it in a MatrixDouble, and then feed the entire recorded gesture to the predict function.
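
A minimal sketch of this approach, assuming an already-trained `GRT::DTW` model; the function name `classifyRecordedGesture` and the frame buffer are placeholders, and the exact class names (`MatrixDouble`, `VectorDouble`) depend on your GRT version:

```cpp
// Record the whole gesture first, then classify it in a single predict call.
#include <GRT/GRT.h>
#include <vector>
using namespace GRT;

UINT classifyRecordedGesture( DTW &dtw, const std::vector< VectorDouble > &frames ){
    // Copy the buffered Leap Motion frames into one time series:
    // one row per frame, one column per input dimension
    MatrixDouble gesture;
    for( const VectorDouble &frame : frames ){
        gesture.push_back( frame );
    }

    // Classify the complete gesture in one call
    if( !dtw.predict( gesture ) ){
        return 0;   // prediction failed (e.g. model not trained)
    }
    return dtw.getPredictedClassLabel();   // one stable label for the whole movement
}
```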

JairoJs commented 8 years ago

Hello, thanks for answering. At first I was feeding it frame by frame, but then I implemented what you suggest by storing the data in a time-series MatrixDouble. The problem I have with this is that I haven't been able to detect the beginning of the movement, so I don't know when to start storing data. Currently I show a time window on screen to tell users when to start the gesture, so I know exactly when to start recording, but this approach throws NUI principles out the window. Do you have any suggestions on how I can tell when the gesture has begun (and ended)? Thanks again. Cheers

cyberluke commented 8 years ago

The MatrixDouble feed is only used during recording, right? Because with live data you feed it sample by sample in real time.

I'm making my own hardware and I have experience with the Kinect as well. What I built is a WiFi ring with a motion sensor. For recording gestures I also added a button. Another option is to detect a special gesture like a shake or a zoom/pinch.

One workaround I use when prototyping is to hold a wireless mouse in the other hand and click the button in the air. Or you can get a foot pedal from a MIDI controller/keyboard.

Of course this is only for the recording stage, so I think it's OK for NUI. You don't have to mark the start and end of the gesture during real-time recognition.

azarus commented 8 years ago

You can try continuously buffering your data and performing a prediction on the data recorded over the last 1-2 seconds.
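
A rough sketch of that sliding-window idea. The buffer here is a plain `std::deque` rather than anything GRT-specific, and the window size (how many frames cover 1-2 seconds) is an assumption you would tune to your sensor's frame rate:

```cpp
// Keep roughly the last 1-2 seconds of frames and re-run the DTW prediction on
// that window every time a new frame arrives.
#include <GRT/GRT.h>
#include <deque>
using namespace GRT;

class SlidingWindowRecognizer{
public:
    SlidingWindowRecognizer( DTW &dtw, size_t windowSize )
        : dtw(dtw), windowSize(windowSize){}

    // Call once per incoming sensor frame; returns 0 until a prediction is available
    UINT update( const VectorDouble &frame ){
        buffer.push_back( frame );
        if( buffer.size() > windowSize ) buffer.pop_front();
        if( buffer.size() < windowSize ) return 0;   // not enough data yet

        // Build a time series from the buffered frames
        MatrixDouble window;
        for( const VectorDouble &f : buffer ) window.push_back( f );

        if( !dtw.predict( window ) ) return 0;
        return dtw.getPredictedClassLabel();
    }

private:
    DTW &dtw;
    size_t windowSize;
    std::deque< VectorDouble > buffer;
};
```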

JairoJs commented 8 years ago

Hey, thanks. Yes, I am currently doing that, and it is the best approach so far, but I am getting conflicts with the other gestures, the static ones. I really appreciate your help. I will keep you updated.

JairoJs commented 8 years ago

Hello @cyberluke, I am having trouble understanding you. I am not building hardware, and I can't add a physical button because that is not how the system is intended to work. Thanks for the reply though.

cyberluke commented 8 years ago

@JairoJs ...for static gestures you use a classification algorithm like ANBC. For dynamic gestures that change over time, you use a time-series algorithm like DTW. Of course there can be a situation where you perform a movement for DTW but start from a static posture, which will be recognized by ANBC. Then you have to set a priority, e.g. prefer the DTW result over ANBC, and add a small delay for static postures, so the user has to hold still for a moment. What works for me is to simplify the solution and use decomposition: split a complex task into several simpler tasks (see the sketch below). Then you will get better prediction results. It's like mixing apples and oranges if you put everything into one pipeline.
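
One way this decomposition could look, assuming two separately trained `GestureRecognitionPipeline` instances (one wrapping DTW, one wrapping ANBC). The 0.6 likelihood threshold and the "DTW wins when confident" rule are arbitrary placeholders, not GRT defaults:

```cpp
// Two separate pipelines: DTW for dynamic gestures, ANBC for static postures.
// The DTW result takes priority whenever it is confident enough.
#include <GRT/GRT.h>
using namespace GRT;

struct CombinedResult{ UINT classLabel; bool fromDTW; };

CombinedResult classifyFrame( GestureRecognitionPipeline &dtwPipeline,
                              GestureRecognitionPipeline &anbcPipeline,
                              const VectorDouble &frame ){
    CombinedResult result{ 0, false };

    // Dynamic gestures first: if DTW is confident, prefer it
    if( dtwPipeline.predict( frame ) &&
        dtwPipeline.getPredictedClassLabel() != 0 &&
        dtwPipeline.getMaximumLikelihood() > 0.6 ){
        result.classLabel = dtwPipeline.getPredictedClassLabel();
        result.fromDTW = true;
        return result;
    }

    // Otherwise fall back to the static-posture classifier
    if( anbcPipeline.predict( frame ) ){
        result.classLabel = anbcPipeline.getPredictedClassLabel();
    }
    return result;
}
```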

cyberluke commented 8 years ago

@azarus regarding buffering 1-2 seconds of data... isn't GRT doing this internally? For DTW you have a time-series data class, so it isn't just one vector, it's an array of vectors. Using real-time prediction, you feed it sample by sample and it will recognize the gesture at the right time. Custom buffering just adds another layer of delay. The only case where I got a better DTW result was when I sent sensor data only while pressing a button.
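
A sketch of this sample-by-sample mode: a trained `DTW` classifier is fed one frame at a time and keeps its own internal buffer. Null rejection is what lets the label stay at 0 when no trained gesture is being performed; the coefficient 3.0, the file name `LeapGestures.grt`, and `getNextLeapFrame()` are all illustrative assumptions:

```cpp
// Real-time, sample-by-sample DTW prediction with null rejection.
#include <GRT/GRT.h>
using namespace GRT;

VectorDouble getNextLeapFrame();   // hypothetical: pull one frame from the Leap Motion

int main(){
    DTW dtw;
    dtw.enableNullRejection( true );   // reject frames that match no trained gesture
    dtw.setNullRejectionCoeff( 3.0 );  // example value, tune per data set

    TimeSeriesClassificationData trainingData;
    if( !trainingData.loadDatasetFromFile( "LeapGestures.grt" ) ) return 1;  // hypothetical file
    if( !dtw.train( trainingData ) ) return 1;

    while( true ){   // main sensor loop
        VectorDouble frame = getNextLeapFrame();
        if( !dtw.predict( frame ) ) continue;

        UINT label = dtw.getPredictedClassLabel();  // 0 means "no gesture" with null rejection
        if( label != 0 ){
            // A trained gesture is currently being matched
        }
    }
    return 0;
}
```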

azarus commented 8 years ago

@cyberluke It does, that's what I meant originally: store the data in GRT's time-series data class and then send it to the predict function when the recording is done.