I have several questions

JosephHuang913 commented 2 years ago

Hi Filip Markovic,

Thanks for your great repository ‘mmwave-gesture-recognition’. It really helps me a lot. However, there are still some things I don’t understand, and I would like to ask for your advice:

1. What is the definition of the variable ‘frame’ in the Python code? Is it the same as the FMCW radar frame?
2. What is the physical meaning of the column “frame” in the dataset? Why do so many rows share the same frame value? The number of rows per frame value is not constant.
3. What are the physical meanings of the other columns in the dataset, such as x, y, range_idx, peak value, doppler_idx, and xyz_q_format? What are the units and ranges of these values? Did you rescale them to between 0 and 1?

Thanks in advance for your help. I hope to hear from you soon.

vilari-mickopf commented 2 years ago

1. The data variable is a raw chunk of data of unspecified size, i.e. I read everything that is available in the serial buffer. After parser.assemble, frame is a raw bytearray containing one header and all TLVs of that frame, in bytes, as specified in the TI AWR documentation. After parser.parse, frame is a parsed dict with the keys/values specified in HEADER_FORMAT and TVLS_FORMAT.
2. frame in the CSV file is the frame count. Since one frame can contain an arbitrary number of objects, each line represents one detected object, and frame specifies which frame count that object belongs to.
3. This is an updated version, but it is very similar to the older one that I am using (I can't find the doc now). The only difference is that they have doppler in m/s, while I have the indices of the peaks in doppler and range, plus the value of the peak. And yes, I am scaling everything before passing it through the NN.

You can use the print command in console.py to peek into the data.
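To illustrate the "one line per detected object" layout described above, here is a minimal sketch that groups CSV rows by their frame count. The sample values are made up and the exact column names are assumptions based on this thread, not a confirmed description of the dataset files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: column names follow this thread, the values are made up.
SAMPLE_CSV = """frame,x,y,range_idx,peak_value,doppler_idx
0,12,340,17,512,0
0,65500,410,21,498,65535
1,30,355,18,530,1
"""


def group_objects_by_frame(csv_text):
    """Return {frame count: [row dict, ...]}, one row dict per detected object."""
    frames = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        frames[int(row["frame"])].append(row)
    return dict(frames)


frames = group_objects_by_frame(SAMPLE_CSV)
for frame_count, objects in frames.items():
    # Frames with more detected objects simply contribute more rows.
    print(frame_count, len(objects))
```

This would explain why the same frame value repeats a different number of times: the repeat count is the number of objects detected in that frame.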

JosephHuang913 commented 2 years ago

Hi Filip Markovic,

Thanks for your quick response. It really helps me a lot, and I appreciate it very much. Actually, I don’t have a TI AWR1642 mmWave sensor; I am using a Socionext SC1220AT2 mmWave sensor. The data I get from the sensor is the range profile (radar cube), with shape (NumChirps, RxAnt, RangeBin) per frame. From the range profile I can derive the Range-Doppler heatmap and the Range-Azimuth heatmap to obtain the features I need. To recognize gestures, I wonder if I can reuse the dataset you collected, or even the model you trained. However, I would first have to convert the data received from the SC1220AT2 into the format of your dataset. Thus, I have several points to clarify:

1. Are the ‘objects’ you mentioned the ‘points’ in the point cloud? In other words, the red dots on the radar screen?
2. Do the (x, y) coordinates correspond to the horizontal and vertical axes on the radar screen, respectively? What are the physical unit and physical range of x and y? The values of x and y in the dataset seem to span a 16-bit integer range (-32768 to 32767 if read as signed). Why do you divide x and y by 'xyz_q_format' (x = x/desc['xyz_q_format'] and y = y/desc['xyz_q_format'])?
3. What does ‘range_idx’ mean? Is it the index of the range bin in the range profile? If so, does a range_idx of 17, for example, mean a distance of 17 * 3.75 cm = 63.75 cm?
4. What does peak_value mean? Is it the signal strength of the object on the heatmap, in log scale?
5. Is doppler_idx also a 16-bit signed integer? Does a doppler_idx of 5 correspond to a velocity of 5 * V_res? What is the unit of velocity, m/s or cm/s?
6. The ranges of the 5 features in the dataset vary a lot. Is it reasonable to scale all the values by dividing by 65535?
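The conversions I am assuming in points 3 and 5 can be sketched as follows. This is only my guess at the mapping, not something confirmed for the dataset: the 3.75 cm bin size is the figure from point 3, and VEL_RES_M_S is a placeholder value, since the real velocity resolution depends on the chirp configuration.

```python
RANGE_RES_M = 0.0375  # 3.75 cm per range bin, the figure assumed in point 3
VEL_RES_M_S = 0.13    # placeholder velocity resolution, not taken from the dataset


def to_signed16(value):
    """Reinterpret an unsigned 16-bit integer (0..65535) as signed (-32768..32767)."""
    return value - 65536 if value > 32767 else value


def range_idx_to_m(range_idx):
    """Distance in meters, assuming equally spaced range bins starting at 0."""
    return range_idx * RANGE_RES_M


def doppler_idx_to_m_s(doppler_idx):
    """Velocity in m/s, assuming doppler_idx is a wrapped signed 16-bit bin index."""
    return to_signed16(doppler_idx) * VEL_RES_M_S


print(range_idx_to_m(17))         # distance for range bin 17
print(doppler_idx_to_m_s(65535))  # 65535 wraps to -1, i.e. one bin of negative velocity
```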

vilari-mickopf commented 2 years ago

1. Yes, objects are the red points drawn on the plotter.
2. I don't remember what units (x, y) are in, and I can't find the documentation for this. Regarding xyz_q_format, I remember seeing in the documentation that it should be used in this way (x = x/desc['xyz_q_format'] and y = y/desc['xyz_q_format']).
3. (Also 4, 5, 6.) I don't remember this either, so I can only make assumptions like you. I recorded this dataset a few years ago. I used the mmWave demo from TI, and this is what was packed inside the frames. I remember experimenting with and without range_idx/doppler_idx/peak, and getting better results with this data included.

This dataset was recorded in a day or two with only my gestures, so I would highly recommend recording your own; it won't take you long. Also, after seeing this new mmWave demo version that has only (x, y) and doppler in m/s instead of idx/peaks, I think I will probably shift to that approach sometime in the future.

JosephHuang913 commented 2 years ago

Hi Filip Markovic,

Many thanks for your response. Normally, we use meters and seconds as the physical units. Hence, I suppose the unit of distance is meters and the unit of velocity is meters per second. I derived new features such as x, y, distances, angles, and velocities from the dataset you collected. If x or y is greater than 32767, 65536 is subtracted from it; the value is then divided by 1024, since the Q format is Q.10. I also limit the range of x to between -1 meter and 1 meter, and the range of y to between 0 and 1 meter. The distance d is the square root of x squared plus y squared, so d ranges between 0 and 1.414 meters. The angle θ is the arc tangent of x over y, so θ ranges between -π/2 and π/2. Velocities are the doppler indices multiplied by the velocity resolution of the mmWave radar. They are approximately between -2 m/s and 2 m/s. Then I re-trained the model and found that the performance improved noticeably. You might consider using the features I derived from the dataset.

[Screenshots: Screenshot_20220106_201933, Screenshot_20220106_201949, Screenshot_20220106_202002]
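The x/y part of the derivation above can be sketched roughly as below. It is a minimal illustration, not the exact code I used: the function names are made up for this example, "limit the range" is implemented as clipping (dropping out-of-range points would be another option), and atan2(x, y) is used in place of atan(x / y) so that y = 0 does not cause a division by zero. For y > 0 the two give the same angle.

```python
import math

Q_SCALE = 1024  # 2**10, since the Q format is Q.10


def decode_q10(raw):
    """Convert a raw 16-bit value to meters: wrap to signed, then divide by 1024."""
    if raw > 32767:
        raw -= 65536
    return raw / Q_SCALE


def derive_features(raw_x, raw_y):
    """Return (x, y, d, theta) with x in [-1, 1] m, y in [0, 1] m, theta in radians."""
    x = max(-1.0, min(1.0, decode_q10(raw_x)))  # clip x to [-1, 1] meters
    y = max(0.0, min(1.0, decode_q10(raw_y)))   # clip y to [0, 1] meters
    d = math.hypot(x, y)                        # sqrt(x^2 + y^2), at most about 1.414
    theta = math.atan2(x, y)                    # angle measured from the y axis
    return x, y, d, theta


# Raw x = 65024 wraps to -512, i.e. -0.5 m; raw y = 512 is 0.5 m.
print(derive_features(65024, 512))
```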

vilari-mickopf commented 2 years ago

That's awesome, thanks for sharing your results. I will definitely change data handling when I find some time.

vilari-mickopf commented 11 months ago

Thank you once again for providing further valuable insights into the data. This has been addressed in the latest version.
