xyanchen / WiFi-CSI-Sensing-Benchmark

MIT License
315 stars 55 forks

How is the UT-HAR dataset preprocessed? #12

Closed n830024282 closed 4 months ago

n830024282 commented 7 months ago

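How was the UT-HAR dataset preprocessed? Presumably it is not the dataset from Wifi_Activity_Recognition using LSTM used as-is? After printing the data, I would like to know what each column represents. The first column does not appear to be a timestamp, so which part holds the timestamps?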

Marsrocky commented 4 months ago

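Each UT-HAR recording is one continuous segment of shape subcarrier x time_length. It is not segmented and has no ground truth. So, as the paper states, we could only use a sliding window to cut it into roughly 5000 samples, each of shape subcarrier x time.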
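For readers unfamiliar with sliding-window segmentation, here is a minimal sketch of the idea described above. It is not the benchmark's actual preprocessing code: the function name `sliding_window_segments`, the toy recording size, and the `window` and `stride` values are all hypothetical, since the thread does not state the real ones.

```python
import numpy as np

def sliding_window_segments(csi, window, stride):
    """Cut a (subcarriers, time) recording into (n, subcarriers, window) samples."""
    n_sub, n_time = csi.shape
    if n_time < window:
        # Recording is shorter than one window: no samples can be cut.
        return np.empty((0, n_sub, window), dtype=csi.dtype)
    starts = range(0, n_time - window + 1, stride)
    return np.stack([csi[:, s:s + window] for s in starts])

# Toy recording: 4 "subcarriers" x 500 time steps (hypothetical sizes).
recording = np.arange(4 * 500, dtype=np.float32).reshape(4, 500)
samples = sliding_window_segments(recording, window=100, stride=20)

# Consecutive windows share (window - stride) time steps.
overlap = 100 - 20
print(samples.shape, overlap)
```

With a stride smaller than the window, neighbouring samples repeat most of their content, which is the "repeated data among samples" drawback the paper mentions.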

XGBoost commented 3 months ago

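@Marsrocky Thanks for the reply, and great work. It makes things much easier for people doing WiFi sensing research later on! A question for the author: the data I extracted from excel has shape (3977, 250, 90). Is 250 the number of subcarriers and 90 the time dimension? Also, do different samples overlap along the time dimension, and if so, by how much? Thanks.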

XGBoost commented 3 months ago

@Marsrocky However, the paper says there are 30 subcarriers and three receiving antennas, so could the dimension of size 90 be the subcarrier dimension (3 x 30)? “UT-HAR [33] is the first public CSI dataset for human activity recognition. It consists of seven categories and is collected via Intel 5300 NIC with 3 pairs of antennas that record 30 subcarriers per pair. All the data is collected in the same environment. However, its data is collected continuously and has no golden labels for activity segmentation. Following existing works [56], the data is segmented using a sliding window, inevitably causing many repeated data among samples. Hence, though the total number of samples reaches around 5000, it is a small dataset with intrinsic drawbacks.”
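If that reading is right, the arithmetic and the reshape could be checked as in the sketch below. This is only an illustration under unconfirmed assumptions: that the last axis of size 90 is 3 antenna pairs x 30 subcarriers, and that the values are stored antenna-major (all 30 subcarriers of pair 0, then pair 1, and so on). The thread confirms neither. The placeholder array uses 8 samples instead of the reported 3977 to keep it small.

```python
import numpy as np

# Placeholder data with the per-sample shape reported in the thread: (250, 90).
data = np.arange(8 * 250 * 90, dtype=np.float32).reshape(8, 250, 90)

n_antennas, n_subcarriers = 3, 30  # figures quoted from the paper
assert n_antennas * n_subcarriers == data.shape[-1]

# ASSUMPTION: antenna-major layout along the last axis (not confirmed).
per_antenna = data.reshape(data.shape[0], data.shape[1], n_antennas, n_subcarriers)
print(per_antenna.shape)
```

Under this assumed layout, `per_antenna[i, t, a, k]` corresponds to `data[i, t, a * 30 + k]`. If the file is instead subcarrier-major, the reshape would need `(n_subcarriers, n_antennas)` followed by a swap of the last two axes.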