cochaviz / iot_classifier

IoT device classification based on packet size

Optimize preprocessing #3

Open cochaviz opened 1 year ago

cochaviz commented 1 year ago

Slicing into time windows is very slow, as is calculating the mode/mean/median.

Mahira2010 commented 1 year ago

[+] Slicing the entire provided dataset now takes approximately 3 seconds. Note that slicing now happens at fixed intervals, so a window no longer starts at the first packet it contains. Anchoring each window to its first packet proved too time-consuming; the fixed-interval approach reduces the running time and does not harm functionality.

[+] The window_id is now stored as a new column (it is no longer passed as a parameter).

[+] Calculating the statistical measures for the entire provided dataset now takes approximately 11 minutes. Here, the statistical measures are computed per MAC address for each window.

[+] Additionally, the method mode_mean_med_without_window has been added to compute the statistical measures for the entire provided dataset per MAC address (ignoring the window).

[+] The statistical results are saved in a generated CSV file

However, both calculation methods of the statistical measures do not seem to reproduce the results obtained by Pinheiro, Bezerra et al. Further investigation is required.
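
For readers following along, here is a minimal sketch of the two steps described above: assigning a window_id at fixed intervals, then computing mode/mean/median per MAC address and window. It is not the repository's code. It uses only the Python standard library, and the field names (timestamp, mac, size), the 60-second window length, and both function names are assumptions made for illustration.

```python
import math
import statistics
from collections import defaultdict

WINDOW_SECONDS = 60.0  # assumed window length, not taken from the repository


def assign_window_ids(packets, window_seconds=WINDOW_SECONDS):
    """Return copies of the packets with a window_id field added.

    Windows are fixed intervals aligned to timestamp 0, so a window does
    not necessarily start at the first packet it contains.
    """
    return [
        {**p, "window_id": math.floor(p["timestamp"] / window_seconds)}
        for p in packets
    ]


def stats_per_mac_and_window(packets):
    """Compute mode, mean and median of packet size per (mac, window_id)."""
    sizes = defaultdict(list)
    for p in packets:
        sizes[(p["mac"], p["window_id"])].append(p["size"])
    return {
        key: {
            "mode": statistics.mode(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for key, values in sizes.items()
    }


# Made-up sample packets (timestamps in seconds, sizes in bytes).
packets = [
    {"timestamp": 0.5, "mac": "aa:aa", "size": 60},
    {"timestamp": 10.0, "mac": "aa:aa", "size": 60},
    {"timestamp": 59.9, "mac": "aa:aa", "size": 120},
    {"timestamp": 61.0, "mac": "aa:aa", "size": 200},
    {"timestamp": 61.5, "mac": "bb:bb", "size": 90},
]

windowed = assign_window_ids(packets)
result = stats_per_mac_and_window(windowed)
```

Because the window_id is a single floor division per packet, no search for the first packet of each window is needed, which is consistent with the speed-up reported for slicing. Grouping only by mac instead of (mac, window_id) would give the variant that ignores the window.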