DADABox / revisiting-iot-device-identification

Data and code for TMA 2021 paper "Revisiting IoT Device Identification"
GNU General Public License v3.0

Query about the dataset #1

Open Qwaseem1 opened 1 year ago

Qwaseem1 commented 1 year ago

Dear Researchers,

I want to express my appreciation for your research paper and the corresponding GitHub code. Having thoroughly reviewed both, I am keenly interested in extending your work further. I have successfully downloaded the dataset you provided on Google Drive.

However, I've noticed a minor inconsistency between the dataset and the information provided in your paper. While the paper mentions the availability of data spanning 41 weeks, I found only a single day's worth of data in the Google Drive folder. Could you kindly clarify this disparity for me?

I am eager to build upon your research and contribute to its progression. Thank you for your time and consideration.

romankolcun commented 1 year ago

Hello,

Could you please point me directly to the folder that contains only one day's worth of data?

Kind regards, Roman

Qwaseem1 commented 1 year ago

Dear Respected Researcher,

Thank you very much for your response.

I have examined all the files you uploaded to Google Drive. In the file "features_nov-apr\features_nov-apr.csv", the time_start column shows the date 2019-12-16 in every row. Similarly, in the file "stats_nov-apr\stats_nov_apr.csv", the time column contains only the date 2019-11-02. In other words, each of these files appears to contain data for a single date.
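One way to double-check an observation like this is to count the distinct dates in the column directly from the CSV. The following is only a minimal sketch: the helper name `count_dates` is hypothetical, and it assumes the column holds strings that begin with an ISO date (for example "2019-12-16 08:00:01"), which may not match the actual format of the files.

```python
import csv
from collections import Counter


def count_dates(csv_path, column):
    """Return a Counter mapping each calendar date found in `column` to its row count."""
    counts = Counter()
    with open(csv_path, newline="") as handle:
        # Read row by row so that large files do not need to fit in memory.
        for row in csv.DictReader(handle):
            value = (row.get(column) or "").strip()
            if value:
                # Assumption: the value starts with YYYY-MM-DD; keep only that part.
                counts[value[:10]] += 1
    return counts
```

For example, `count_dates("features_nov-apr/features_nov-apr.csv", "time_start")` would return a single key if the column really holds only one date, and one key per day otherwise.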

I also checked the "unsw_feature" folder, which contains complete .pcap files covering the entire 6 months. I am a bit confused about whether the data in the "unsw_feature" folder comes from your own network or from the UNSW dataset. Could you kindly provide some clarification on this matter?

Furthermore, I would like to know whether your datasets vary along any dimension other than time. Do the ua datafiles include any data that was collected after changing the network? I am particularly interested in identifying variations in your dataset that result from network modifications.

I would highly appreciate your assistance with these points, as it would greatly help me understand your datasets. Thank you for your guidance.

Kind Regards,