munhouiani / Deep-Packet

PyTorch implementation of Deep Packet: a novel approach for encrypted traffic classification using deep learning
MIT License

Question about data labels #25

Closed JieJayCao closed 2 years ago

JieJayCao commented 2 years ago

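Hello, and thank you very much for reproducing Deep Packet. While using your preprocess.py, I noticed that when the traffic data is labeled by application (app), the labels cover only the Non-VPN captures, for example: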

AIM chat

'aim_chat_3a': 0,
'aim_chat_3b': 0,
'aimchat1': 0,
'aimchat2': 0,

However, the original ISCXVPN2016 dataset also contains captures such as vpn_aim_chat. Is this VPN data deliberately left out of the app classification?

I look forward to your reply.

munhouiani commented 2 years ago

Hi,

When processing the data, I tried to follow the paper as closely as possible. Section 4.2.1, Labeling Dataset, says:

For application identification, all pcap files labelled as a particular application which were collected during a nonVPN session, are aggregated into a single file.

So I did not include the VPN captures in the dataset for application classification.
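
To make the labeling rule concrete, here is a minimal sketch of how a capture file name could be mapped to an application label while VPN captures are skipped. It is not the repository's actual code: the function name `app_label_for` and the `vpn_` prefix check are assumptions for illustration, and only the four AIM chat entries quoted in the question come from this thread.

```python
from pathlib import Path
from typing import Optional

# Only the four AIM chat entries quoted in the question are taken from the
# thread; a real table would cover every application in the dataset.
PREFIX_TO_APP_ID = {
    "aim_chat_3a": 0,
    "aim_chat_3b": 0,
    "aimchat1": 0,
    "aimchat2": 0,
}


def app_label_for(pcap_name: str) -> Optional[int]:
    """Return the application label for a capture file, or None if it is skipped.

    Hypothetical helper: assumes VPN captures are recognizable by a "vpn_"
    file-name prefix, as in the vpn_aim_chat example from the question.
    """
    prefix = Path(pcap_name).stem.lower()
    # Captures recorded during a VPN session get no application label.
    if prefix.startswith("vpn_"):
        return None
    # Unknown prefixes also yield None instead of raising KeyError.
    return PREFIX_TO_APP_ID.get(prefix)
```

The explicit `vpn_` check is redundant when the table has no VPN keys, but it keeps the exclusion visible in the code. Under these assumptions, `app_label_for("aim_chat_3a.pcap")` yields 0 and any `vpn_`-prefixed file yields `None`.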