Gene Fingerprinting (GF) is a Website Fingerprinting (WF) attack that makes use of the timing and size of packets to generate fingerprints that reflect the intrinsic profile of a website. GF leverages Zero-shot Learning (ZSL) to reduce the effort to collect data from different tunnels and achieve a real open-world.
The repository contains the code for the GF attack, our implementations of known WF attacks, the the dataset for closed- and open-world evaluation.
@article{LiGZ22,
author = {Ding Li and
Chunxiang Gu and
Yuefei Zhu},
title = {Gene Fingerprinting: Cracking Encrypted Tunnel with Zero-Shot Learning},
journal = {{IEICE} Trans. Inf. Syst.},
volume = {105},
number = {6},
pages = {1172--1184},
year = {2022},
doi = {10.1587/transinf.2021edp7179},
}
:warning: Note: The code and dataset are for NON-COMMERCIAL RESEARCH USE ONLY!
We publish the dataset of web traffic traces used for attack evaluation. The dataset is available here:
The dataset contains traffic samples used for closed- and open-world evaluation. This includes the list of monitored and unmonitored sites, and the packet traces we collected while visiting those sites over different tunnels.
The dataset consists of the following seven parts: | No. | Name | Type | Sites | Visits | Tunnel | Purpose | Scenario |
---|---|---|---|---|---|---|---|---|
1 | MS-ETH | Monitored | 100 | 10 | Ethernet | Training | Closed- & Open-world | |
2 | MS-SSH | Monitored | 100 | 10 | OpenSSH | Test | Closed- & Open-world | |
3 | MS-SS | Monitored | 100 | 10 | Shadowsocks | Test | Closed- & Open-world | |
4 | MS-VPN | Monitored | 100 | 10 | OpenVPN | Test | Closed- & Open-world | |
5 | UMS-SSH | Unmonitored | 1709 | 1 | OpenSSH | Test | Open-world | |
6 | UMS-SS | Unmonitored | 1709 | 1 | Shadowsocks | Test | Open-world | |
7 | UMS-VPN | Unmonitored | 1709 | 1 | OpenVPN | Test | Open-world |
The following table lists the tunnel settings for capturing web traffic. | Client | Proxy | Ciphersuite | |
---|---|---|---|---|
OpenSSH | windows_8.1p1 | 8.2p1 | chacha20-poly1305 | |
Shadowsocks | windows 4.4.0 | libev 3.3.4 | chacha20-ietf-poly1305 | |
OpenVPN | connect-3.2.3 | as_2.8.7 | AES-256-GCM |
Each part of the dataset consists of traffic samples in CSV format, with the file name:
site-<y>-sample-<i>.csv -- for monitored sites
site-<y>.csv -- for unmonitored sites
<y> : website number
<i> : sample number
Each file is a web traffic trace captured during a visit, and each line in the file contains a packet's timestamp, direction, and payload size. E.g.
timing,direction,size
0.0,out,0
0.00662994384765625,in,0
0.0066950321197509766,out,0
0.0073430538177490234,out,517
0.013636112213134766,in,0
0.03990316390991211,in,1460
0.04000711441040039,in,1460
0.04002714157104492,out,0
Before feature extraction, please put the downloaded dataset in the data
folder.
To extract features for all WF attacks, use the following command without arguments:
> python src\extraction.py
Or, extract features of a particular dataset for a certain attack.
Use the argument -h
to get the help message:
> python src\extraction.py -h
usage: extraction [-h] [-d DATASET] attack
Website Fingerprinting Feature Extraction
positional arguments:
attack attack, can be one of GF, CUMUL, DF, DDTW.
optional arguments:
-h, --help show this help message and exit
-d DATASET traffic dataset, can be one of MS-ETH, MS-SSH, MS-SS, MS-VPN,
UMS-SSH, UMS-SS, UMS-VPN.
To save time on feature extraction, download the extracted features and corresponding labels from here:
It contains two different files:
X_<dataset>_<attack>.pkl -- feature vectors
y_<dataset>.pkl -- labels
<dataset> : One of the seven different datasets: MS-ETH/SSH/SS/VPN and UMS-SSH/SS/VPN.
<attack> : One of the attacks: GF, CUMUL, DF, and DDTW.
Likewise, please put these files in the data
folder.
To run the DDTW attack, please properly install the following Python library first:
To evaluate the GF attack in the closed-world, use the following command:
> python src\evaluation.py GF
Closed-World Evaluation on GF
Training ETH
Electing: 100%|█████████████████████████████████████████████████| 100/100 [00:00<00:00, 133.44it/s]
Testing SSH
Predict: 100%|█████████████████████████████████████████████████| 1000/1000 [00:15<00:00, 63.43it/s]
Accuracy: 97.10 %
Testing SS
Predict: 100%|█████████████████████████████████████████████████| 1000/1000 [00:20<00:00, 48.24it/s]
Accuracy: 94.10 %
Testing VPN
Predict: 100%|█████████████████████████████████████████████████| 1000/1000 [00:16<00:00, 59.32it/s]
Accuracy: 95.20 %
To evaluate all WF attacks in the closed-world, simply use the following command without arguments:
> python src\evaluation.py
Or, evaluate a particular attack using its own command. E.g., the help message for the command of GF closed-world evaluation:
> python src\ClosedWorld_GF.py -h
usage: ClosedWorld_GF [-h] [-e [1 2 ... 10]] [-k [1 3 5]] [-w [0.1 0.2 ... 1]]
Evaluation program of Gene Fingerprinting Attack in Closed-World
optional arguments:
-h, --help show this help message and exit
-e [1 2 ... 10] number of exemplars.
-k [1 3 5] k-Nearest Neighbor.
-w [0.1 0.2 ... 1] window scale for warping path.
To evaluate the GF attack in the open-world, use the following command:
> python src\evaluation.py GF -o
Open-World Evaluation on GF
Training ETH
Distance: 100%|█████████████████████████████████████████████████| 1000/1000 [00:15<00:00, 66.55it/s]
Testing SSH
Distance: 100%|█████████████████████████████████████████████████| 2709/2709 [00:52<00:00, 51.14it/s]
TPR: 88.70 % FNR: 11.30 % FPR: 9.25 % Precision: 84.88 %
Testing SS
Distance: 100%|█████████████████████████████████████████████████| 2709/2709 [01:05<00:00, 41.47it/s]
TPR: 80.90 % FNR: 19.10 % FPR: 7.67 % Precision: 86.06 %
Testing VPN
Distance: 100%|█████████████████████████████████████████████████| 2709/2709 [00:52<00:00, 51.61it/s]
TPR: 89.50 % FNR: 10.50 % FPR: 10.53 % Precision: 83.26 %
To evaluate all WF attacks in the open-world, simply use the following command:
> python src\evaluation.py -o
Or, evaluate a particular attack using its own command. E.g., the help message for the command of GF open-world evaluation:
> python src\OpenWorld_GF.py -h
usage: OpenWorld_GF [-h] [-e [1 2 ... 10]] [-w [0.1 0.2 ... 1]]
[-q [0 0.05 ... 1]]
Evaluation program of Gene Fingerprinting Attack in Open-World
optional arguments:
-h, --help show this help message and exit
-e [1 2 ... 10] number of exemplars.
-w [0.1 0.2 ... 1] window scale for warping path.
-q [0 0.05 ... 1] quantile for threshold.