lindsey98 / Phishpedia

Official Implementation of "Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages" USENIX'21
Creative Commons Zero v1.0 Universal
computer-vision cybersecurity phishing-detection

Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages

![Dialogues](https://img.shields.io/badge/Proctected\_Brands\_Size-277-green?style=flat-square) ![Dialogues](https://img.shields.io/badge/Phishing\_Benchmark\_Size-30k-green?style=flat-square)


Framework

- Input: A URL and its screenshot
- Output: Phish/Benign label, plus the phishing target (the impersonated brand)

Project structure

:pushpin: Move everything under expand_targetlist/expand_targetlist up one level into expand_targetlist/ so that there are no nested directories.
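The move described in the note above can be scripted. The following is a minimal sketch, not part of the official repository: the helper name `flatten_nested_dir` is hypothetical, and the path `models/expand_targetlist` in the usage line is an assumption read off the project tree.

```python
import shutil
from pathlib import Path


def flatten_nested_dir(outer: Path) -> int:
    """Move the contents of outer/<outer.name>/ up into outer/.

    Returns the number of entries moved (0 if there is no nested folder).
    The helper name is hypothetical; it is not part of Phishpedia.
    """
    inner = outer / outer.name
    if not inner.is_dir():
        return 0  # nothing nested, nothing to do

    moved = 0
    for entry in sorted(inner.iterdir()):
        target = outer / entry.name
        if target.exists():
            # Refuse to overwrite; let the user resolve the clash by hand.
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(entry), str(target))
        moved += 1

    inner.rmdir()  # the inner folder is empty at this point
    return moved


if __name__ == "__main__":
    # Assumed location, taken from the project tree.
    count = flatten_nested_dir(Path("models/expand_targetlist"))
    print(f"moved {count} entries")
```

Running it a second time is harmless: once the nested folder is gone, the function returns 0 without touching anything.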

- models/
|___ rcnn_bet365.pth
|___ faster_rcnn.yaml
|___ resnetv2_rgb_new.pth.tar
|___ expand_targetlist/
  |___ Adobe/
  |___ Amazon/
  |___ ......
|___ domain_map.pkl
- logo_recog.py: Deep Object Detection Model
- logo_matching.py: Deep Siamese Model 
- configs.yaml: Configuration file
- phishpedia.py: Main script
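Before running the main script, it may help to confirm that the layout matches the tree above. This is a rough sketch under the assumption that every path is relative to the repository root exactly as drawn in the tree; `missing_paths` is a hypothetical helper, not something the repository ships.

```python
from pathlib import Path
from typing import List

# Paths copied from the project tree, relative to the repository root.
EXPECTED_FILES = [
    "models/rcnn_bet365.pth",
    "models/faster_rcnn.yaml",
    "models/resnetv2_rgb_new.pth.tar",
    "models/domain_map.pkl",
    "logo_recog.py",
    "logo_matching.py",
    "configs.yaml",
    "phishpedia.py",
]
EXPECTED_DIRS = ["models/expand_targetlist"]


def missing_paths(repo_root: Path) -> List[str]:
    """Return the expected files and folders that are absent under repo_root."""
    missing = [p for p in EXPECTED_FILES if not (repo_root / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (repo_root / p).is_dir()]
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    for path in missing_paths(Path(".")):
        print(f"missing: {path}")
```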

Instructions

Requirements:

  1. Create a local clone of Phishpedia

    git clone https://github.com/lindsey98/Phishpedia.git
  2. Set up the phishpedia conda environment. This step installs Phishpedia's core dependencies, such as PyTorch and detectron2, and downloads the model checkpoints and the brand reference list. It may take some time.

    chmod +x ./setup.sh
    export ENV_NAME="phishpedia" 
    ./setup.sh
  3. Activate the environment:

    conda activate phishpedia
  4. Run Phishpedia in bash:

    python phishpedia.py --folder <folder you want to test e.g. ./datasets/test_sites>

The test folder should have the following structure:

test_site_1
|__ info.txt (Write the URL)
|__ shot.png (Save the screenshot)
test_site_2
|__ info.txt (Write the URL)
|__ shot.png (Save the screenshot)
......
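The layout above can be produced programmatically when the URLs and screenshots already exist. The sketch below is illustrative only: `add_test_site` and `incomplete_sites` are hypothetical helpers, and the code assumes each screenshot is already a PNG file, since it is copied as-is rather than converted.

```python
import shutil
from pathlib import Path
from typing import List


def add_test_site(folder: Path, site_name: str, url: str, screenshot: Path) -> Path:
    """Create folder/site_name/ containing info.txt (the URL) and shot.png."""
    site_dir = folder / site_name
    site_dir.mkdir(parents=True, exist_ok=True)
    # Write only the URL, with surrounding whitespace removed.
    (site_dir / "info.txt").write_text(url.strip(), encoding="utf-8")
    # Copy the screenshot unchanged; it is assumed to be a PNG already.
    shutil.copyfile(screenshot, site_dir / "shot.png")
    return site_dir


def incomplete_sites(folder: Path) -> List[str]:
    """Return names of site folders that lack info.txt or shot.png."""
    return sorted(
        d.name
        for d in folder.iterdir()
        if d.is_dir()
        and not ((d / "info.txt").is_file() and (d / "shot.png").is_file())
    )


# Example usage (paths and URL are placeholders):
# add_test_site(Path("datasets/test_sites"), "test_site_1",
#               "https://example.com/login", Path("capture.png"))
# print(incomplete_sites(Path("datasets/test_sites")))
```

Checking for incomplete folders before invoking `phishpedia.py` makes it easier to tell a data problem apart from a model or environment problem.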

Miscellaneous

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{lin2021phishpedia,
  title={Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages},
  author={Lin, Yun and Liu, Ruofan and Divakaran, Dinil Mon and Ng, Jun Yang and Chan, Qing Zhou and Lu, Yiwen and Si, Yuxuan and Zhang, Fan and Dong, Jin Song},
  booktitle={30th $\{$USENIX$\}$ Security Symposium ($\{$USENIX$\}$ Security 21)},
  year={2021}
}

Contacts

If you have any issues running our code, you can open an issue on GitHub or send an email to liu.ruofan16@u.nus.edu, lin_yun@sjtu.edu.cn, and dcsdjs@nus.edu.sg.