JehLeeKR / Multimodal_LLM_Phishing_Detection

Multimodal Large Language Models for Phishing Webpage Detection and Identification: Code repository

NOTE: The pre-processed benign and phishing webpage dataset has been published.
The dataset includes text features from HTML and screenshots.

Thank you for your patience.
2024 Sep 18


Ethical Statement: This work is done for research purposes; the phishing webpages and logos should not be used for any unethical or illegal purpose.


Research Paper

Lee, J., Lim, P., Hooi, B., Divakaran, D. M. (2024, September). Multimodal Large Language Models for Phishing Webpage Detection and Identification. To appear in the Symposium on Electronic Crime Research (eCrime) 2024.


Dataset

Webpage dataset

Benign webpage samples: ./data/benign/
Phishing webpage samples: ./data/phishing/
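
As a quick sanity check after downloading, the two directories above can be enumerated and labeled. This is only a minimal sketch: the README does not describe how samples are stored inside each directory, so the code assumes one sample per top-level entry. The label encoding (0 for benign, 1 for phishing) and the helper name list_samples are hypothetical and not part of this repository.

```python
from pathlib import Path

# Directory names come from this README; everything else is an assumption.
DATA_ROOT = Path("./data")
LABEL_DIRS = {"benign": 0, "phishing": 1}  # hypothetical label encoding


def list_samples(data_root=DATA_ROOT):
    """Return (path, label) pairs, one per entry directly under each class directory."""
    samples = []
    for name, label in LABEL_DIRS.items():
        class_dir = Path(data_root) / name
        if not class_dir.is_dir():
            continue  # dataset not downloaded yet, or the layout differs
        for entry in sorted(class_dir.iterdir()):
            if entry.name.startswith("."):
                continue  # skip hidden files such as .DS_Store
            samples.append((entry, label))
    return samples


if __name__ == "__main__":
    samples = list_samples()
    n_phishing = sum(label for _, label in samples)
    print(f"benign: {len(samples) - n_phishing}, phishing: {n_phishing}")
```

If the actual layout nests samples more deeply, the iteration would need to be adjusted accordingly.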

APW (Adversarial Phishing Webpages) dataset

https://github.com/hihey54/www24_threatAdvPhish

Adversarial Logo dataset

Please contact jehyun.lee (at) trustwave.com