NOTE: The pre-processed benign and phishing webpage dataset has been published.
The dataset includes the text features from HTML and screenshots.
Thank you for your patience.
2024 Sep 18
Ethical Statement: This work is done for research purposes; the phishing webpages and logos should not be used for any unethical or illegal purpose.
Lee, J., Lim, P., Hooi, B., Divakaran, D. M. (2024, September). Multimodal Large Language Models for Phishing Webpage Detection and Identification. To appear in the Symposium on Electronic Crime Research (eCrime) 2024.
Benign webpage samples: ./data/benign/
Phishing webpage samples: ./data/phishing/
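As a quick check that the samples are in place, the two directories above can be enumerated with a few lines of Python. This is only a minimal sketch: the helper name list_samples is hypothetical, and it assumes each immediate child of a directory is one sample, which this README does not specify.

```python
from pathlib import Path

# Directory names are taken from this README; the layout beneath them
# (one entry per sample) is an assumption, not documented here.
SAMPLE_DIRS = {"benign": "data/benign", "phishing": "data/phishing"}


def list_samples(repo_root="."):
    """Return {label: sorted entry names} for each sample directory.

    A directory that does not exist yields an empty list.
    """
    root = Path(repo_root)
    samples = {}
    for label, rel in SAMPLE_DIRS.items():
        folder = root / rel
        if folder.is_dir():
            samples[label] = sorted(p.name for p in folder.iterdir())
        else:
            samples[label] = []
    return samples


if __name__ == "__main__":
    # Print how many entries each directory holds.
    for label, names in list_samples().items():
        print(f"{label}: {len(names)} entries")
```

Run it from the repository root, or pass the path to the repository as repo_root.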
https://github.com/hihey54/www24_threatAdvPhish
Please contact jehyun.lee (at) trustwave.com