lindsey98 / PhishIntention

PhishIntention: Phishing detection through webpage intention
MIT License

About OCR-aided Siamese Model #5

Open lindsey98 opened 2 years ago

lindsey98 commented 2 years ago

During training, we train the model on logo images as a 277-brand classification task. At test time, we discard the classification head and use the output of the second-to-last layer as the logo embedding. The detailed architecture, without the classification head, is shown in Figure 7.
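To make the train/test split concrete, here is a minimal sketch in plain Python. It is not the PhishIntention implementation: `ToyLogoNet`, the layer sizes, and the toy input vector are all hypothetical, the real model is a CNN over logo images, and the optimizer step is omitted. It only shows that the classification head is used to compute the training loss, while matching at test time reads the second-to-last layer and compares embeddings (cosine similarity is assumed here as the comparison).

```python
import math
import random

NUM_BRANDS = 3   # the comment above describes 277 brands; 3 keeps the toy small
INPUT_DIM = 4    # hypothetical feature size standing in for a logo image
EMBED_DIM = 2    # hypothetical embedding size


def matvec(weights, vec):
    """Multiply a row-major weight matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


class ToyLogoNet:
    """Hypothetical stand-in: an embedding layer followed by a classification head."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.backbone = [[rng.uniform(-1, 1) for _ in range(INPUT_DIM)]
                         for _ in range(EMBED_DIM)]
        self.head = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
                     for _ in range(NUM_BRANDS)]

    def embed(self, x):
        # Second-to-last layer output: the logo embedding used at test time.
        return [math.tanh(v) for v in matvec(self.backbone, x)]

    def logits(self, x):
        # Training only: the head maps the embedding to per-brand scores.
        return matvec(self.head, self.embed(x))


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (numerically stabilized)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


model = ToyLogoNet()
logo = [1.0, 0.5, -0.5, 0.2]  # toy "logo" features

# Training: the loss is computed from the classification head.
train_loss = cross_entropy(model.logits(logo), label=1)

# Testing: the head is ignored; two logos are compared via their embeddings.
embedding = model.embed(logo)
self_similarity = cosine_similarity(embedding, model.embed(logo))
```

Because `embed` never touches `self.head`, the head can be dropped after training without changing the embeddings.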

The conventional way to train a Siamese model is with a pairwise or triplet loss. However, that approach suffers from high computational cost and unstable convergence. Recent studies [1][2][3] have shown that classification-based losses generalize better than the traditional pairwise/triplet losses. That is why we train the Siamese model with a classification loss.
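One way to see where the computational cost comes from is to count training units. The sketch below is only an illustration under assumed numbers: the helper `training_units` is hypothetical, and 10 images per brand is a made-up figure, since the dataset size is not stated here. A classification loss visits each image once per epoch, whereas the number of candidate pairs grows quadratically and the number of valid (anchor, positive, negative) triplets grows faster still.

```python
from math import comb


def training_units(images_per_brand):
    """Count candidate training units for a list of per-brand image counts."""
    total = sum(images_per_brand)
    # Ordered (anchor, positive) choices within a brand, times every
    # image of a different brand as the negative.
    triplets = sum(n * (n - 1) * (total - n) for n in images_per_brand)
    return {
        "classification_samples": total,   # one unit per image
        "pairs": comb(total, 2),           # every unordered image pair
        "triplets": triplets,
    }


# Hypothetical dataset: 277 brands with 10 logo images each.
counts = training_units([10] * 277)
for name, value in counts.items():
    print(f"{name}: {value:,}")
```

In practice, pair and triplet methods sample or mine a subset rather than enumerating every combination, but choosing informative pairs or triplets is itself extra work that a classification loss does not need.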