lindsey98 / Phishpedia

Official Implementation of "Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages" USENIX'21
Creative Commons Zero v1.0 Universal

logo_feat_list= (0,) -> failing to test with shot.png #33

Closed ClovisDyArx closed 1 month ago

ClovisDyArx commented 1 month ago

Hey there,

I am trying to test the model on a single example. I provided the URL of the phishing website in html.txt and a screenshot in shot.png.

However, I get this error when testing with the screenshot (python phishpedia.py --folder datasets/test_sites):

  File ".../Phishpedia/logo_matching.py", line 179, in pred_brand
    sim_list = logo_feat_list @ img_feat.T  # take dot product for every pair of embeddings (Cosine Similarity)
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2048 is different from 0)

I printed logo_feat_list and its shape is (0,).
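
For reference, the same ValueError can be reproduced in isolation with NumPy. This is a minimal sketch, not code from the repository: it assumes the image embedding is a single 2048-dimensional vector (the size is taken from the error message), and check_reference_embeddings is a hypothetical helper that turns the situation into a more readable error.

```python
import numpy as np


def check_reference_embeddings(logo_feat_list: np.ndarray) -> None:
    """Hypothetical guard: fail early if no reference logo embeddings were loaded."""
    if logo_feat_list.ndim != 2 or logo_feat_list.shape[0] == 0:
        raise RuntimeError(
            f"No reference logo embeddings loaded (shape {logo_feat_list.shape}); "
            "check the target list directory layout and any cached feature files."
        )


# An empty reference list has shape (0,), as printed in the report above.
empty_feats = np.zeros((0,))
# Assumption: one 2048-dimensional embedding for the screenshot's logo.
img_feat = np.zeros((2048,))

try:
    empty_feats @ img_feat.T  # inner dimensions are 0 and 2048, so matmul fails
    matmul_error = ""
except ValueError as exc:
    matmul_error = str(exc)

print(matmul_error)
```

The mismatch comes from the empty left operand, so the thing to investigate is why no reference embeddings were loaded rather than the screenshot itself.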

I am not sure what I am missing here. I have already fixed a lot of issues to build the project, but I don't really know what to do about this one.

Thanks for the help

lindsey98 commented 1 month ago

Hi, you need to move everything under expand_targetlist/expand_targetlist to expand_targetlist/, so that there are no nested directories.
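
The move described above could be scripted roughly as follows. This is only a sketch using the standard library: flatten_nested_dir is a hypothetical helper, not part of Phishpedia, and it assumes the duplicated folder has the same name as its parent.

```python
import shutil
from pathlib import Path


def flatten_nested_dir(parent):
    """Move the contents of parent/<parent name>/ up into parent/ and remove the empty folder.

    Returns the number of entries moved (0 if there is no nested folder).
    """
    parent = Path(parent)
    nested = parent / parent.name  # e.g. expand_targetlist/expand_targetlist
    if not nested.is_dir():
        return 0

    moved = 0
    for child in list(nested.iterdir()):
        target = parent / child.name
        if target.exists():
            # Do not overwrite anything that is already in place.
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(child), str(target))
        moved += 1

    nested.rmdir()  # only succeeds once the nested folder is empty
    return moved
```

For the layout in this thread, the call would be flatten_nested_dir("models/expand_targetlist"), run from the directory that contains models/.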

ClovisDyArx commented 1 month ago

I did as you said:

models/expand_targetlist/expand_targetlist/ => models/expand_targetlist/

The tree command (-d) displays this:

models
└── expand_targetlist
    ├── 1&1 Ionos
    ├── Absa Group
    .....

Here is my datasets folder, in case I am doing something wrong:

datasets/test_sites
└── telegram

But I still have the same issue:

Traceback (most recent call last):
  File "phishpedia.py", line 175, in <module>
    logo_recog_time, logo_match_time = phishpedia_cls.test_orig_phishpedia(url, screenshot_path, html_path)
  File "phishpedia.py", line 95, in test_orig_phishpedia
    pred_target, matched_domain, matched_coord, siamese_conf = check_domain_brand_inconsistency(
  File "/home/clovinux/Desktop/ing3/pfee/2024-pfe-phishing/backend_ai/Phishpedia/logo_matching.py", line 38, in check_domain_brand_inconsistency
    matched_target, matched_domain, this_conf = pred_brand(model, domain_map,
  File "/home/clovinux/Desktop/ing3/pfee/2024-pfe-phishing/backend_ai/Phishpedia/logo_matching.py", line 179, in pred_brand
    sim_list = logo_feat_list @ img_feat.T  # take dot product for every pair of embeddings (Cosine Similarity)
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2048 is different from 0)

ClovisDyArx commented 1 month ago

Okay, the issue is fixed now.

The problem was that, after moving the folders, you also need to delete LOGO_FEATS.npy and LOGO_FILES.npy so that they are "reloaded" properly.
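
Deleting the two files can be done by hand or with a few lines of Python. The sketch below is an illustration only: remove_logo_cache is a hypothetical helper, and the directory holding the .npy files is passed in as an argument because the thread does not say where they are stored.

```python
from pathlib import Path

# File names as given in the thread.
CACHE_FILES = ("LOGO_FEATS.npy", "LOGO_FILES.npy")


def remove_logo_cache(cache_dir):
    """Delete the logo feature files in cache_dir if present; return the paths removed."""
    removed = []
    for name in CACHE_FILES:
        path = Path(cache_dir) / name
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed
```

After the files are removed, rerunning python phishpedia.py --folder datasets/test_sites is what resolved the error in this thread.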