Closed Fujiaoji closed 1 year ago
Hi Fujiao,
Yes, I tried to update the domain-matching function to include the top-level domain as well. Can you try replacing the old domain.pkl with the new one? https://drive.google.com/file/d/1nTIC6311dvdY4cGsrI4c3WMndSauuHSm/view?usp=sharing Thanks!
Best, Ruofan
Oh, got it. Thanks.
I looked at both domain.pkl files and found that the former one can produce this, while the new one contains many bytes under "cert_der". Is there a good way to open the new domain.pkl? And what does "cert_der" mean? I see the same code in Phishpedia for opening this file, so I am wondering whether there is updated code. Thanks
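For anyone else inspecting an unfamiliar pickle file like the one asked about above, here is a minimal sketch using only the standard library. The helper names `load_pickle` and `summarize` and the sample data are hypothetical; the real layout of domain.pkl may differ. Note that `pickle.load` can execute arbitrary code, so only load files from a source you trust.

```python
import os
import pickle
import tempfile


def load_pickle(path):
    """Load a pickle file. Only do this for files from a trusted source."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def summarize(obj, depth=0, max_depth=3, max_items=3):
    """Describe a nested object briefly, replacing raw bytes with their length."""
    if isinstance(obj, (bytes, bytearray)):
        return f"<{len(obj)} bytes>"
    if depth >= max_depth:
        # Too deep: report only the type name.
        return type(obj).__name__
    more = ", ..." if hasattr(obj, "__len__") and len(obj) > max_items else ""
    if isinstance(obj, dict):
        parts = [
            f"{key!r}: {summarize(value, depth + 1, max_depth, max_items)}"
            for key, value in list(obj.items())[:max_items]
        ]
        return "{" + ", ".join(parts) + more + "}"
    if isinstance(obj, (list, tuple)):
        parts = [summarize(item, depth + 1, max_depth, max_items) for item in obj[:max_items]]
        return "[" + ", ".join(parts) + more + "]"
    return repr(obj)


if __name__ == "__main__":
    # Hypothetical sample data, NOT the actual contents of domain.pkl.
    sample = {"examplebrand": {"domains": ["example"], "cert_der": b"\x30\x82" * 300}}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.pkl")
        with open(path, "wb") as handle:
            pickle.dump(sample, handle)
        print(summarize(load_pickle(path)))
```

Printing a summary instead of the raw object keeps long byte fields from flooding the terminal, which makes the overall structure easier to read.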
Hi Fujiao, sorry, I think the former one should be correct; I uploaded the wrong file.
I changed the code to check "matched_domain" against "tldextract.extract(url).domain" only. Thanks
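To illustrate why the two comparison styles discussed in this thread give different results, here is a rough sketch. The function `domain_matches` and its parameters are hypothetical and not taken from the Phishpedia code; `url_domain` and `url_suffix` stand in for the values of `tldextract.extract(url).domain` and `tldextract.extract(url).suffix`, and the list is assumed to store domains without a suffix.

```python
def domain_matches(url_domain, url_suffix, matched_domains, include_suffix=False):
    """Check whether the URL's domain appears in the list of known domains.

    url_domain / url_suffix are assumed to come from tldextract.extract(url).
    With include_suffix=True the candidate is "domain.suffix"; otherwise it is
    the bare domain.
    """
    candidate = f"{url_domain}.{url_suffix}" if include_suffix else url_domain
    return candidate in matched_domains


if __name__ == "__main__":
    known = ["google"]  # assumed: entries stored without ".com", ".cn", etc.
    print(domain_matches("google", "com", known))                       # bare-domain check
    print(domain_matches("google", "com", known, include_suffix=True))  # domain + suffix check
```

Under that assumption, the bare-domain check finds a match, while the domain-plus-suffix check does not, which is the mismatch reported in the original question.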
Okay, got it. I would like to double-check: PhishIntention and Phishpedia use the same maintained domain.pkl, is that right? Thanks
Hi Fujiao, yes, they do.
Okay. That helps me a lot. Thanks
Hi,
I hope you are doing well.
I have run your Phishpedia code, but I now see that you have updated it.
When I check the "phishpedia_classifier_logo" function and run the code on some benign websites, I find that "matched_domain" never includes ".com", ".cn", etc., while "tldextract.extract(url).domain + '.' + tldextract.extract(url).suffix" does include this information. So I am wondering whether there is a bug or whether domain.pkl also needs to be updated. I am not sure; I just wanted to raise the question.
Besides, I want to make sure my understanding is right. In the "phishpedia_classifier_logo" function, if the predicted target brand is not None and the domain is in the maintained domain list, then the site should be benign and the output pred_target should be None rather than the real brand?
Thanks
Best
Fujiao
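The understanding described in the last question can be written down as a small sketch. This only encodes the behavior as the question states it; the thread does not confirm it, and `resolve_target`, `domain_map`, and the sample entries are hypothetical names, not the actual Phishpedia implementation.

```python
def resolve_target(pred_brand, url_domain, domain_map):
    """Return the reported target brand under the behavior described in the question.

    pred_brand: brand predicted from the logo, or None if no brand matched.
    url_domain: domain of the visited URL (assumed to be the bare domain).
    domain_map: assumed mapping from brand name to its list of legitimate domains.
    """
    if pred_brand is None:
        return None  # no brand recognized, nothing to report
    if url_domain in domain_map.get(pred_brand, []):
        return None  # domain is consistent with the brand, treated as benign
    return pred_brand  # brand logo on an unrelated domain, reported as the target


if __name__ == "__main__":
    domain_map = {"Google": ["google"]}  # hypothetical entry
    print(resolve_target("Google", "google", domain_map))
    print(resolve_target("Google", "g00gle-login", domain_map))
```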