Not matching testing results

Fujiaoji commented 11 months ago

Hi, hope you are doing well.

When I test with your dataset, the results do not match the paper. I used the pipeline_eval code from the Phishpedia and PhishIntention projects. On the benign samples, 190/25400 are predicted as phishing by PhishIntention and 354/25400 by Phishpedia. On the phishing samples, 2652/25403 are predicted as benign by PhishIntention and 2135/25403 by Phishpedia. So, in my testing, Phishpedia's TPR and accuracy are higher than PhishIntention's. I am wondering whether there are factors that could influence the results, since they do not match those in the paper. Thanks
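
For reference, the rates implied by these counts can be worked out with a short script. This is only a sketch: the helper name `rates` is hypothetical and not part of either project, and the inputs are just the counts quoted above.

```python
def rates(false_pos, n_benign, false_neg, n_phish):
    """Return (TPR, FPR, accuracy) from misclassification counts."""
    true_pos = n_phish - false_neg    # phishing samples correctly flagged
    true_neg = n_benign - false_pos   # benign samples correctly passed
    tpr = true_pos / n_phish
    fpr = false_pos / n_benign
    acc = (true_pos + true_neg) / (n_phish + n_benign)
    return tpr, fpr, acc

# Counts as reported in this comment (default thresholds).
reported = {
    "PhishIntention": rates(190, 25400, 2652, 25403),
    "Phishpedia": rates(354, 25400, 2135, 25403),
}

for name, (tpr, fpr, acc) in reported.items():
    print(f"{name}: TPR={tpr:.4f} FPR={fpr:.4f} accuracy={acc:.4f}")
```

On these counts, Phishpedia has the higher TPR and accuracy, while PhishIntention has the lower FPR.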

lindsey98 commented 11 months ago

Hi Fujiao, may I check with you which Siamese threshold you used for Phishpedia and PhishIntention?

Fujiaoji commented 11 months ago

May I check with you what Siamese threshold did you use for phishpedia

Sure, I use 0.87 for PhishIntention and 0.83 for Phishpedia. I was not sure whether I needed to run all thresholds, so I just followed the default values. If there are no other reasons, I will run all the thresholds you mentioned in one of the issues to see the results.

As for Phishpedia, I think I used the initial domain.pkl rather than the updated one, and I ran Phishpedia using the pipeline_eval from the Phishpedia project rather than the pipeline_eval in PhishIntention.

Thanks

lindsey98 commented 11 months ago

Hi Fujiao, Can you try to use 0.83 for PhishIntention?

Fujiaoji commented 11 months ago

Hi Fujiao, Can you try to use 0.83 for PhishIntention?

Sure. Thanks. I will update the testing result.

Fujiaoji commented 11 months ago

Hi Fujiao, Can you try to use 0.83 for PhishIntention?

Hi, I have tested 0.83 and found that, for the phishing samples, PhishIntention predicts 2406 as benign while Phishpedia predicts 2135. The difference is small. I will test these thresholds [0.4, 0.5, 0.6, 0.7, 0.81, 0.83, 0.85, 0.87, 0.9, 0.93, 0.95, 0.97, 0.99, 0.9999] and check the TPR and FPR. Thanks
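
A sweep over those thresholds could look roughly like the sketch below. It assumes each sample's decision reduces to comparing a single similarity score against the threshold, which simplifies both pipelines (they include further steps beyond the Siamese match). The `sweep` function, its inputs, and the toy data are hypothetical and not taken from either project or from the dataset.

```python
THRESHOLDS = [0.4, 0.5, 0.6, 0.7, 0.81, 0.83, 0.85, 0.87,
              0.9, 0.93, 0.95, 0.97, 0.99, 0.9999]

def sweep(scores, labels, thresholds=THRESHOLDS):
    """Yield (threshold, TPR, FPR) for each threshold.

    scores: per-sample similarity scores (hypothetical input)
    labels: True for phishing samples, False for benign ones
    """
    n_phish = sum(labels)
    n_benign = len(labels) - n_phish
    for t in thresholds:
        # A sample is flagged as phishing when its score reaches the threshold.
        flagged = [s >= t for s in scores]
        tp = sum(f and y for f, y in zip(flagged, labels))
        fp = sum(f and not y for f, y in zip(flagged, labels))
        yield t, tp / n_phish, fp / n_benign

# Toy data for illustration only; not taken from the dataset.
toy_scores = [0.95, 0.88, 0.82, 0.60, 0.86, 0.30]
toy_labels = [True, True, True, True, False, False]

for t, tpr, fpr in sweep(toy_scores, toy_labels):
    print(f"threshold={t}: TPR={tpr:.2f} FPR={fpr:.2f}")
```

Under this simplification, lowering the threshold can only flag more samples, so TPR and FPR both rise or stay the same. That matches the direction of the change seen when moving PhishIntention from 0.87 to 0.83 (2652 down to 2406 missed phishing samples).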

lindsey98 / PhishIntention

Not matching testing results #20