Fundamental issue with information difference between user and backend

One fundamental issue with the general approach is the difference in information between the user's view of the page and the view the backend uses. Because of this difference, the backend can see something different from what the user sees, and thus reach a different decision about whether the site is phishing or not.

Some cases are easier to solve than others. For example, the views of the backend and the user differ if their screen resolutions differ (e.g. vertical screens, windows that are not full screen, mobile screens, higher resolution screens, etc). This could mean a logo or text is visible in one view, but not in the other. Given the current approach for logo detection (top 3 with the highest logo probability), an attacker with full knowledge of the system can build a targeted website that avoids detection by placing logos with higher confidence in a region that is only visible to the backend, but not to the user. This can be fixed by sending over the client resolution, so the backend uses that resolution when making its screenshot. It will require updating e.g. the logo detection, since the current dataset only has 1920x1080 AFAIK.
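To illustrate the resolution problem, here is a minimal sketch of top-3 selection that only considers candidates inside a given viewport. The `Detection` class, `is_visible`, `top_logos`, and all coordinates and probabilities are hypothetical and not taken from the zpd-server code. It also assumes the page layout does not change with resolution, which is not true for responsive pages; that is why re-rendering the screenshot at the client resolution is the more robust fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical logo candidate: bounding box in page pixels plus a logo probability."""
    x: int
    y: int
    width: int
    height: int
    prob: float


def is_visible(det: Detection, viewport_w: int, viewport_h: int) -> bool:
    """True if any part of the box overlaps a viewport anchored at (0, 0)."""
    return (
        det.x < viewport_w
        and det.y < viewport_h
        and det.x + det.width > 0
        and det.y + det.height > 0
    )


def top_logos(detections, viewport_w: int, viewport_h: int, k: int = 3):
    """Return the k most probable candidates that fall inside the viewport."""
    visible = [d for d in detections if is_visible(d, viewport_w, viewport_h)]
    return sorted(visible, key=lambda d: d.prob, reverse=True)[:k]


# The real brand logo sits at the top left, where every user sees it.
real_logo = Detection(x=40, y=20, width=200, height=60, prob=0.70)
# Three high-confidence decoys are placed where only a 1920x1080 render reaches.
decoys = [
    Detection(x=1500 + 100 * i, y=900, width=80, height=80, prob=0.95)
    for i in range(3)
]
detections = [real_logo, *decoys]

backend_view = top_logos(detections, 1920, 1080)  # backend's fixed resolution
client_view = top_logos(detections, 1366, 768)    # assumed client resolution
```

With the fixed 1920x1080 view the three decoys fill the top 3 and the real logo is never examined. With the smaller client viewport only the real logo remains.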

Another issue is that the webserver hosting the phishing site may simply serve a different webpage when the backend connects to it than when a user connects to it. This is trivial to pull off if the backend is a single server with a fixed IP address. Determining that address is easy: visit the webserver, make the backend run a phishing detection on it, and check the webserver logs for the connecting IP. The best fix is probably to have the client send a screenshot, which would also solve the resolution issue mentioned above.
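To show how little the attacker needs for this, here is a minimal sketch of the server-side decision. `BACKEND_IP`, `select_page`, and the page labels are hypothetical; the address comes from the RFC 5737 documentation range and stands in for whatever IP the attacker found in their logs.

```python
# Hypothetical backend address (RFC 5737 documentation range), standing in
# for the IP the attacker observed in their own webserver logs.
BACKEND_IP = "203.0.113.7"


def select_page(remote_addr: str, backend_ip: str = BACKEND_IP) -> str:
    """Pick which page to serve based only on the connecting address."""
    if remote_addr == backend_ip:
        return "benign"    # what the detection backend gets to screenshot
    return "phishing"      # what every real visitor gets


backend_result = select_page("203.0.113.7")
visitor_result = select_page("198.51.100.23")
```

Since the decision depends only on the connecting address, no change to the backend's screenshot or analysis logic can catch it. Only a view captured on the client side reflects what the user was actually served.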

The last issue is more fundamental, as it stems solely from the fact that the backend uses a screenshot of a single page. For example, a phishing site may first show a page with full PayPal branding (logo, name and all), but no login form. Instead, it has a simple link button saying 'login', which takes the visitor to a page with a login form. This second page does not need the same degree of branding: the user has already made the connection to PayPal, so they know they are expected to fill in their PayPal login details. The backend, however, has not seen the previous page, so it has not seen any branding and will simply consider this a generic login page. Because the connection with PayPal is already established in the visitor's head, the login page can even show other generic branding like 'money account login' without raising too much suspicion from an unaware visitor (for many the lack of branding would alert them, I hope). This new generic branding can be made so that a reverse image search returns the phishing site itself, so the extension will classify the page as not phishing.
