google / safebrowsing

Safe Browsing API Go Client
Apache License 2.0

Stop database update #80

Closed davidTAU closed 6 years ago

davidTAU commented 6 years ago

Hi, is there any way I can prevent sbserver from updating the local threat list database? I am trying to use the Safe Browsing Update API to label a dataset of web browsing history. The dataset is fixed (there will be no new entries), so I don't need an updated threat list. Thanks.

dsnet commented 6 years ago

If you already have a local database that is decently up-to-date, I suppose you can set the srvaddr flag to some bogus value such that an update never happens.

colonelxc commented 6 years ago

There are a couple of difficulties in trying to use the Safe Browsing API this way (not specific to this client).

Safe Browsing keeps the DB as small as possible while providing maximum coverage, which means that sites that are no longer serving phishing/malware content are removed from the lists. So scanning a dataset that is a month old against an up-to-date threat list might not flag the malware pages, even if Safe Browsing was aware of them (and blocking them) at the time they were active. This is presumably part of why you want to freeze the database.

The other part is that the database only contains hash prefixes, which can (and do) have benign collisions. Clients request the full hashes for their prefix match and then determine client-side whether it was a true full match. So, without contacting the server, there is no guarantee that any prefix match actually corresponds to an entry in the DB. The (now updated) server would respond with an empty list of full hashes to any request for a prefix that had since been removed from the database.
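To make the prefix-versus-full-hash point concrete, here is a minimal Go sketch that uses only the standard library. It is not this client's actual code: `fullHash`, `prefixMatch`, and `confirmed` are hypothetical helpers, the URL expressions are made up, canonicalization is skipped, and the 4-byte prefix length is assumed for illustration. The collision is fabricated by copying the prefix bytes, simply to show that a local prefix hit alone cannot tell a listed URL from a benign one.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// prefixLen is the assumed length in bytes of the prefixes stored locally.
const prefixLen = 4

// fullHash returns the SHA-256 digest of an already-canonicalized URL expression.
func fullHash(expr string) []byte {
	sum := sha256.Sum256([]byte(expr))
	return sum[:]
}

// prefixMatch reports whether the prefix of hash is present in the local prefix set.
func prefixMatch(localPrefixes map[string]bool, hash []byte) bool {
	return localPrefixes[string(hash[:prefixLen])]
}

// confirmed reports whether hash equals one of the full hashes returned by the server.
func confirmed(serverFullHashes [][]byte, hash []byte) bool {
	for _, h := range serverFullHashes {
		if bytes.Equal(h, hash) {
			return true
		}
	}
	return false
}

func main() {
	// A listed entry: only its prefix lives in the local database.
	listed := fullHash("evil.example/")
	local := map[string]bool{string(listed[:prefixLen]): true}

	// Fabricate a different hash that shares the same prefix (a benign collision).
	collider := fullHash("benign.example/")
	copy(collider[:prefixLen], listed[:prefixLen])

	// Full hashes the server would return for that prefix while the entry is listed.
	server := [][]byte{listed}

	fmt.Println("listed:   prefix hit =", prefixMatch(local, listed), "confirmed =", confirmed(server, listed))
	fmt.Println("collider: prefix hit =", prefixMatch(local, collider), "confirmed =", confirmed(server, collider))

	// Once the entry is removed server-side, the reply is empty and nothing is confirmed.
	fmt.Println("after removal: confirmed =", confirmed(nil, listed))
}
```

Both hashes hit the local prefix set, but only the listed one survives the full-hash comparison, and with an empty server response even that one is no longer confirmed. This is why a frozen local database cannot give definitive answers on its own.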

pravee9 commented 6 years ago

How do I create the local threat list database? Please tell me, step by step, how to set up the database on a local computer.

davidTAU commented 6 years ago

I needed this for some research on browsing records, as @colonelxc observed. @pravee9, as far as I know that is done with -db db_filename.

I decided to use the current threat list, even though it might give results that are not consistent with the status of the URLs when they were actually visited in my data.

Thank you all for the responses.

pravee9 commented 6 years ago

I don't understand. I have an API key and I just want to download the database. How do I download the database, or set up the local database for the threat list?