MetaMask / eth-phishing-detect

Utility for detecting phishing domains targeting Web3 users

Hosts file #631

Closed fjcrabiffosse closed 6 years ago

fjcrabiffosse commented 6 years ago

The whitelist and blacklist you maintain are very useful, but they are kept in a format that makes them hard to reuse with other extensions. I think it would be very beneficial for the community if you could also provide the blacklist in a typical hosts file format (e.g. http://winhelp2002.mvps.org/hosts.txt). That way the list could be integrated into other extensions (e.g. uMatrix) or blocked at the OS level.

I could prepare a quick and dirty script to export your JSON file to that format, but the export is only really useful if it is actively maintained and updated with your new additions. Do you find this idea interesting?

Thanks!

409H commented 6 years ago

Hi @fjcrabiffosse - sure if you could write a script to convert our format into a typical hosts file, we can then put it in the CI to rebuild on every PR.

fjcrabiffosse commented 6 years ago

Hi @409H

The following Python script should do.

import json


def metamask2hosts(input_file_path, output_file_path):
    """
    Parses the MetaMask config.json and writes the blacklisted domains in hosts file format.
    :param input_file_path: A JSON file containing a 'blacklist' key with a list of blacklisted domains.
    :param output_file_path: Path of the hosts file to write.
    :return: None. The blacklisted domains are written to output_file_path.
    """
    with open(input_file_path) as json_file:
        domains = json.load(json_file)
    blacklist = domains['blacklist']
    # Keep localhost resolvable, then point every blacklisted domain at 0.0.0.0.
    hosts_list = ['127.0.0.1 localhost\n'] + ['0.0.0.0 ' + blacklisted + '\n' for blacklisted in blacklist]

    with open(output_file_path, 'w') as hosts_file:
        hosts_file.writelines(hosts_list)


if __name__ == '__main__':
    metamask2hosts('./config.json', './hosts.txt')

Let me know if you would like any changes.

409H commented 6 years ago

Nice! I'll see if I can get this out of the door by Monday (January 15) - I'll keep this issue open until it's done.

fjcrabiffosse commented 6 years ago

Thanks! You are doing an awesome job for the community, happy to help.

409H commented 6 years ago

@fjcrabiffosse Hi, sorry, I was busy all weekend. Will bump it up the priority list for this week, apologies.

fjcrabiffosse commented 6 years ago

Hi!

I understand this is low priority, so there's no rush. Did you encounter any problems adding the script to the CI pipeline? I think providing your blacklist in other formats is still valuable to the community, but if you don't want to integrate this export into your CI, maybe we can think of other ways of supporting it. Any ideas? Best regards!

409H commented 6 years ago

Hi @fjcrabiffosse - sorry for the delay, I'm bumping this up the priority list.

409H commented 6 years ago

@fjcrabiffosse I've committed an up-to-date version of the hosts file (https://github.com/MetaMask/eth-phishing-detect/blob/master/src/hosts.txt) as a .txt file so you can pull it down and parse it.

This will get the ball rolling and I've made a note to put it into the CI or work on some automation for this.

Sometimes I'm not around my dev environment so I can't run the python script manually, but will run it once a day until I can automate this.
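Until the export is automated, a manually regenerated hosts.txt can drift from config.json. As a rough sketch of how a CI step might detect that, the code below compares the two files' contents. The function names `blocked_domains` and `hosts_file_drift` are hypothetical and not part of the repository; the sketch assumes the hosts file uses the `0.0.0.0 domain` lines produced by the script above.

```python
import json


def blocked_domains(hosts_text):
    """Return the set of domains mapped to 0.0.0.0 in hosts-format text."""
    domains = set()
    for line in hosts_text.splitlines():
        # Strip any trailing '#' comment, then split on whitespace.
        fields = line.split('#', 1)[0].split()
        # Only sinkholed entries count; '127.0.0.1 localhost' is skipped.
        if len(fields) >= 2 and fields[0] == '0.0.0.0':
            domains.update(fields[1:])
    return domains


def hosts_file_drift(config_text, hosts_text):
    """Compare the JSON blacklist with a hosts file (hypothetical helper).

    Returns (missing, stale): blacklisted domains absent from the hosts file,
    and hosts-file domains that are no longer in the blacklist.
    """
    blacklist = set(json.loads(config_text)['blacklist'])
    blocked = blocked_domains(hosts_text)
    return sorted(blacklist - blocked), sorted(blocked - blacklist)
```

A CI job could read both files, call `hosts_file_drift`, and fail the build when either returned list is non-empty.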

fjcrabiffosse commented 6 years ago

Awesome!

Here is a snippet of uMatrix using your malware list.

409H commented 6 years ago

@fjcrabiffosse Awesome!

Steven Black (see this issue on a similar list I maintain - https://github.com/409H/EtherAddressLookup/issues/203) is consuming it and putting it into a hosts file on his repo - https://github.com/StevenBlack/hosts/blob/master/readme.md

Perhaps we could open a communication channel with both of you and discuss things further?

fjcrabiffosse commented 6 years ago

Sorry for the delay, I was a bit busy these past days. Steven Black's aggregated hosts files are incredibly useful, I use them myself. Adding the eth phishing list to them would be great, as it could help protect a much wider user base. I would be happy to collaborate in any way I can.

kumavis commented 6 years ago

@fjcrabiffosse glad you find the list useful, though i suggest that whatever tool is updating the list and setting the hosts file should do the conversion from json to hosts.txt format

it doesn't really make sense for us to store and maintain a list in 2 formats
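A minimal sketch of what that consumer-side conversion could look like, assuming the consuming tool downloads config.json itself and that the file still exposes a 'blacklist' key as in the script earlier in this thread. The names `blacklist_to_hosts`, `fetch_hosts`, `sink_ip` and `config_url` are hypothetical; the caller supplies the URL of the raw config.json, which is not given here.

```python
import json
import urllib.request


def blacklist_to_hosts(config_text, sink_ip='0.0.0.0'):
    """Convert the text of config.json into hosts-file text."""
    blacklist = json.loads(config_text)['blacklist']
    # Deduplicate while preserving the order of the original list.
    unique = list(dict.fromkeys(blacklist))
    return ''.join(f'{sink_ip} {domain}\n' for domain in unique)


def fetch_hosts(config_url, timeout=30):
    """Download config.json from config_url and return hosts-file text.

    config_url is supplied by the caller; no URL is assumed here.
    """
    with urllib.request.urlopen(config_url, timeout=timeout) as response:
        config_text = response.read().decode('utf-8')
    return blacklist_to_hosts(config_text)
```

Keeping the conversion a pure function of the JSON text means the consuming tool can run it on every update without a second copy of the list being stored in the repository.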

trn1ty commented 2 years ago

@fjcrabiffosse Is there a license on the Python code you shared?

fjcrabiffosse commented 2 years ago

> @fjcrabiffosse Is there a license on the Python code you shared?

BSD 3-clause license. Should give you plenty of freedom to do whatever you need to do.

trn1ty commented 2 years ago

Awesome, thanks very much! Was pessimistic about getting a response.