Hi @fjcrabiffosse - sure, if you could write a script to convert our format into a typical hosts file, we can then put it in the CI to rebuild on every PR.
Hi @409H
The following Python script should do it.
import json


def metamask2json(input_file_path, output_file_path):
    """
    Parses the metamask config.json and outputs the list of blacklisted domains in a hosts file format.
    :param input_file_path: A json file containing a 'blacklist' key with a list of blacklisted domains.
    :param output_file_path: Path of the hosts file to write.
    :return: Writes to the output_file_path the blacklisted domains using a hosts file format.
    """
    with open(input_file_path) as json_file:
        domains = json.load(json_file)
    blacklist = domains['blacklist']
    # Keep localhost resolvable, then route every blacklisted domain to 0.0.0.0.
    hosts_list = ['127.0.0.1 localhost\n'] + ['0.0.0.0 ' + blacklisted + '\n' for blacklisted in blacklist]
    with open(output_file_path, 'w') as hosts_file:
        hosts_file.writelines(hosts_list)


if __name__ == '__main__':
    metamask2json('./config.json', './hosts.txt')
Let me know if you would like any changes.
Nice! I'll see if I can get this out of the door by Monday (January 15) - I'll keep this issue open until it's done.
Thanks! You are doing an awesome job for the community, happy to help.
@fjcrabiffosse Hi, sorry, I was busy all weekend. Will bump it up the priority list for this week, apologies.
Hi!
I understand this is a low priority thing, so don't feel much rush for it. Did you encounter any problems adding the script to the CI pipeline? I think providing your blacklist in other formats is still valuable to the community but if you don't want to integrate this export in your CI maybe we can think of other ways of supporting it. Any ideas? Best regards!
Hi @fjcrabiffosse - sorry for the delay, I'm bumping this up the priority list.
@fjcrabiffosse I've committed an up-to-date version of the hosts file (https://github.com/MetaMask/eth-phishing-detect/blob/master/src/hosts.txt) as a .txt file so you can pull it down and parse it.
This will get the ball rolling and I've made a note to put it into the CI or work on some automation for this.
Sometimes I'm not around my dev environment so I can't run the python script manually, but will run it once a day until I can automate this.
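For anyone pulling the file down, a minimal sketch of parsing it might look like the following. It assumes the layout the script above produces (a `127.0.0.1 localhost` line followed by `0.0.0.0 <domain>` entries); `parse_hosts` is a hypothetical helper name, not something that exists in the repo.

```python
def parse_hosts(text):
    """Return the blocked domains found in hosts-file text (hypothetical helper)."""
    domains = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split('#', 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Assumption: blocked entries use the 0.0.0.0 sink address, so the
        # 127.0.0.1 localhost line and any malformed lines are skipped.
        if len(fields) < 2 or fields[0] != '0.0.0.0':
            continue
        domains.extend(fields[1:])
    return domains


if __name__ == '__main__':
    sample = (
        "127.0.0.1 localhost\n"
        "0.0.0.0 phishing.example\n"
        "# a comment line\n"
        "0.0.0.0 scam.example # trailing note\n"
    )
    print(parse_hosts(sample))
```

Reading the downloaded hosts.txt into a string and passing it to the function should be enough to feed the domains into another tool.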
Awesome!
Here is a snippet of uMatrix using your malware list.
@fjcrabiffosse Awesome!
Steven Black (in this issue for a similar list I maintain - https://github.com/409H/EtherAddressLookup/issues/203) is consuming it and putting it into a hosts file in his repo - https://github.com/StevenBlack/hosts/blob/master/readme.md
Perhaps we could open a communication channel with both of you and discuss things further?
Sorry for the delay, was a bit busy these past days. Steven Black's aggregated hosts files are incredibly useful, I use them myself. Adding the eth phishing list to them would be great as it could help protect a much wider user base. I would be happy to collaborate in any way I can.
@fjcrabiffosse glad you find the list useful, tho I suggest that whatever tool is updating the list and setting the hosts file should do the conversion from json to hosts.txt format
it doesn't really make sense for us to store and maintain a list in 2 formats
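A consumer-side conversion along those lines could be sketched as follows, assuming the config keeps its domains under a `blacklist` key as in the script earlier in this thread. `blacklist_to_hosts` and its `sink_ip` parameter are hypothetical names, and fetching the JSON is left to the consuming tool.

```python
import json


def blacklist_to_hosts(config_text, sink_ip='0.0.0.0'):
    """Convert config JSON text with a 'blacklist' key into hosts-file text."""
    config = json.loads(config_text)
    # Deduplicate and sort so repeated runs produce a stable file.
    domains = sorted(set(config.get('blacklist', [])))
    lines = ['127.0.0.1 localhost'] + [f'{sink_ip} {domain}' for domain in domains]
    return '\n'.join(lines) + '\n'


if __name__ == '__main__':
    sample = '{"blacklist": ["scam.example", "phishing.example", "scam.example"]}'
    print(blacklist_to_hosts(sample), end='')
```

Doing the conversion in memory like this means only the JSON list has to be maintained upstream, and each tool can regenerate its own hosts file whenever it updates.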
@fjcrabiffosse Is there a license on the Python code you shared?
BSD 3-clause license. Should give you plenty of freedom to do whatever you need to do.
Awesome, thanks very much! Was pessimistic about getting a response.
The whitelist and blacklist you maintain are very useful, but they are kept in a format that doesn't allow them to be reused by other extensions. I think it would be very beneficial for the community if you could also provide the blacklist in a typical hosts file format (e.g. http://winhelp2002.mvps.org/hosts.txt). That way the list could be integrated into other extensions (e.g. uMatrix) or blocked at the OS level.
I could prepare a quick and dirty script to export your JSON file to said format but the interesting part would be if the list is actively maintained and updated with your new additions. Do you find this idea interesting?
Thanks!