bitcoin-data / mining-pools

Known Bitcoin mining pool coinbase tags and coinbase output addresses. Generated files: https://github.com/bitcoin-data/mining-pools/tree/generated
MIT License

Split up pools.json file into multiple files #62

Closed 0xB10C closed 1 year ago

0xB10C commented 1 year ago

This splits the large pools.json file into a more maintainable set of entities, with one JSON file per pool. Backwards compatibility is provided by the added script contrib/create-old-pools-json.py.
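
To give a feel for what such a backwards-compatibility step involves, here is a minimal sketch that folds per-pool entities back into the old single-file layout. It is not the contents of contrib/create-old-pools-json.py: the function names `entities_to_old_format` and `load_entities`, the `entities` directory default, and the choice of keeping only the first link per pool are assumptions made for illustration. The entity fields (`name`, `addresses`, `tags`, `links`) and the old top-level keys (`coinbase_tags`, `payout_addresses`) follow the conversion script posted in this thread.

```python
import json
from pathlib import Path


def entities_to_old_format(entities):
    """Fold per-pool entities back into the single-file pools.json layout."""
    old = {"coinbase_tags": {}, "payout_addresses": {}}
    for entity in entities:
        # Assumption: the old format stores one link per entry, so keep the
        # first link of the entity, or an empty string if it has none.
        link = entity["links"][0] if entity["links"] else ""
        info = {"name": entity["name"], "link": link}
        for tag in entity["tags"]:
            old["coinbase_tags"][tag] = dict(info)
        for addr in entity["addresses"]:
            old["payout_addresses"][addr] = dict(info)
    # Sort the entries so that repeated runs produce identical output.
    return {section: dict(sorted(entries.items())) for section, entries in old.items()}


def load_entities(directory="entities"):
    """Read every *.json file in the (assumed) entities directory."""
    paths = sorted(Path(directory).glob("*.json"))
    return [json.loads(p.read_text(encoding="utf-8")) for p in paths]


# Possible usage (writes a pools.json next to the entities directory):
# with open("pools.json", "w", encoding="utf-8") as out:
#     json.dump(entities_to_old_format(load_entities()), out, indent=2, ensure_ascii=False)
```

Keeping the conversion in a pure function separate from the file handling makes it easy to check on a handful of hand-written entities before running it on the whole directory.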

Other formats, as discussed in e.g. https://github.com/0xB10C/known-mining-pools/issues/38#issuecomment-845819260, can be added later if needed.

The new format also allows for easier merging with other pools.json files from forks (btccom and blockchain-info).
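
As a rough illustration of why the per-pool format helps with merging, the sketch below unions the addresses, tags, and links of two entity lists. It assumes that pools are matched by exact `name`, which may not hold across forks that spell pool names differently; `merge_entities` is a hypothetical helper and not part of this PR.

```python
def merge_entities(ours, theirs):
    """Merge two lists of per-pool entities, keyed by exact pool name."""
    merged = {}
    for entity in list(ours) + list(theirs):
        slot = merged.setdefault(
            entity["name"], {"addresses": set(), "tags": set(), "links": set()}
        )
        for field in ("addresses", "tags", "links"):
            # Sets drop duplicates that appear in both inputs.
            slot[field].update(entity.get(field, []))
    # Emit sorted lists so the merged result is stable across runs.
    return [
        {"name": name, **{field: sorted(values) for field, values in slot.items()}}
        for name, slot in sorted(merged.items())
    ]
```

With one file per pool, a conflict between two sources stays confined to the single pool it concerns instead of touching one large shared file.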

Closes #38

0xB10C commented 1 year ago

Python scripts mentioned in the commit messages:

# to-entities.py
# converts pools.json to the file-per-pool structure

import json

names = dict()

with open("pools.json", "r") as f:
    pools = json.load(f)

    for tag in pools["coinbase_tags"]:
        e = pools["coinbase_tags"][tag]
        name = e["name"]
        link = e["link"]
        if name not in names:
            names[name] = { "links": set(), "addresses": set(), "tags": set() }
        if link != "":
            names[name]["links"].add(link)
        names[name]["tags"].add(tag)

    for addr in pools["payout_addresses"]:
        e = pools["payout_addresses"][addr]
        name = e["name"]
        link = e["link"]
        if name not in names:
            names[name] = { "links": set(), "addresses": set(), "tags": set() }
        if link != "":
            names[name]["links"].add(link)
        names[name]["addresses"].add(addr)

for name in names:
    filename = name.rstrip('.').replace("'", "").replace(" ", "-").replace(".", "-").replace("/", "").replace("&", "").replace("(", "").replace(")", "").lower()
    pool = names[name]

    content = {
        "name": name,
        # sort the sets so that repeated runs write identical files
        "addresses": sorted(pool["addresses"]),
        "tags": sorted(pool["tags"]),
        "links": sorted(pool["links"]),
    }

    with open("entities/"+filename+".json", "w") as out:
        json.dump(content, out, indent=2, ensure_ascii=False)

# sort-pools-json.py
# sorts the addresses and tags in the pools.json file to make the files comparable

import json

with open("pools.json", "r") as f:
    pools = json.load(f)

    tags = pools["coinbase_tags"]
    addresses = pools["payout_addresses"]

    tags = dict(sorted(tags.items()))
    addresses = dict(sorted(addresses.items()))

    with open("pools-sorted.json", "w") as out:
        content = {
            "payout_addresses": addresses,
            "coinbase_tags": tags
        }
        json.dump(content, out, indent=2, ensure_ascii=False)
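
Once two pools.json-style files are sorted the same way, they can be compared line by line. The following is a minimal sketch of such a comparison using only the standard library; `diff_pools` and the `ours`/`theirs` labels are hypothetical names, and the function takes already-loaded dicts rather than file paths.

```python
import difflib
import json


def diff_pools(ours, theirs):
    """Return unified-diff lines between two pools.json-style dicts."""

    def render(pools):
        # sort_keys=True orders every level of the structure, so the
        # rendered text does not depend on the original key order.
        return json.dumps(pools, indent=2, sort_keys=True, ensure_ascii=False).splitlines()

    return list(
        difflib.unified_diff(render(ours), render(theirs), "ours", "theirs", lineterm="")
    )
```

An empty result means the two inputs hold the same tags and addresses with the same names and links.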

Sjors commented 1 year ago

ForkMonitor was fetching this file… Switched to using the self-generated version in https://github.com/BitMEXResearch/forkmonitor/commit/bfb0faa2cdf152f0752995550d548a44630e0edc

0xB10C commented 1 year ago

Sorry, I wasn't aware. I guess the symlink to generated/pools.json doesn't work for fetching this file.

Sjors commented 1 year ago

It might work if I had fetched it locally, but I was fetching the file from GitHub.