bitcoin-data / mining-pools

Known Bitcoin mining pool coinbase tags and coinbase output addresses. Generated files: https://github.com/bitcoin-data/mining-pools/tree/generated
MIT License

Sync with mempool space data #69

Closed: 0xB10C closed this 1 year ago

0xB10C commented 1 year ago

As mentioned in https://github.com/mempool/mining-pools/issues/25, it makes sense to avoid duplicating the effort of maintaining separate pools.json files. This PR syncs https://github.com/mempool/mining-pools/blob/master/pools.json with the entity files here.

First, I've sorted and cleaned up the links and addresses in the current entities. The script for this is in the commit message.
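
For readers without the commit message at hand, a cleanup of this kind could look roughly like the sketch below. This is not the script from the commit message. It assumes each entity file is a JSON object with "links" and "addresses" lists (the layout the conversion script in this PR writes), and the helper names `normalize_link`, `clean_entity` and `clean_entities_dir` are made up for illustration.

```python
import json
from pathlib import Path


def normalize_link(link):
    # Same normalization the conversion script in this PR applies:
    # drop trailing slashes and prefer https.
    return link.rstrip("/").replace("http://", "https://")


def clean_entity(entity):
    # Return a copy with deduplicated, sorted links and addresses.
    # All other keys (name, tags, ...) are passed through untouched.
    cleaned = dict(entity)
    cleaned["addresses"] = sorted(set(entity.get("addresses", [])))
    cleaned["links"] = sorted(
        {normalize_link(link) for link in entity.get("links", []) if link}
    )
    return cleaned


def clean_entities_dir(directory="entities"):
    # Hypothetical driver: rewrite every entity file in place.
    for path in sorted(Path(directory).glob("*.json")):
        entity = json.loads(path.read_text(encoding="utf-8"))
        text = json.dumps(clean_entity(entity), indent=2, ensure_ascii=False)
        path.write_text(text + "\n", encoding="utf-8")
```

Keeping `clean_entity` free of file access makes it easy to check on a single dict before running `clean_entities_dir()` over the whole directory.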

Then, with the script below, I converted the concatenation of our generated pools.json and the mempool-space pools.json into entity JSON files. I manually picked only the changes that I could verify. I left out addresses, e.g. some from AntPool, that never received a coinbase output, and generally cleaned up wherever I noticed something.

import json

# Maps pool name -> collected links, payout addresses, and coinbase tags.
names = dict()

with open("mempool-space.json", "r") as f:
    pools = json.load(f)

    # Collect coinbase tags (and links) per pool name.
    for tag in pools["coinbase_tags"]:
        e = pools["coinbase_tags"][tag]
        name = e["name"]
        link = e["link"]
        if name not in names:
            names[name] = { "links": set(), "addresses": set(), "tags": set() }
        if link != "":
            # Normalize links so that duplicates collapse in the set.
            link = link.rstrip("/").replace("http://", "https://")
            names[name]["links"].add(link)
        names[name]["tags"].add(tag)

    # Collect payout addresses (and links) per pool name.
    for addr in pools["payout_addresses"]:
        e = pools["payout_addresses"][addr]
        name = e["name"]
        link = e["link"]
        if name not in names:
            names[name] = { "links": set(), "addresses": set(), "tags": set() }
        if link != "":
            link = link.rstrip("/").replace("http://", "https://")
            names[name]["links"].add(link)
        names[name]["addresses"].add(addr)

# Write one entity file per pool name.
for name in names:
    # Derive the file name from the pool name: drop punctuation, hyphenate spaces and dots, lowercase.
    filename = name.rstrip('.').replace("'", "").replace(" ", "-").replace(".", "-").replace("/", "").replace("&", "").replace("(", "").replace(")", "").lower()
    pool = names[name]

    content = {
        "name": name,
        "addresses": sorted(pool["addresses"]),
        "tags": sorted(pool["tags"]),
        "links": sorted(pool["links"]),
    }

    # ensure_ascii=False can emit non-ASCII characters, so write UTF-8 explicitly.
    with open("entities/"+filename+".json", "w", encoding="utf-8") as out:
        json.dump(content, out, indent=2, ensure_ascii=False)
        out.write('\n')
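
The file name derivation in the script is the part most likely to surprise, since two different pool names can map to the same file and the later one would silently overwrite the earlier. Below is a small sketch that pulls the same replacement rules into a helper and reports such collisions. The names `entity_filename` and `find_collisions` are hypothetical and not part of this repository, and the pool names in the usage lines are only sample inputs.

```python
def entity_filename(name):
    # Same rules as the replace chain in the script: strip trailing dots,
    # drop ' / & ( ), turn spaces and dots into hyphens, lowercase.
    slug = name.rstrip(".")
    for ch in "'/&()":
        slug = slug.replace(ch, "")
    slug = slug.replace(" ", "-").replace(".", "-")
    return slug.lower() + ".json"


def find_collisions(names):
    # Return (first_name, later_name, filename) for every name whose
    # file name was already produced by an earlier name.
    seen = {}
    collisions = []
    for name in names:
        filename = entity_filename(name)
        if filename in seen:
            collisions.append((seen[filename], name, filename))
        else:
            seen[filename] = name
    return collisions


print(entity_filename("BTC.com"))      # btc-com.json
print(entity_filename("Foundry USA"))  # foundry-usa.json
print(find_collisions(["BTC.com", "BTC com"]))
```

Running `find_collisions(names)` before the write loop would flag any names that need to be merged or renamed by hand.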

0xB10C commented 1 year ago

Double-checked these. I think this is ready to go.