dewi-alliance / dewi-etl

Publicly available Helium blockchain data - https://etl.dewi.org
MIT License
9 stars 1 forks source link

Fetch validator geo & host data using reverse IP lookups #7

Open jamiew opened 3 years ago

jamiew commented 3 years ago

Explorer API is using KeyCDN to fetch geo location, ISP and other information for each validator

This appears to be publicly accessible and would useful to have in DeWi ETL as well:

ip="34.92.38.196"
curl -s "https://tools.keycdn.com/geo.json?host=$ip" -H "User-Agent: keycdn-tools:https://tools.keycdn.com"

=>

{
  "status": "success",
  "description": "Data successfully received.",
  "data": {
    "geo": {
      "host": "34.92.38.196",
      "ip": "34.92.38.196",
      "rdns": "196.38.92.34.bc.googleusercontent.com",
      "asn": 15169,
      "isp": "GOOGLE",
      "country_name": "Hong Kong",
      "country_code": "HK",
      "region_name": "Central and Western District",
      "region_code": "HCW",
      "city": "Central",
      "postal_code": null,
      "continent_name": "Asia",
      "continent_code": "AS",
      "latitude": 22.2908,
      "longitude": 114.1501,
      "metro_code": null,
      "timezone": "Asia/Hong_Kong",
      "datetime": "2021-07-21 21:00:39"
    }
  }
}

We should investigate if a commercial license is strictly required, or if this is OK for the ~2000 queries or so we would need for initial import

dansku commented 3 years ago

This was implemented in the go-shenanigans https://github.com/dewi-alliance/go-etl-shenanigans/blob/main/scheduler/validator_isp.go

jamiew commented 3 years ago

Amazing, love it

Table name is validator_isp

Here's the create statement I dug up from postgres... I think we will need geo_data to be a jsonb

CREATE TABLE public.validator_isp (
    address text NOT NULL,
    isp text,
    geo_data text
);

ALTER TABLE ONLY public.validator_isp
    ADD CONSTRAINT validator_isp_pkey PRIMARY KEY (address);
jamiew commented 3 years ago

Would it be annoying to parse the JSON and include each column individually? It's possible to join on jsonb fields with raw SQL but not via Metabase's nice GUI tools, so I'd likely end up creating a view that does the extraction if we don't do it here


{
  "host": "100.121.169.153",
  "ip": "100.121.169.153",
  "rdns": "100.121.169.153",
  "asn": 0,
  "isp": "",
  "country_name": "",
  "country_code": "",
  "region_name": "",
  "region_code": "",
  "city": "",
  "postal_code": "",
  "continent_name": "",
  "continent_code": "",
  "latitude": 0,
  "longitude": 0,
  "metro_code": 0,
  "timezone": "",
  "datetime": ""
}```
dansku commented 3 years ago

Added all the fields to its own columns.

dansku commented 2 years ago

@jamiew this is also done afaik