burib / aws-region-table-parser

Converts the AWS Region Table HTML page into a JSON object (runs every day at 12:00 UTC)
MIT License

Region table structure has changed; parsing is broken. #20

Closed burib closed 3 years ago

burib commented 3 years ago

Looks like the new URL to parse is https://api.regional-table.region-services.aws.a2z.com/index.json. That will be much easier, since it's plain JSON and there's no need to mess around with HTML parsing. It should work fine... until they change the page again. :)

burib commented 3 years ago

To get all services and their regions, run:

curl -s https://api.regional-table.region-services.aws.a2z.com/index.json | jq '[.prices[] | { region: .attributes["aws:region"], serviceName: .attributes["aws:serviceName"] }]'
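
For use outside the shell, here is a rough Python sketch of the same extraction. It assumes the payload shape implied by the jq filter above (a top-level `prices` array whose items carry an `attributes` object). The function names `fetch_index` and `extract_pairs` are made up for illustration, and the sample data is hand-written, not real API output.

```python
import json
from urllib.request import urlopen

INDEX_URL = "https://api.regional-table.region-services.aws.a2z.com/index.json"


def fetch_index(url=INDEX_URL):
    """Download and decode the index JSON (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def extract_pairs(payload):
    """Flatten payload["prices"] into {region, serviceName} dicts."""
    return [
        {
            "region": item["attributes"]["aws:region"],
            "serviceName": item["attributes"]["aws:serviceName"],
        }
        for item in payload["prices"]
    ]


# Hand-written sample in the assumed shape, not real API output.
sample = {
    "prices": [
        {"attributes": {"aws:region": "us-east-1", "aws:serviceName": "Amazon EC2"}},
        {"attributes": {"aws:region": "eu-west-1", "aws:serviceName": "AWS Lambda"}},
    ]
}
pairs = extract_pairs(sample)
print(pairs)
```

Against the live endpoint this would be `extract_pairs(fetch_index())`.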

Still need to group and index the results by serviceName and by region, and count the elements in each group.
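
A minimal Python sketch of that grouping, assuming the input is the flat list of `{region, serviceName}` pairs produced by the command above. `index_pairs` is a hypothetical helper name and the sample pairs are invented.

```python
from collections import defaultdict


def index_pairs(pairs):
    """Build two indexes: service -> regions and region -> services."""
    by_service = defaultdict(set)
    by_region = defaultdict(set)
    for pair in pairs:
        by_service[pair["serviceName"]].add(pair["region"])
        by_region[pair["region"]].add(pair["serviceName"])
    # Convert the sets to sorted lists so the result is JSON-serializable.
    return (
        {name: sorted(regions) for name, regions in by_service.items()},
        {region: sorted(names) for region, names in by_region.items()},
    )


# Invented sample pairs, not real API output.
pairs = [
    {"region": "us-east-1", "serviceName": "Amazon EC2"},
    {"region": "eu-west-1", "serviceName": "Amazon EC2"},
    {"region": "us-east-1", "serviceName": "AWS Lambda"},
]
by_service, by_region = index_pairs(pairs)
# Number of regions per service.
counts = {name: len(regions) for name, regions in by_service.items()}
print(by_service, by_region, counts)
```

Using sets while collecting means a duplicated (service, region) pair is only counted once.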

burib commented 3 years ago

Grouping by serviceName can be done like this:

curl -s https://api.regional-table.region-services.aws.a2z.com/index.json | jq '[.prices[] | { region: .attributes["aws:region"], serviceName: .attributes["aws:serviceName"] }] | group_by(.serviceName)[] | {(.[0].serviceName): [.[] | .region]}'

burib commented 3 years ago

Sort services by number of supported regions, and index them by a service code derived from serviceUrl:

curl -s https://api.regional-table.region-services.aws.a2z.com/index.json | jq '
  # Flatten each entry; derive a key from serviceUrl by stripping URL prefixes, turning "/" into "-", and dropping the last character.
  [.prices[] | {
    region: .attributes["aws:region"],
    serviceName: .attributes["aws:serviceName"],
    serviceUrl: (.attributes["aws:serviceUrl"] | sub("https://"; ""; "g") | sub("aws.amazon.com/"; ""; "g") | sub("www."; ""; "g") | sub(".aws"; ""; "g") | sub("/"; "-"; "g") | .[0:-1])
  }]
  # Group by service, keyed by the derived key.
  | [group_by(.serviceName)[] | {
      (.[0].serviceUrl): {name: .[0].serviceName, regions: map(.region), count: length}
    }]
  # Sort ascending by region count, then wrap with a total service count.
  | sort_by(.[. | keys | .[0]].count)
  | {services: [.], count: length}
'
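
For comparison, a rough Python sketch of the same pipeline. The names `service_key` and `summarize` are hypothetical and the sample URLs are invented. It is not a drop-in equivalent of the jq version: it uses literal string replacement where jq's `sub` treats the patterns as regular expressions, it returns the service list flat rather than wrapped in an extra array, and services with equal counts may come out in a different order.

```python
def service_key(url):
    """Derive a short key from a service URL, e.g. ".../ec2/" -> "ec2"."""
    key = url
    for fragment in ("https://", "aws.amazon.com/", "www.", ".aws"):
        key = key.replace(fragment, "")  # literal match, not a regex
    # Turn path separators into dashes, then drop the last character.
    return key.replace("/", "-")[:-1]


def summarize(entries):
    """Group entries by serviceName and sort services by region count."""
    grouped = {}
    for entry in entries:
        service = grouped.setdefault(
            entry["serviceName"],
            {"key": service_key(entry["serviceUrl"]), "regions": []},
        )
        service["regions"].append(entry["region"])
    services = [
        {s["key"]: {"name": name, "regions": s["regions"], "count": len(s["regions"])}}
        for name, s in grouped.items()
    ]
    # Each item is a single-key dict; sort ascending by its region count.
    services.sort(key=lambda item: next(iter(item.values()))["count"])
    return {"services": services, "count": len(services)}


# Invented sample entries, not real API output.
entries = [
    {"region": "us-east-1", "serviceName": "Amazon EC2", "serviceUrl": "https://aws.amazon.com/ec2/"},
    {"region": "eu-west-1", "serviceName": "Amazon EC2", "serviceUrl": "https://aws.amazon.com/ec2/"},
    {"region": "us-east-1", "serviceName": "AWS Lambda", "serviceUrl": "https://aws.amazon.com/lambda/"},
]
result = summarize(entries)
print(result)
```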