seansaito / Roster

A Tor project that awards points and badges to relay operators based on the performance of their families

New Badges: "org_rarity/guard_relays" and "org_rarity/exit_relays" #13

Closed virgil closed 8 years ago

virgil commented 8 years ago

This ticket fully supersedes the AS-rarity proposal. We can do much better than this using the data from:

We can resolve each AS to an organization_id. We want to have a badge for organization rarity.

In short, take each AS, get its org_id, and then make a histogram of the org_ids. Reward operators with uncommon org_ids.

Here's the data that's accurate as of January 2016. For now don't worry about this data being updated in the field.
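The histogram step could be sketched roughly as follows. This is only an illustration: the sample mapping, the private-use AS numbers, and the `org_histogram` helper are hypothetical, and the `(org_id, org_name)` tuple layout is assumed from the dictionary entry quoted later in this thread.

```python
from collections import Counter

# Hypothetical sample mapping: AS number -> (org_id, org_name).
# The AS numbers are from the private-use range, not real relay data.
AS_TO_ORG = {
    "64512": ("ORG-A", "Example Org A"),
    "64513": ("ORG-A", "Example Org A"),
    "64514": ("ORG-B", "Example Org B"),
}

def org_histogram(relay_asns, as_to_org):
    """Count relays per org_id. ASes absent from the mapping are skipped here."""
    return Counter(as_to_org[asn][0] for asn in relay_asns if asn in as_to_org)

# One AS number per relay, so an AS can appear more than once.
hist = org_histogram(["64512", "64513", "64514", "64512"], AS_TO_ORG)
```

Here two different ASes resolve to the same org_id, so their relays are counted together.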

virgil commented 8 years ago

There are cleverer diversity measures between sets of ASes, using things like customer-cone ratios, http://as-rank.caida.org/?mode0=as-intro#customer-cone , but this simple organization diversity will be sufficient for now. If we finish early we can add a better diversity measure.

seansaito commented 8 years ago

By uncommon, say < 10 relays that have the particular org_id?

seansaito commented 8 years ago

More ASes share a common org_name than a common org_id, so I shall stick with org_ids.

seansaito commented 8 years ago

Moreover, there is a small set of AS numbers that are not included in the above JSON:

202018 200651 394362 133165 197216 198414 202018 196682 200651 197226 133165 197019 202109 393406 201229 200130 198310 200130 201229 197988 197922 202109 197999 133165 200185 133165 200130 133165 393406 201229 200130 393406 202018 133165 393406 200651 200130 196689 201133 198385 202109 197328 196750 394362 202109 133165 200130 393406 197540 197540 133165 200130 198031 200130 198599

virgil commented 8 years ago

I mean divide the org_ids into quintiles and give badges for those running relays in the top (least common) 4 quintiles.

So every relay except those existing within the 20% most popular org_ids will be getting badges.
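One possible reading of the quintile rule, as a rough sketch: rank the org_ids by relay count and exclude the most popular fifth of them. The `badge_org_ids` helper, the tie-breaking by org_id, and the rounding-down of the cutoff are assumptions made for illustration, not something settled in this ticket.

```python
from collections import Counter

def badge_org_ids(hist):
    """Return the org_ids outside the most popular 20% (ranked by relay count)."""
    # Most common first; ties are broken by org_id so the result is deterministic.
    ranked = sorted(hist, key=lambda org: (-hist[org], org))
    # Size of the most popular quintile, rounded down (an assumption).
    cutoff = len(ranked) // 5
    return set(ranked[cutoff:])

# Hypothetical histogram: org0 has 10 relays, org1 has 9, ..., org9 has 1.
hist = Counter({f"org{i}": 10 - i for i in range(10)})
badged = badge_org_ids(hist)
```

With ten org_ids, the two most popular ones are excluded and the other eight earn the badge. Note that with fewer than five org_ids the rounded-down cutoff is zero, so every org_id would qualify.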

-V

On Wed, 20 Jan 2016 at 00:44 Sean Saito notifications@github.com wrote:

By uncommon, say < 10 relays that are within the AS?


virgil commented 8 years ago

Treat these ASs as separate org_ids.

So for example, the dictionary entry for AS 202018 would be:

"202018" : ("AS202018", "Unknown Org/AS202018")

Then just treat everything else as normal. Likely these missing ASs will do very well by our diversity measure.
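A minimal sketch of that fallback, assuming the mapping is a dictionary keyed by AS number strings (the `resolve_org` name is hypothetical):

```python
def resolve_org(asn, as_to_org):
    """Return (org_id, org_name) for an AS, synthesising an entry if it is missing."""
    return as_to_org.get(asn, (f"AS{asn}", f"Unknown Org/AS{asn}"))

# An AS that is absent from the mapping becomes its own single-AS organization.
missing = resolve_org("202018", {})
```

Because each missing AS gets a unique org_id, its relays are never pooled with those of any other AS in the histogram.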

-V

On Wed, 20 Jan 2016 at 01:51 Sean Saito notifications@github.com wrote:

More over, there is a small set of AS numbers that are not included in the above json:

202018 200651 394362 133165 197216 198414 202018 196682 200651 197226 133165 197019 202109 393406 201229 200130 198310 200130 201229 197988 197922 202109 197999 133165 200185 133165 200130 133165 393406 201229 200130 393406 202018 133165 393406 200651 200130 196689 201133 198385 202109 197328 196750 394362 202109 133165 200130 393406 197540 197540 133165 200130 198031 200130 198599
