theCrag / website

theCrag.com: Add your voice and help guide the development of the world's largest collaborative rock climbing & bouldering platform
https://www.thecrag.com/

List of route icons at each grade has gone screwy, related to stars changes #2466

Open scd opened 7 years ago

scd commented 7 years ago

Feedback from a user:

I've noticed strange behavior with how routes are listed as icons, as evidenced by what is seen at the NSW level of the index. Many of the icons seem out of place:

"Montys slot" grade 6, 1 ascent (looks worthless really) "Lips" grade 11, 2 ascents (looks worthless really) "Cornerstone Rib Original" grade 8, 2 ascents (far inferior to the normal cornerstone rib) "Bubblegum beach", grade 14, 1 ascent (looks poor) "West wall", grade 12, 156 ascents (banned, 2 ascents in 2 years)

I had thought that icon routes were calculated based on popularity and number of ascents, but the algorithm seems to be weighting towards routes with a 100% rating from a single well-rated ascent.

This may be as you expect, in which case sorry to bother you, just thought it might be worth a look.

scd commented 7 years ago

https://www.thecrag.com/climbing/australia/new-south-wales-and-act

My guess is that our recent change to how routes are starred, with no minimum number of ascents, has affected the scoring.

DaneEvans commented 7 years ago

I believe it's the expected behavior, but even just a minimum ascent threshold (5? 10?) would help.

That list also has some interesting capitalisation. Sometimes a climb displays correctly; other times it is shown in lowercase, despite the climb title itself being capitalised correctly. I haven't managed to narrow it down further... it may be related to whether you are at an area/crag level.

Lowercase: https://www.thecrag.com/climbing/australia/pierces-creek-bouldering/area/11833099
Correct: https://www.thecrag.com/climbing/australia/pierces-creek-bouldering/area/825934752

brendanheywood commented 7 years ago

So some thoughts here: do we a) make additional tweaks to the recently changed star rating system, or b) just make changes to the icons system so that it takes volume into account, or both?

@scd is the stars article actually accurate for the current new stars algorithm?

https://www.thecrag.com/article/Stars

Every time I touch this I have to re-learn what we are currently actually doing :)

The recent discussion here makes me think that we should actually make a small tweak to the stars, so that volume is a weak signal, but also make volume a stronger signal in the iconicness rating:

https://www.thecrag.com/discussion/1026314604/clean-cut-walls--can-i-donate-towards-getting-other-routes-bolted-like-the-easy-stuff--clean-cuts

If we only touch the iconicness algorithm then I think a simple downgrade of the stars based on volume would suffice, e.g. if there have been < 10 ticks then remove 2 stars, if < 100 then remove 1 star. But I'd actually make this a smooth downgrade curve, something like downgrade = max(2 - log10(ticks), 0), i.e. once you are over 100 ticks you are working purely on aggregate stars.
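To make that concrete, here is a minimal sketch of the smooth downgrade curve described above. It is illustrative only; the 0-3 star scale and the function name are assumptions, not the site's actual code.

```python
import math

def star_downgrade(ticks: int) -> float:
    """Smooth volume-based downgrade of a route's star rating (0-3 scale assumed).

    Roughly: < 10 ticks loses up to 2 stars, 10-100 ticks loses up to 1 star,
    and >= 100 ticks is not downgraded at all, so aggregate stars stand on
    their own once there is enough volume.
    """
    if ticks < 1:
        return 2.0  # no logged ascents: apply the maximum penalty
    return max(2.0 - math.log10(ticks), 0.0)

# Examples:
# star_downgrade(5)   -> ~1.3 (a 3-star route with 5 ticks displays closer to 2 stars)
# star_downgrade(50)  -> ~0.3
# star_downgrade(200) -> 0.0  (pure aggregate stars)
```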

@scd what do we actually call 'iconicness' under the hood? Can we search on this in the facets? When I see the list of iconic climbs at a node, I kinda want to see what the next X most iconic ones are...

brendanheywood commented 7 years ago

This seems to have resolved itself?

brendanheywood commented 7 years ago

There is still the outstanding capitalisation part of this issue:

[screenshot showing the capitalisation issue]

I've seen this in a bunch of places and still don't understand how this happens?

brendanheywood commented 7 years ago

I don't think this is resolved; Gara is still a complete mess and producing just plain weird results. It ignores multiple 3-star classics that have lots of ascents and instead shows problems that have never been repeated.

[screenshot of the current icons at Gara]

See also this forum thread where I list what I think Gara's icons for bouldering would be, as a comparison point:

https://www.thecrag.com/discussion/1244829924/upper-gara-gorge--grading-around-gara

rouletout commented 3 years ago

The Gara example seems OK; Hope has 37 ascents, which is why it is there. Closing this until further evidence of a problem is presented.

brendanheywood commented 3 years ago

The Gara example is still a perfect example of how wrong the current algorithm is. The problem is not Hope; it is everything else in the list, and everything else which is not in the list and should be. Below are the icons right now in a gorge which has 640 routes and boulder problems, and 1788 ascents. Despite all that data it comes up with totally unexpected icons, and only 4 icons when it could cover a much larger subset of the entire grade spectrum. If you asked any local what their gut feel would be for iconic routes, none of these would come close except for Hope.

https://www.thecrag.com/en/climbing/australia/gara-gorge/upper-gara-gorge

[screenshot of the current icons at Upper Gara Gorge]

Of those 4 route icons, 2 of them (Elver Escape and Confined) have never been repeated, and Sloper traverse has only had 3 ascents. They might be decent enough routes, but how is it possible that a route which has never been repeated could be considered an icon over other routes which have had dozens of consistent ascents, for decades, at a high rating?

I also don't think it's useful to have icons be a mix of routes and problems; there should be one or two lists, for the most dominant gear types in a given area. If I asked around here, I'd probably come up with a list like this for routes and problems:

16 Hope
17 Layabout
18 Anticipation
19 Heavy Metal
20 Heavy metal direct start
21 Faith
22 Savage Amusement

V0 Prayer Note
V1 Unknown flake
V2 Quickdraw traverse
V3 Pseudoephedrine
V4 Mozzie rock
V5 Evans stone
V6 Vulcanology

There will of course be personal preferences but everyone would agree that the ones in this list above are at least in the top 3 at each grade. The original list bears no resemblance to this at all and is essentially useless for someone visiting and wanting a quick taste of the best routes.

And this isn't a problem isolated to Gara; Ebor is the same: 3 of its 4 icons have had only 1 or 2 ascents, when there are trade-route classics with dozens of ascents which should be on the list instead:

https://www.thecrag.com/en/climbing/australia/ebor-gorge

[screenshot of the Ebor Gorge icons]

Every local crag I looked at is the same; the icons are just pretty useless. At super popular crags like Frog Buttress things still seem off, but less so, so I suspect the algorithm is just not performing well at small and medium scale, and it's only coming good at very large crags where good routes (even if not the actual most iconic) float to the top regardless of how you rank them.

At the moment the algorithm is a bit of a black box and clearly suboptimal. If this was built from scratch, the way I'd approach it is:

1) Look at the distribution of gear types and figure out if 1 or 2 lists makes sense. Probably no more than 2 lists per crag. If it is bouldering and climbing, then combine sport + trad into a second list.
2) Look at the distribution of ascents and route grades for each list and figure out the spread which covers the most, e.g. a crag like Coolum will be from grades 25-32 while Frog would be say 16-25. You probably don't want more than say a dozen routes in any given icon list. If there are 2 lists you could tighten them both up a little, or make the more dominant gear style list longer.
3) Have a clear metric which can be used to rank routes, and be able to sort by this in the facets. You should be able to ask 'what are the top 10 most iconic routes at grade V2-4 at crag X'.
4) Each icon is the top route at that grade in the list. Link to the facet from the icons list so you can drill down into the top N most iconic routes at a given grade (see the sketch after this list).
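As a rough illustration of steps 3 and 4, here is a hedged sketch of ranking routes by an iconicness metric and keeping the top N per grade. The record shape, function name, and scores are hypothetical, not theCrag's schema or data.

```python
from collections import defaultdict

# Hypothetical route records: (name, grade, iconicness score). The scores are
# made up purely to illustrate the ranking, not real site data.
routes = [
    ("Hope", "16", 3100),
    ("Layabout", "17", 2400),
    ("Some other route", "17", 900),
    ("Anticipation", "18", 2200),
]

def icons_by_grade(route_list, top_n=1):
    """Group routes by grade, rank by iconicness, and keep the top N per grade."""
    by_grade = defaultdict(list)
    for name, grade, score in route_list:
        by_grade[grade].append((score, name))
    return {
        grade: [name for _, name in sorted(entries, reverse=True)[:top_n]]
        for grade, entries in by_grade.items()
    }

# top_n=1 gives one icon per grade for the icons list; a larger top_n answers
# "what are the top 10 most iconic routes at grade X at crag Y" via the facets.
print(icons_by_grade(routes))  # {'16': ['Hope'], '17': ['Layabout'], '18': ['Anticipation']}
```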

killakalle commented 3 years ago

Ranking items by star ratings is also discussed in various places on the web, e.g.

https://www.evanmiller.org/ranking-items-with-star-ratings.html

I've looked into this briefly during a small project of mine. Overall I think this is not a trivial topic to get right.
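For reference, one simple approach in that family is a Bayesian (damped) average, which pulls routes with few ratings towards a prior instead of letting a single 100% rating dominate. This is only a sketch of the general idea, not necessarily the exact method the article above recommends; the 0-3 star scale and the prior values are assumptions.

```python
def bayesian_average(ratings, prior_mean=2.0, prior_weight=10):
    """Damped mean of star ratings (0-3 scale assumed).

    With few ratings the score stays near prior_mean; with many ratings it
    converges to the plain average, so one enthusiastic FA vote can't push a
    route to the top on its own.
    """
    if not ratings:
        return prior_mean
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + len(ratings))

# A single 3-star vote barely moves the score...
print(bayesian_average([3]))        # ~2.09
# ...but thirty 3-star votes pull it well above the prior.
print(bayesian_average([3] * 30))   # 2.75
```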

brendanheywood commented 3 years ago

The main problem is the FA 'froth' factor: everyone thinks their own new problem is the best and they give it lots of stars and it only has 1 ascent. Over time with more ascents the quality rating settles down to a more realistic level.

The main issue is that volume of ascents is seemingly not a factor in the algorithm. I think what happened is that in the past volume was a factor in the quality rating, but then it was simplified to '1 climber 1 vote' (https://github.com/theCrag/website/issues/1511 and https://github.com/theCrag/website/issues/560). I've manually done a couple of spreadsheets for selected grades using a super simple formula of quality × ascents. To make sure I didn't skew the results doing this by hand (I only did a tiny subset of routes), I made sure to include the most popular but lowest-rated problems in each batch.
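A minimal sketch of that naive formula, using the first two rows from the V3 table below plus one hypothetical single-ascent FA for contrast (the function name and the "New FA" entry are illustrative assumptions):

```python
def iconicness(quality: float, ascents: int) -> float:
    """Naive icon score: per-route quality (0-100) weighted by ascent volume."""
    return quality * ascents

# e.g. an 83%-quality problem with 17 ascents scores 83 * 17 = 1411, while a
# 100%-quality problem with a single FA ascent scores only 100.
problems = [("The crack", 83, 17), ("Pseudoephedrine", 67, 18), ("New FA", 100, 1)]
ranked = sorted(problems, key=lambda p: iconicness(p[1], p[2]), reverse=True)
for name, quality, ascents in ranked:
    print(name, iconicness(quality, ascents))
```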

This very naive approach gives what I consider to be near perfect results. The only difference from the gut-feel list I made off the top of my head above is that 'The crack' now edges out 'Pseudoephedrine', and I agree with this in hindsight; the fact that Pseudoephedrine was such a close second only further shows this metric is pretty close.

V3

| Quality | Name | Ascents | Iconicness |
| --- | --- | --- | --- |
| 83 | The crack | 17 | 1411 |
| 67 | Pseudoephedrine | 18 | 1206 |
| 64 | Teddy bears picnic | 15 | 960 |
| 78 | Dynosaur | 12 | 936 |

V4

| Quality | Name | Ascents | Iconicness |
| --- | --- | --- | --- |
| 71 | Mozzie rock | 25 | 1775 |
| 75 | Catacomb roof | 11 | 825 |
| 89 | Quickdraw roof | 7 | 623 |
| 83 | 3 star Dyno | 7 | 581 |
| 83 | Shwarma | 6 | 498 |

V5

| Quality | Name | Ascents | Iconicness |
| --- | --- | --- | --- |
| 92 | Evans stone | 12 | 1104 |
| 94 | Swinger club | 7 | 658 |
| 92 | Alopcecia | 6 | 552 |
| 83 | 50 cals | 6 | 498 |
| 83 | Self service | 6 | 498 |

V6

| Quality | Name | Ascents | Iconicness |
| --- | --- | --- | --- |
| 77 | Vulcanology | 12 | 924 |
| 83 | Alan flake unknown | 6 | 498 |
| 83 | Jugtastic | 6 | 498 |

rouletout commented 2 years ago

+1 : Input from another user: https://www.thecrag.com/discussion/6127612239/website-suggestions