hack4impact-uiuc / h4i-recruitment

H4I recruitment platform
https://h4i-recruitment.vercel.app
16 stars 3 forks source link

How do we eliminate algorithmic bias in matches #14

Open tko22 opened 6 years ago

tko22 commented 6 years ago

some candidates may have more matches or may have matched against "not as great" candidates more. How do eliminate this? Round robin could be it, but that's a lot of matches and is not practical.

shreyshrey1 commented 6 years ago

Could you just use a bubble sort like algorithm where you switch two candidates rankings if one is better than the other? And if you keep doing this all the good candidates will bubble to the top?

davidachang commented 6 years ago

My initial thought is that I don't like that. Seems like there are two ways to generally do bubble sort - either legit bubble sort and continuously compare candidates who are "next" to each other - that takes a fuck ton of matches, how do you know when everything is finally "sorted", and if you stop half way (don't get enough sorts) you probably are left with a shittily sorted list. Other way is to randomly select two candidates and swap the two rankings, which is also kind of scary bc then you could theoretically have first swap with last. Those are my concerns, maybe there'ss a way to get around - elo system???

Imo random candidate selection with a shitty ranking system (+1 for win, -1 loss) is probably the easiest, and completely random. Could do a more legitimate elo ranking system where candidates get matched with similarly ranked candidates only (within plus or minus 5 of their current ranking or something).

On Mon, Jul 9, 2018 at 8:08 AM Shreyas Mohan notifications@github.com wrote:

Could you just use a bubble sort like algorithm where you switch two candidates rankings if one is better than the other? And if you keep doing this all the good candidates will bubble to the top?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/hack4impact-uiuc/h4i-recruitment/issues/14#issuecomment-403512211, or mute the thread https://github.com/notifications/unsubscribe-auth/ASj8qjPKTk2qiV_5E_W2hovseT93o2SOks5uE3HXgaJpZM4UqXKg .

shreyshrey1 commented 6 years ago

I think we could definitely do a simple elo rating. https://www.geeksforgeeks.org/elo-rating-algorithm/

angadgarg25 commented 6 years ago

Yeah I also agree that any sort of comparison sorting would not work since with people using facemash total order is very unlikely (a > b and b > c implying a > c). If we knew this property would always be satisfied than a quicksort/mergesort approach would work and be performant.

I think elo would work as a ranking but we have to consider matchmaking after candidates already have scores. Elo rating is often paired with the matchmaking aspect, those with similar elos are paired off against each other. This would favor people who did well early on regardless of their true ranking. The matchmaking aspect of elo is ideal for a large population and/or a large number of matches per person, neither of which we have. However elo scoring with random matchmaking is a good option, but wouldn't have amazing accuracy.

The other option is modeling some sort of sports bracket. This problem most closely resembles that of a sports bracket, however I would rather avoid this for 2 reasons. First, this is extremely unforgiving, and is only somewhat accurate for the winner of the bracket (not even necessarily completely accurate). Second, this completely eliminates any randomness after the beginning (once the bracket is decided we need the exact data from each matchup). This poses an issue because the client will always get a match after submitting one, whether the user plans to submit another match or not. This would break unless we significantly changed the structure of the facemash page.

TLDR: so far Elo scoring with random matchmaking is seeming like the best option

davidachang commented 6 years ago

agree with @angadgarg25's final conclusion

rgychiu commented 6 years ago

Another thing that we'll have to figure out with the elo ranking system is the scoring constant. That way a pick with two similar candidates after a large amount of random matchmaking still gives the winning candidate a chance to rank higher than the 'better' candidate that lost the match. In very early matchmaking this might also be important in cases of a small candidate pool combined with a large disparity in elo.

tko22 commented 6 years ago

one thing, random matchmaking requires a lot of matches to give each candidate an equal amount, which we wouldn’t have. How can we ensure that candidates are being matched equally, as in the amount of matches they get put in?

I think as long as we are somewhat random in our matchmaking and we aren’t too skewed in in the # of matches per candidate, it should be okay. Although, we should take this into consideration.

davidachang commented 6 years ago

No science here - could potentially do +1 for candidates who beat an equally/lower ranked and -1 for opponent, +2 for candidates who beat a higher ranked and -2 for opponent LOL

I really don't think it is necessary though. I think what we do instead is make this super simple to do quickly, and ensure that we can reasonably do a comparison a minute(?) without sacrificing quality of comparisons. Then we say 10 people do 100 each and you got 1000 comparisons total which means hopefully 100 candidates got compared ~20 times each. Oh god is that enough.

Us doing a fuck ton of comparisons is necessary regardless of our end method tho. We will sit down in a room for a couple hours and just facemash I guess. I'll bring some snacks. Make a sweet user experience pls.

You should aim to slap something together real quick that looks not pretty but is functional and see if we can use that easily at least two weeks before applications roll in (roughly start of school).

tko22 commented 6 years ago

It's functional, besides actually ranking candidates and creating a better match maker. And it lays out the 2 candidates' information side by side. It's pretty simple. What makes this much more successful is a balanced and fair matchmaking and ranking.

angadgarg25 commented 6 years ago

Agreed that we need to ensure candidates have a pretty much equal amount of matches at the end. Maybe lets do this:

  1. Randomly select a candidate from the subset of candidates with the lowest number of matches so far.
  2. Figure out subset of candidates who have not matched with candidate 1.
  3. Randomly select candidate 2 from the subset of the set from step 2 with the lowest number of matches so far. This is still pretty random but ensures that matches per person will remain relatively close (I think this would result in at most 2 apart at any given time). It also ensures that we don't get duplicate matches.

We will want error handling at step 2 (if the candidate has matched with all others) even though in our case this wouldn't be realistic. For this same reason we could make allowing duplicate matches as a feature we can switch on and off, but leave it off always for our app (if everyone matches with everyone else and there are no duplicate matches, the endpoint would fail eventually). While we would most likely not use this, it would make this code more adaptable. Just a thought though, we can leave this out since we would never actually need it.