daffidwilde / matching

A package for solving matching games
https://daffidwilde.github.io/matching/
MIT License
150 stars 42 forks source link

Error in HospitalResident when capacity is zero #112

Closed raos-projects closed 3 years ago

raos-projects commented 4 years ago

Hello! Thank you for creating the matching package. It's very easy to use!

I'm implementing a HospitalResident matching game in which a hospital may have a capacity of zero. My expectation is that if the hospital's capacity is zero, the algorithm will simply not assign any residents to the hospital. But when I try the following code, which is a modified version of the example code in the docs, it raises an IndexError:

from matching.games import HospitalResident
import numpy as np

np.random.seed(0)
num_residents, num_hospitals = 400, 20

resident_prefs = {
    r: np.argsort(np.random.random(size=num_hospitals))
    for r in range(num_residents)
}
hospital_prefs = {
    h: np.argsort(np.random.random(size=num_residents))
    for h in range(num_hospitals)
}
#generate first game with capacities ranging from 1 to 9
capacities = {h: np.random.randint(1,10) for h in hospital_prefs}
game = HospitalResident.create_from_dictionaries(
    resident_prefs, hospital_prefs, capacities
)
_game1 = game.solve() #runs without errors

#generate second game with capacities ranging from 0 to 9
capacities = {h: np.random.randint(0,10) for h in hospital_prefs}
game = HospitalResident.create_from_dictionaries(
    resident_prefs, hospital_prefs, capacities
)
_game2 = game.solve() #error happens here

The error traceback is:

Traceback (most recent call last):
  File "C:/Users/.../matching_test.py", line 27, in <module>
    _game2 = game.solve()
  File "C:\...\Anaconda3\lib\site-packages\matching\games\hospital_resident.py", line 73, in solve
    hospital_resident(self.residents, self.hospitals, optimal)
  File "C:\...\Anaconda3\lib\site-packages\matching\games\hospital_resident.py", line 272, in hospital_resident
    return resident_optimal(residents, hospitals)
  File "C:\...\Anaconda3\lib\site-packages\matching\games\hospital_resident.py", line 315, in resident_optimal
    successors = hospital.get_successors()
  File "C:\...\Anaconda3\lib\site-packages\matching\players\hospital.py", line 68, in get_successors
    idx = self.prefs.index(self.get_worst_match())
  File "C:\...\Anaconda3\lib\site-packages\matching\players\hospital.py", line 63, in get_worst_match
    return self.matching[-1]
IndexError: list index out of range

My guess is that this error occurs because the matching list for a particular hospital has a capacity of zero (i.e. looks like []), and therefore any attempt access an element via index will throw an error.

I've seen the solution to Issue #68 and have considered cleaning resident ranklists to remove hospitals with zero capacity. While that would work instrumentally, one of my goals is to see how far down the ranklist a resident may fall. For example, if they ranked a program that had zero capacity as their first choice, but then matched to their second choice because their first choice wasn't accepting residents, I want to record that as matching to their second choice and not their first (which would happen if I removed the hospital with zero capacity from their ranklist).

I'm thinking of a workaround which involves maintaining a resident's 'true' ranklist while submitting a 'cleaned' ranklist to the algorithm, then running the match algorithm, and then finding the position of the matched hospital in the resident's true ranklist to determine which rank was matched. But this seems less organic than a behavior which essentially ignores hospitals with zero capacity. Thoughts?

daffidwilde commented 4 years ago

Hi @raos-projects, thanks for opening this issue.

This kind of behaviour should have been caught by HospitalResident._check_inputs() but I'll open a PR fixing that now and make a new minor release.

I'm thinking of a workaround which involves maintaining a resident's 'true' ranklist while submitting a 'cleaned' ranklist to the algorithm, then running the match algorithm, and then finding the position of the matched hospital in the resident's true ranklist to determine which rank was matched.

This is certainly what I would recommend doing right now if you wanted to recover the "true" rank of a resident's match in a scenario with zero-capacity hospitals. I appreciate that is a bit janky but the way matching is implemented (subject to that PR) is to solve valid game instances, i.e. instances that meet the game requirements.

But this seems less organic than a behavior which essentially ignores hospitals with zero capacity.

This has given me something to think about in terms of the user interface for the library as it matures. Maybe this kind of cleaning could be done in-house, tying in with #89, and give warnings back rather than raising errors.

raos-projects commented 4 years ago

Thank you for the prompt response, this makes a lot of sense! Glad to know that this functionality is expected given the game requirements.

I've implemented the workaround and everything is just fine.