daffidwilde / matching

A package for solving matching games
https://daffidwilde.github.io/matching/
MIT License
152 stars 42 forks source link

Possible bug or my misunderstanding of the algorithm #46

Closed sergboec closed 5 years ago

sergboec commented 5 years ago

Hello I would like to describe my problem as a code

users:
{'user_0': ['a', 'b'], 'user_2': ['c'], 'user_3': ['a', 'd'], 'user_1': ['b', 'c']}

hospitals:
{'a': ['user_0', 'user_3'], 'b': ['user_0', 'user_1'], 'c': ['user_2', 'user_1'], 'd': ['user_3']}

capacities:
{'a': 2, 'b': 2, 'c': 2, 'd': 2}

HospitalResident.create_from_dictionaries(users, hospitals, capacities)

game.solve().values()

([[user_1], [user_2], [user_0, user_3], []])

I do not understand why: user_1 and user_2 are not attached to the same hospital? Is there any way to achieve such behaviour?

daffidwilde commented 5 years ago

Hi there, thanks for raising this issue. I’m away on holiday at the moment but I’ll be sure to have a look at this when I’m back at my machine next week!

daffidwilde commented 5 years ago

Hello again, @sergboec. I've taken the time to look over this issue and I don't believe there is an issue with the implementation in this example.

When I run the same example with v1.1, I get the following:

>>> from matching.games import HospitalResident

>>> resident_dict = {'user_0': ['a', 'b'], 'user_2': ['c'], 'user_3': ['a', 'd'], 'user_1': ['b', 'c']}
>>> hospital_dict = {'a': ['user_0', 'user_3'], 'b': ['user_0', 'user_1'], 'c': ['user_2', 'user_1'], 'd': ['user_3']}
>>> capacities = {h: 2 for h in hospital_dict}

>>> game = HospitalResident.create_from_dictionaries(resident_dict, hospital_dict, capacities)
>>> solution = game.solve()

>>> print(solution)
{a: [user_0, user_3], b: [user_1], c: [user_2], d: []}

I believe this is the same solution you see. In this case, I would expect users 1 and 2 to end up in different hospitals. Here, users 0 and 3 end up in hospital A because there is enough space to accommodate them, and they both get their first choice. Likewise, users 1 and 2 get their first choices which are different.

The run of the algorithm is as follows:

  1. Without loss of generality, user 0 is chosen first. Their favourite hospital is A. Now 0 and A are matched. Since A is not over-subscribed or subsequently full, the iteration is over. Current matching: (0, A).
  2. WLOG, user 2 is chosen next. Their favourite hospital is C. Now 2 and C are matched. Since C is not over-subscribed or subsequently full, the iteration is over. Current matching: (0, A), (2, C).
  3. WLOG, user 3 is chosen next. Their favourite hospital is A. Now 3 and A are matched. C is not over-subscribed but they are now at capacity. Their least favourite match is user 3 with no successors. Hence, the iteration is over. Current matching: (0, A), (2, C), (3, A).
  4. User 4 is chosen next as the only remaining unassigned resident with a non-empty preference list. Their favourite hospital is B. Now 4 and B are matched. Since B is not over-subscribed or subsequently full, the iteration is over. Final matching: (0, A), (2, C), (3, A), (4, B).

As far as your second question goes, the answer is unfortunately not. The purpose of the HR algorithm is to achieve a resident-optimal matching that is stable rather than to minimise the number of hospitals used whilst keeping residents happy. It is possible to run the algorithm to be hospital-optimal but in this case, the solution is the same. I can see why for some applications having that kind of behaviour would be useful but it is beyond the scope of this library (for now at least). Is there any particular reason you would like to see this behaviour?

I hope this is helpful and please let me know if you have any questions about my answer. Thank you again for raising this issue.

Cheers,

Henry

sergboec commented 5 years ago

Hello @daffidwilde

Thank you for your patient and clear reply.

Is there any particular reason you would like to see this behaviour?

Yep. The actual problem i'm trying to solve states that:

"We have an group of users and each user have their array of interest." I'm trying to find as many matches by interests as it's possible for all users and create pairs from them I've read a little bit more about library and now i'm not thinking that it can helps

Thanks a lot,

Sergey