habib-university / Class-Teams

2 stars 5 forks source link

Team Selector

I need to make student teams of 2 in my classes for which I ask students to specify greens and reds. Greens are people that the student wants to work with and reds are people that they do not want to work with. This program matches students according to these preferences.

How to Use

Enter your data in the file, preferences.csv. A sample is provided. The details of the required format are given below.

The first row contains titles. The first title indicates a student and may be left blank. The remaining titles are any number of green and red in any order.

Each remaining row contains the preferences of an individual student, the source for that row. The name of the source appears in the first column. Each cell in a column titled green contains the name of a different student that source wants to work with. Each cell in a column titled red contains the name of a different student that source does not want to work with.

Different students may have different numbers or greens and reds. Leave cells blank for missing data. The program accounts for all students indicated as a source, a green, or a red. Students who have not been indicated as either of these but still need to be matched should be added in the file as sources with no data for greens or reds.

Make sure to use a unique name for each student and to use the name consistently. Take care to not mistype it as the program currently cannot account for typos.

The set of students indicated as sources, greens, and reds must have an even size.

The Computed Matching

The program matches each student according to their preference. The set of all matches is called a matching. The program computes many matchings and outputs the one with the highest goodness score which is between 0 (worst) and 1 (best). The matching score is a measure of the number of students in the matching who are matched with a green. If no matching can be found, the output score is negative.

Because the programs makes matches of 2, the total number of specified students must be even.

As it is usually impossible to meet all preferences, a score of 0.6 or above is good enough.

The Visualization

The program outputs a graph visualizing the preferences.

Requirements

To run the program,

How it Works

The problem is modeled as a directed graph with every student as a node, green edges from a student to their greens and red edges from a student to their reds. The program tries to partition the vertices into pairs or matches such that no match contains a red and as many greens are included in the matches as possible.

The program builds a matching by iteratively picking a random node as the source and building a match with one of its greens such that source is not a red of the green. If a match cannot be found among the greens, nodes not connected to source are searched for matches. If still no match is found, the matching fails. Once a match is found, source and its match are added to the matching and removed from the graph.

As random choices are involved, the above process is repeated several times to maximize the likelihood of a high scoring pairing. The highest scoring pairings (there may be multiple) are saved.

Visualization

The program saves and launches a visualization of the graph. The generated file name is preferences.pdf.

Sample Output

Provided that the requriemetns are met, the given program shoud run with the given preferences.csv file as-is. The produced output will be of the form

>>> main()
Read preferences of 10 students with up to 3 greens and 1 reds.
progress: [########################################]
(P, T), (Q, X), (R, V), (S, W), (U, Z)
0.875

Limitations

Extensions

This is probably a variant of the vertex cover problem. If so, a more efficient and theoretically sound solution may exist.

Bugs

The program fails or crashes? You have found an error? You want to know more? This documentation sucks?

For the above or any other issues, please report