moest-np / center-randomize

Script is to assign exam centers to students
MIT License

Great effort! Here are some edge cases you might need to consider in future. #2

Closed: sumitto closed this issue 6 months ago

sumitto commented 7 months ago

Hi, this is a great effort to open-source an algorithm for a task like this. I hope you continue to open-source other problems and solutions so that more people can take a look at them.

On a quick look, I didn't find any big issues. I am just going to mention some edge cases that I thought of while going through the code. You may need to account for these situations in future if you haven't already done so. Piloting this in Kathmandu itself shouldn't be problematic, but as you expand this in future these cases might occur more frequently. Cases 3 and 4 might be applicable in Kathmandu as well.

  1. The distance calculated from latitude and longitude doesn't always equal the distance students need to travel by road. Example: in remote areas where there is no accessible bridge and the center happens to be on the other side of a river, the center that is closest in km might not be easily accessible. Maybe this can be handled by setting up a prefs code?

  2. If no centers are within the absolute distance threshold, it always chooses the same center (the closest one). Example: school A doesn't have any center within the threshold (7 km) but has 5 centers just beyond it at slightly different distances, like 7.1 km, 7.2 km, etc. A different center doesn't get chosen each year. In this case there is no check for PREF_CUTOFF either, so even if the center was problematic last year, the same one will still be chosen.

  3. Not sure if it's possible for students to not be from any school (e.g. homeschooled). In that case they might need special center allocation.

  4. Center assignment for students with special needs might need to be handled separately.

sebakthapa commented 7 months ago

I have a simple solution idea for CASE-2.

Existing Logic

  1. Searches for centers within the PREF_DISTANCE_THRESHOLD
  2. If it doesn't find a center within the PREF_DISTANCE_THRESHOLD, it selects the closest center.
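
To make the consequence of the existing logic concrete, here is a minimal sketch (not the repository's actual code). `pick_center`, `THRESHOLD_KM`, and the center names are hypothetical; the 7 km value and the 7.1, 7.2, ... distances are the example from the issue text. Because the fallback takes the minimum distance, it is deterministic no matter how the random generator is seeded.

```python
import random

THRESHOLD_KM = 7.0  # example threshold from the issue text, not a confirmed setting


def pick_center(distances_km, threshold_km=THRESHOLD_KM, rng=random):
    """Pick a center for one school.

    distances_km maps a (hypothetical) center name to its distance
    from the school in km.
    """
    nearby = [c for c, d in distances_km.items() if d <= threshold_km]
    if nearby:
        # Randomized choice among the centers inside the threshold.
        return rng.choice(nearby)
    # Fallback: always the single closest center, so no randomness at all.
    return min(distances_km, key=distances_km.get)


# School A from the example: five centers, all just outside the threshold.
school_a = {"C1": 7.1, "C2": 7.2, "C3": 7.3, "C4": 7.4, "C5": 7.5}
picks = {pick_center(school_a, rng=random.Random(seed)) for seed in range(100)}
print(picks)  # {'C1'}
```

Every seed yields the same center, which is the "same center every year" behavior described in CASE-2.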

New Logic

  1. Searches for centers within the PREF_DISTANCE_THRESHOLD
  2. If it doesn't find a center within the PREF_DISTANCE_THRESHOLD, increase the threshold by a certain amount (x) and rerun until centers are allotted for all students.
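
The new logic could be sketched roughly as below. This is only an illustration under assumptions: `pick_with_widening`, `step_km` (the "x" above), and the sample distances are hypothetical names and values, and the real script allocates students per school rather than picking a single center.

```python
import random


def pick_with_widening(distances_km, threshold_km, step_km, rng=random):
    """Widen the search radius by step_km until at least one center qualifies.

    Returns (chosen center, radius that was finally used).
    Assumes all distances are finite, otherwise the loop would not end.
    """
    if not distances_km:
        raise ValueError("no centers to choose from")
    if step_km <= 0:
        raise ValueError("step_km must be positive")
    radius = threshold_km
    while True:
        candidates = [c for c, d in distances_km.items() if d <= radius]
        if candidates:
            # Several centers can now qualify, so the choice is randomized again.
            return rng.choice(candidates), radius
        radius += step_km


school_a = {"C1": 7.1, "C2": 7.2, "C3": 7.3, "C4": 7.4, "C5": 7.5}
picks = {
    pick_with_widening(school_a, 7.0, 7.0, random.Random(seed))[0]
    for seed in range(200)
}
print(sorted(picks))
```

With a step roughly equal to the threshold, the widened radius (14 km here) covers all five centers at once, so different runs can land on different centers instead of always the closest one.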

What does it solve?

  1. But has 5 centers near threshold but slightly different distance like 7.1, 7.2 km etc. This doesn't choose different center every year

    This is solved because x should be kept fairly large, probably about equal to the PREF_DISTANCE_THRESHOLD, so that the widened radius takes in several of these centers at once.

  2. In this case, there is no check for PREF_CUTOFF either. So even if it was problematic last year, the same center will still be chosen.

    This is solved because instead of taking the closest center we take centers_within_distance, which properly applies the PREF_CUTOFF filter. In some cases it might not be good to remove the closest center from the centers_for_school options, as the other centers may be far away and hence inconvenient for students. In that case, the underlying problem should be solved so that the center's PREF value can be increased.
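
A small sketch of the kind of filtering meant here, under stated assumptions: `eligible_centers` is a hypothetical helper, not the script's `centers_within_distance`, the `PREF_CUTOFF` value of -4 is illustrative only, and a missing preference is treated as neutral (0).

```python
PREF_CUTOFF = -4  # illustrative value, not confirmed by this thread


def eligible_centers(distances_km, prefs, radius_km, pref_cutoff=PREF_CUTOFF):
    """Centers within radius_km whose preference score is above the cutoff.

    distances_km: center name -> distance from the school in km.
    prefs: center name -> preference score for this school (missing = 0).
    The result is sorted from nearest to farthest.
    """
    return sorted(
        (
            c
            for c, d in distances_km.items()
            if d <= radius_km and prefs.get(c, 0) > pref_cutoff
        ),
        key=distances_km.get,
    )


distances = {"C1": 7.1, "C2": 7.2, "C3": 7.3}
prefs = {"C1": -5}  # hypothetical: C1 was flagged as problematic last year
print(eligible_centers(distances, prefs, radius_km=14.0))  # ['C2', 'C3']
```

The closest center (C1) drops out only because of its score; raising that score above the cutoff once the problem is fixed would bring it back into the candidate list.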

ArunShresthaa commented 6 months ago

I have created a pull request that might solve CASE 1.

It uses route distance instead of direct Haversine distance, taking the best possible route distance from a maps API.

Here is the PR #67
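
For readers who want the general shape of the idea without opening the PR, here is a hedged sketch. It is not the code from #67 and it calls no real maps service: `best_distance_km` and the `route_lookup` callable are hypothetical, and the stand-in functions at the bottom only simulate a lookup that fails or succeeds.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def best_distance_km(
    a: Point,
    b: Point,
    straight_line: Callable[[Point, Point], float],
    route_lookup: Optional[Callable[[Point, Point], Optional[float]]] = None,
) -> float:
    """Prefer a route distance; fall back to the straight-line distance.

    route_lookup is a hypothetical wrapper around whatever maps service is
    used. It may raise or return None when no route is available.
    """
    if route_lookup is not None:
        try:
            km = route_lookup(a, b)
            if km is not None:
                return km
        except Exception:
            pass  # treat any lookup failure as "no route distance available"
    return straight_line(a, b)


# Stand-ins for illustration only.
def flat_km(a: Point, b: Point) -> float:
    return 1.5


def failing_lookup(a: Point, b: Point) -> Optional[float]:
    raise TimeoutError("service unavailable")


school, center = (27.70, 85.30), (27.71, 85.31)
print(best_distance_km(school, center, flat_km, failing_lookup))      # 1.5
print(best_distance_km(school, center, flat_km, lambda a, b: 2.4))    # 2.4
```

Keeping the straight-line distance as a fallback means the allocation still runs when the external service is down or has no route for a pair of points.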

sumanashrestha commented 6 months ago

this is a useful discussion. want to add some context - there will always be some things that we cannot model, one-off cases that need to be handled in a special way. therefore there will be manual oversight by NEB staff on the list that is generated by our script. cases 3 and 4 will be handled during this step. a few interesting cases -

CASE 1: @ArunShresthaa's #67 is more precise but has external dependencies and cost implications; the cost-to-benefit ratio does not make sense at this point. Anyone interested can explore other options to see if there is a better alternative, but Haversine distance is a quick and good enough approximation.

Case 2: will be dealt with in #31
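
For reference, the Haversine distance mentioned here needs only the standard library, which is why it is quick and free of external dependencies. The sketch below is a generic implementation of the formula, not necessarily identical to the one in the script; `haversine_km` and `EARTH_RADIUS_KM` are names chosen for this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle (straight-line over the surface) distance in km.

    All arguments are in decimal degrees.
    """
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    # min() guards against rounding pushing sqrt(h) slightly above 1.
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(h)))


# One degree of longitude along the equator is about 111.19 km.
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 2))  # 111.19
```

The result ignores rivers, bridges, and roads, which is exactly the gap raised in case 1.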

Closing this ticket