DurhamARC / classroom-abm

Agent-based modelling for a classroom

Make teachers within the same school more similar #146

Open alisonrclarke opened 2 years ago

alisonrclarke commented 2 years ago

E.g. if there are 2 classes within the same school, generate control/quality values for both and move them towards the mean/best value throughout the year.

MarkLTurner commented 2 years ago

We agreed to converge teachers iteratively once per month. Crucially, we must ensure that the convergence does not lead to teachers becoming identical. That is, the convergence must stop if teachers reach a given similarity threshold.
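
A minimal sketch of what such a monthly step with a stopping threshold could look like; the attribute names, threshold value, and `convergence_factor` here are illustrative assumptions, not the model's actual API:

```python
# A sketch only: assumes hypothetical teacher objects with `control`/
# `quality` attributes and an assumed threshold value; none of these
# names come from the repository.

SIMILARITY_THRESHOLD = 0.05  # assumed: stop converging below this spread


def teachers_too_similar(teachers, attr):
    """True once the spread of `attr` across a school's teachers falls
    below the similarity threshold, i.e. they are nearly identical."""
    values = [getattr(t, attr) for t in teachers]
    return max(values) - min(values) < SIMILARITY_THRESHOLD


def monthly_convergence_step(teachers, attr, convergence_factor=0.1):
    """Move each teacher's `attr` towards the school mean, once per
    month, but only while teachers remain distinguishable."""
    if teachers_too_similar(teachers, attr):
        return  # stop: further convergence would make teachers identical
    mean = sum(getattr(t, attr) for t in teachers) / len(teachers)
    for t in teachers:
        old = getattr(t, attr)
        setattr(t, attr, old + convergence_factor * (mean - old))
```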

parnumeric commented 2 years ago

Shouldn't the convergence occur towards the most dominant teacher (and how would we define dominance?)? For now, let's try iterating both towards the mean value (apparently already implemented in an initial form) and towards the best one (see the formula below). The best value for a teacher variable can be defined as the one at which the biggest increase in maths score is observed.

```
next_value = mean + convergence_factor * (old_value - best_value[school_id])
```

where next_value is one of {teacher_control, teacher_quality}
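
As a sketch, the proposed rule could be applied per teacher as below; the `best_value` dictionary, the example values, and the default `convergence_factor` are illustrative assumptions rather than the repository's actual data structures:

```python
# A sketch of the proposed update rule exactly as written above; the
# per-school `best_value` dict is an assumption for illustration.

def converge_teacher_variable(old_value, mean, best_value, school_id,
                              convergence_factor=0.1):
    """Update one teacher variable (teacher_control or teacher_quality)
    using the proposed formula."""
    return mean + convergence_factor * (old_value - best_value[school_id])


# Example: best_value holds, per school, the variable value at which the
# biggest increase in maths score was observed (values made up here).
best_value = {"school_1": 4.2}
next_control = converge_teacher_variable(
    old_value=3.0, mean=3.5, best_value=best_value, school_id="school_1"
)
```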

parnumeric commented 2 years ago

With many schools, we need to estimate MSE for each school separately, which, I believe (from what I saw last week), requires modifying the R code included in the Python code (the "_multilevelanalysis" subfolder). As MSE is currently calculated over the full simulated data (all schools at once), the easiest implementation appears to be passing simulated data for individual schools, which was the initial suggested idea.
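
For illustration, a pure-Python sketch of computing MSE per school rather than over the full dataset; the column names and pandas layout are assumptions, since the actual pipeline computes MSE via the R code in "_multilevelanalysis":

```python
# A sketch only: assumes simulated and observed results are DataFrames
# sharing `school_id`/`pupil_id` keys and a `mean_maths_score` column.
import pandas as pd


def mse_per_school(simulated: pd.DataFrame, observed: pd.DataFrame) -> pd.Series:
    """Compute MSE separately for each school instead of a single MSE
    over the full simulated dataset."""
    merged = simulated.merge(observed, on=["school_id", "pupil_id"],
                             suffixes=("_sim", "_obs"))
    sq_err = (merged["mean_maths_score_sim"]
              - merged["mean_maths_score_obs"]) ** 2
    return sq_err.groupby(merged["school_id"]).mean()
```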

parnumeric commented 1 year ago

The first version has been implemented, but a different approach was used for generating individual simulated data for every school (very likely better than the one initially suggested).

To briefly explain some technicalities of the simulation implementation:

The much longer simulation times observed appear to be due to the volume of pupil data now supplied with the schools, compared with the datasets used previously. A second reason for the longer simulations is the many debugging outputs currently produced; I plan to remove them, which should help speed things up.

Another aspect of the simulation is that parameterisation testing itself might be much slower with many schools, because it tests against every school individually. My suspicion is that the overall MSE might not improve as quickly as before in this case, so more iterations will be needed to achieve the same quality of results.