douglasrizzo / catsim

Computerized Adaptive Testing Simulator
GNU Lesser General Public License v3.0
124 stars 35 forks source link

Serious bug in HillClimbingEstimator? Causing only first iteration to matter #13

Closed machow closed 5 years ago

machow commented 5 years ago

Hey, I think that oftentimes the HillClimbingEstimator only does 1 round of estimation (rather than 10, iteratively more precise rounds). The issue is...

image

I need to patch within the next couple days on DataCamp/catsim, so can submit a PR to fix!

Solution

Code here:

https://github.com/douglasrizzo/catsim/blob/master/catsim/estimation.py#L138-L139

seems like the solution is that max_ll needs to be reset each round?

douglasrizzo commented 5 years ago

Hi there, thanks for the report.

There's definitely something fishy going on with the estimator. It seems like I was trying to implement some kind of early stopping: theta values are checked from left to right and, when the log-likelihood of these values went up, but then started going down, the next candidate theta values didn't need to be checked. However, the implementation seemed to be wrong.

Also, the new intervals I generated in subsequent rounds were also wrong, as you pointed out. All of this made the estimator not be as precise as it can be.

I decided to remove the early stopping idea and implemented something a little more straightforward. After each iteration, the new candidate theta values to be checked are generated around the current best theta, but in smaller intervals. I added some comments in the code and used more descriptive variable names, since it has been a long time since I last worked in this code.

I committed a possible fix to the testing branch. Take a look if it makes sense to you. The lines you were interested in are now here. An example verbose output from the hill climbing estimator can be seen below. Values with larger log-likelihood are printed and we can now see that subsequent passes do generate better thetas.

Pass: 1
    Bounds: -1.735914394995148 0.9443143368283956
    Interval size: 0.2978031924248381
        Theta: -1.735914394995148, LL update: inf
        Theta: -1.43811120257031, LL update: 0.026153563041037664
Pass: 2
    Bounds: -1.735914394995148 -1.140308010145472
    Interval size: 0.06617848720551955
Pass: 3
    Bounds: -1.5042896897758296 -1.3719327153647904
    Interval size: 0.014706330490115382
        Theta: -1.4307580373252522, LL update: 3.4473263991863234e-05
Pass: 4
    Bounds: -1.4454643678153676 -1.4160517068351368
    Interval size: 0.0032680734422478874
        Theta: -1.4291240006041284, LL update: 3.219923880415365e-06
        Theta: -1.4258559271618805, LL update: 1.5761288620907976e-06
Pass: 5
    Bounds: -1.4291240006041284 -1.4225878537196326
    Interval size: 0.0007262385427218021
        Theta: -1.4269452849759632, LL update: 1.9605989365345522e-07
Pass: 6
    Bounds: -1.427671523518685 -1.4262190464332414
    Interval size: 0.00016138634282714115
        Theta: -1.4268645918045497, LL update: 1.023564344393435e-08
        Theta: -1.4267032054617226, LL update: 8.590400391028652e-09
Pass: 7
    Bounds: -1.4268645918045497 -1.4265418191188954
    Interval size: 3.5863631739463386e-05
Pass: 8
    Bounds: -1.426739069093462 -1.4266673418299831
    Interval size: 7.969695942078303e-06
        Theta: -1.4266992206137514, LL update: 1.1692868895352149e-11
Pass: 9
    Bounds: -1.4267071903096935 -1.4266912509178094
    Interval size: 1.771043542708739e-06
        Theta: -1.4266983350919802, LL update: 1.2860823517257813e-12
machow commented 5 years ago

Thanks for your quick response! I think the way you did it previously, going...

seemed like a good, slightly more efficient strategy, given that the likelihood is concave (at least for 2PL model, off-the-cuff I think this is the case for 3 and 4PL? Could be very wrong about this?). Once the next candidate has a lower likelihood, then the likelihood has started going down, so the rest will also be lower (and don't need to be checked).

As long as the best theta is reset each round it should be okay (since it will pass over the previously best theta with increasing granularity).

douglasrizzo commented 5 years ago

I see what you mean. If I reset best_theta and max_ll in every search pass and if I am sure that the current best theta is included in all future search passes, I am guaranteed to at least find the same best_theta every time, but also have the chance to find a better one.

if I am sure that the current best theta is included in all future search passes

The thing is, I never assumed this condition. If you have observed it, it was there by coincidence. I also didn't want to reset best_theta and max_ll on every search pass because, if there was a chance that a better theta was found in a previous search pass, the function would return that value and not the best value found in the current pass.

I decided to just explicitly keep the value of the current log-likelihood (current_ll) and the value of the log-likelihood from the previous theta (previous_ll) and, if current_ll < previous_ll, it means the function is only going to go downhill, so we can stop the search and start another pass.

The implementation of early stopping is here.

I can think of ways to do it better, like search once to the left and once to the right of the current best theta, in order to know which side we actually have to search. This can cut the search space in half and it's not hard to implement.

douglasrizzo commented 5 years ago

I'll close this as it appears the issue has been fixed. If you find other anomalies, please report them and I'll try to fix them as quickly as possible. We can continue discussion on this bug here, nevertheless.

machow commented 5 years ago

Thanks for the fix!

That's fair that there could be some small improvements. I guess knowing whether the crnt best theta is to the left or right of the best theta out of the crnt round's values contains a lot of information for limiting the search range.

As is, is super helpful--thanks again for being so responsive!