Closed Randy-Ram closed 5 years ago
Hey @HyperManTT. If I remember correctly, the items
matrix stores items discrimination, difficulty, guessing and upper asymptote (parameters a, b, c and d of the 4PL IRT model) in columns 0, 1, 2 and 3, respectively.
Column 4 is used to store items exposure rates. It should be created by catsim
at some point at the start of a simulation, with all values equal to 0. In this case it seems this column is not being created.
The line you mentioned in the PR and issue #14 is trying to filter out items with high exposure rates. For example, in order to avoid too many examinees from viewing the same item, one can limit the percentage of tests an item appears by setting the r_max
parameter.
In any case... If I remember correctly, column with index 3 should contains the items' upper asymptote parameter and column 4 (which is absent) would be the right one to mention in that line. We just have to find out where column 4 is...
I could be wrong, however, and index 3 is actually what we need, but I'll have to get back to you. I am a little busy right now, unfortunately.
No problem @douglasrizzo , I checked around and even printed out the results from generate_item_bank
and only saw up to column 3. Check it out if you get time, I'll play around with it in the meantime.
I have started implementing a possible fix. It will be up in the next few days.
Oh great, thanks Douglas. You'd know better than me but I had identified the MaxInfoSelector and the ClusterSelector as needing the 4th column and inserted it, the same way the simulator does it.
Yes, if you pass a fifth column with values between 0 and 1, representing the items exposure rates, the two selectors should behave as expected.
I've made the two selectors issue a warning, stating that the necessary column is missing. They treat the absent column as if items were never used before (exposure rates = 0). If the column is present, they'll take the exposure rates in consideration.
Hey Douglas,
I went back and checked on this today and noticed this piece of the code was throwing an error in Issue #14
[index for index in valid_indexes if items[index, 4] < self._r_max]
The code is referencing index 4 in a 4 column item_bank ndarray. When I changed the above to the below, it works as intended.
[index for index in valid_indexes if items[index, 3] < self._r_max]
Note, I just looked at the MaxInfoSelector class alone, I'm assuming this '4' index is used elsewhere in the file as well.