Closed jeremyevans closed 9 years ago
It is a design decision. You could calculate alpha using all variables with variance > 0, but the user should be warned first. Because Statsample is an in-program library, not a REPL one, I decided to be strict about this. Maybe a more general function should be created that is resilient to problems in the input.
I guess I don't understand why having a vector with variance = 0 indicates any problem with the input. It seems to me to be a normal situation that the library can and should handle. Now if all vectors have variance 0, I can see returning nil. Maybe change any? to all?
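To make the difference concrete, here is a minimal sketch of the two guards. The variances are made-up numbers, and the variable names are hypothetical, not taken from Statsample's source.

```ruby
# Made-up per-item variances; the first item is constant across takers.
variances = [0.0, 0.33, 0.25]

# Guard as described in this issue: bail out if ANY item has zero variance.
bail_on_any = variances.any? { |v| v.zero? }

# Suggested guard: bail out only if ALL items have zero variance.
bail_on_all = variances.all? { |v| v.zero? }

puts "any? guard triggers: #{bail_on_any}"
puts "all? guard triggers: #{bail_on_all}"
```

With these numbers the any? guard triggers and the all? guard does not, so only the first version would return nil.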
From a psychometric perspective, one or more items with 0 variance is very serious, because it implies a bad selection of items. The meaning of the index (lower bound of correlation for equal-size tau-equivalent measurements) no longer applies directly, because the library omits one or more variables. So I should give a warning about it. Anyway, as R does, I can provide an option to relax the requirements, like na.rm on the mean function.
I'd say that what you said is true for large datasets. In my case, I was calculating alpha from a small dataset (16 takers), and there were multiple questions that everyone got right. Since alpha can be calculated correctly even if some vectors have variance = 0, I don't see the reason to purposely refuse to calculate it. It should be up to the user to determine the meaning of the result; the library's responsibility is just to perform the calculation.
At the very least, if you are going to refuse to calculate alpha because of artificial restrictions, please raise an error with a descriptive message indicating why. Returning nil is bad as it doesn't indicate why the calculation was not done. When I first used the library and got nil, I thought I was doing something wrong, and it caused quite a bit of extra debugging time.
OK, you convinced me. I will add an option to raise an error (strict mode), but we should delete any vector with variance = 0.
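A rough sketch of how that behaviour could look, assuming items arrive as a hash of name to score array. The helper `usable_items` and its `strict:` keyword are hypothetical names for illustration, not Statsample's actual API.

```ruby
# Sample variance (n - 1 denominator) of an array of numbers.
def sample_variance(values)
  mean = values.sum.to_f / values.size
  values.sum { |v| (v - mean)**2 } / (values.size - 1)
end

# Hypothetical helper, not part of Statsample: returns the items that can
# be used for alpha. In strict mode it raises with a descriptive message
# instead of silently dropping zero-variance items.
def usable_items(items, strict: false)
  constant, varying = items.partition { |_name, scores| sample_variance(scores).zero? }
  if strict && !constant.empty?
    names = constant.map(&:first).join(", ")
    raise ArgumentError, "items with zero variance: #{names}"
  end
  varying.to_h
end

items = { "q1" => [1, 1, 1, 1], "q2" => [1, 0, 1, 0], "q3" => [1, 0, 1, 1] }
kept = usable_items(items)
puts kept.keys.inspect
```

Calling `usable_items(items, strict: true)` on the same data raises an ArgumentError that names q1, so the caller learns why the calculation was refused.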
Hi @justin808, thanks for the pull request! We're currently in the process of centralizing SciRuby's gems in the organization repositories. Can you reopen your PR on sciruby/statsample?
Thanks! I'll take a look at your PR as soon as I finish moving the other gems' issues there. :)
Hi @agarie How do I reopen my PR? Can you please give me a link?
Hey @justin808, I found a page in the documentation showing how to change which repository you send the PR to: https://help.github.com/articles/using-pull-requests/#changing-the-branch-range-and-destination-repository.
So, if you close these two, you probably can create new PRs pointed to SciRuby/statsample.
This was fixed in the new upstream, so it can be closed now.
There is no reason to return nil if a single vector has a 0 variance. For example, let's say you are giving a test, and every single taker gets the easiest question correct. The variance for that question vector is 0, but Cronbach's alpha can still be calculated correctly for the entire dataset.
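As an illustration, here is a small self-contained calculation using the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The data is made up and the helper names are my own, not Statsample's.

```ruby
# Sample variance (n - 1 denominator) of an array of numbers.
def sample_variance(values)
  mean = values.sum.to_f / values.size
  values.sum { |v| (v - mean)**2 } / (values.size - 1)
end

# Cronbach's alpha for items given as an array of per-item score arrays
# (one inner array per question, one entry per test taker).
def cronbach_alpha(items)
  k = items.size
  totals = items.transpose.map(&:sum) # total score per taker
  item_variance_sum = items.sum { |item| sample_variance(item) }
  (k / (k - 1.0)) * (1 - item_variance_sum / sample_variance(totals))
end

items = [
  [1, 1, 1, 1], # everyone got this question right: variance 0
  [1, 0, 1, 0],
  [1, 0, 1, 1],
  [1, 0, 0, 0]
]

alpha = cronbach_alpha(items)
puts alpha.round(3)
```

The constant item contributes 0 to the sum of item variances and leaves the variance of the totals unchanged, so the formula still yields a finite value (2/3 for this data).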