csgillespie / poweRlaw

This package implements both the discrete and continuous maximum likelihood estimators for fitting the power-law distribution to data. Additionally, a goodness-of-fit based approach is used to estimate the lower cutoff for the scaling region.

Problem with ppldis? #24

Closed wrhaas closed 10 years ago

wrhaas commented 10 years ago

I would like to use the ppldis function in one-sample KS tests, but my test of the procedure suggests an error in the ppldis function. Here is the R code for my test:

# Synthetic dataset with known parameters.
x <- rpldis(1000, alpha = 2, xmin = 1)
# Compare x to the distribution function with the same parameter values.
# Should show no significant difference (p > 0.1).
ks.test(x, "ppldis", alpha = 2, xmin = 1)

p is consistently 0.00 (with a high D of 0.6), which should not be the case. I should get a non-significant p (above 0.1 in most runs), and I do get that result when I perform the test with the continuous power-law functions:

x <- rplcon(1000, alpha = 2, xmin = 0.1)
ks.test(x, "pplcon", alpha = 2, xmin = 0.1)

I am a bit confused how this could happen, and I suspect a problem with ppldis. Any thoughts?

(P.S. Thank you, Colin, for a great set of tools.)

wrhaas commented 10 years ago

I may have answered my own question...I now think the erroneous result follows from using a KS test on discrete data, which produces ties and violates an assumption of the test. If so, the error I report is not a problem with ppldis as originally suspected.
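
The ties explanation can be illustrated with a minimal base-R sketch that does not depend on poweRlaw at all. It is an illustration under stated assumptions, not the package's implementation: the truncation point `K`, the helper `p_trunc` (a stand-in for ppldis) and the use of `sample()` (a stand-in for rpldis) are all hypothetical. Because the sample and the CDF come from exactly the same truncated distribution, any rejection here is caused by the test, not by the sampler.

```r
# Base-R sketch: a one-sample KS test applied to discrete power-law data.
set.seed(1)

alpha <- 2
xmin  <- 1
K     <- 1e5                        # hypothetical truncation point of the support

support <- xmin:K
pmf     <- support^(-alpha)
pmf     <- pmf / sum(pmf)           # normalise to a proper probability mass function
cdf     <- cumsum(pmf)

# Stand-in for ppldis(): exact CDF of the truncated discrete power law.
p_trunc <- function(q) cdf[pmin(floor(q), K) - xmin + 1]

# Stand-in for rpldis(): draw directly from the same distribution.
x <- sample(support, size = 1000, replace = TRUE, prob = pmf)

# ks.test() warns that ties should not be present; the warning is suppressed
# here only to keep the output short.
res <- suppressWarnings(ks.test(x, p_trunc))

D  <- unname(res$statistic)
p1 <- pmf[1]                        # P(X = xmin), roughly 0.6 for alpha = 2

print(c(D = D, P_xmin = p1, p_value = res$p.value))
```

The KS statistic compares F(x) at each sorted observation with the empirical step just below it. With ties, the first observation equal to xmin is compared against an empirical value of 0, so D is at least P(X = xmin). For alpha = 2 and xmin = 1 that probability is roughly 0.6, which matches the D reported above even though the data and the CDF agree perfectly.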

csgillespie commented 10 years ago

Hi,

Thanks for the feedback.

I think there may be two issues:

  1. Discrete data in the KS test.
  2. A bug in the discrete RNG.

I'll investigate and try to update in the next few days.
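
Neither issue is diagnosed further in this thread. As a hedged sketch, one way to check a discrete sampler without the ties problem is to swap the KS test for a chi-square goodness-of-fit test on binned counts. This is a different technique from the one used above, and everything named here is an assumption for illustration: the truncation point `K`, the pooling threshold `cut_at`, and the use of `sample()` as a placeholder for the generator under test (in practice `x` would come from the RNG being checked).

```r
# Base-R sketch: chi-square goodness-of-fit check for a discrete power-law sampler.
set.seed(42)

alpha <- 2
xmin  <- 1
K     <- 1e5                        # hypothetical truncation point of the support

support <- xmin:K
pmf     <- support^(-alpha)
pmf     <- pmf / sum(pmf)           # reference probability mass function

# Placeholder draws; replace with output from the generator under test.
x <- sample(support, size = 5000, replace = TRUE, prob = pmf)

# Keep values 1..9 as separate bins and pool everything >= 10 into one bin,
# so that no expected count is too small for the chi-square approximation.
cut_at     <- 10
observed   <- tabulate(pmin(x, cut_at), nbins = cut_at)
expected_p <- c(pmf[1:(cut_at - 1)], sum(pmf[cut_at:K]))

fit <- chisq.test(observed, p = expected_p)
print(fit)
```

If the sampler matches the reference distribution, the p-value from `chisq.test()` should be non-significant in most runs. A consistently tiny p-value would point at the generator rather than at ties, since binned counts are not affected by them.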