Open SFashandi opened 1 year ago
@SFashandi Thank you for reporting the issue. Could you provide the point data used in your example?
Hi, I've attached to this comment the 2D point dataset I used when calculating the G, F, and J tests. Please consider this case. Thanks.
@ljwolf there does seem to be an issue with setting `support` for the J function. In the notebook example, setting `support` to 15 gives me the warning: `UserWarning: requested 15 bins to evaluate the J function, but it reaches infinity at d=24.7366, meaning only 10 bins will be used to characterize the J function. observed_support, observed_statistic = stat_function(`. Setting `support` to 20 will sometimes give me the error message `ValueError: operands could not be broadcast together with shapes (13,) (15,)`. Can you look into this when you have a chance? BTW, both ripley.py and distance_statistics.py are numpy-oriented implementations of distance-based statistics. Should we remove `ripley.py` (I ran into errors when using this module), update the notebook, and keep only distance_statistics.py?
Yes, only one needs to be kept, and you're right that `distance_statistics` is the correct one to keep.
On the shaping issue, I think it's related to undefined behaviour for some of the functions at the edges of their support. I will try to get this fixed ASAP!
Thanks @ljwolf!
@SFashandi you may want to replace `from pointpats import ripley` with `from pointpats import distance_statistics as ds` and access all the distance functions from `ds` instead of `ripley`. So instead of `ripley.j_test(points, support=20)`, you would use `ds.j_test(points, support=20)`. Also change the value passed to the `support` parameter to something smaller than 20, which should help avoid the error you are currently encountering. Our team will investigate the issue to provide a more satisfactory solution later on. Thank you for reporting this!
OK @weikang9009, I think it's sufficient for now to use `truncate=False` when running the statistical tests.
To be clear, the issue is that the `g()` and `f()` functions can hit their limiting/undefined values at different distances. Truncating `g()` separately from `f()`, then, can mean your `g()` statistic vector is too short (or too long) to compare to your `f()` statistic vector. Since `j()` is computed elementwise as a quotient built from `g()` and `f()` (in the usual definition, J = (1 - G) / (1 - F)), that's a problem.
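The mismatch can be reproduced with plain NumPy. This is only a sketch: the two arrays are made-up stand-ins for separately truncated G and F vectors (the lengths 13 and 15 are taken from the error message earlier in the thread), and the quotient uses the textbook J definition rather than the actual pointpats internals.

```python
import numpy as np

# Hypothetical stand-ins for statistic vectors that were truncated
# independently, so they ended up with different lengths.
g_values = np.linspace(0.0, 0.9, 13)  # G kept 13 bins
f_values = np.linspace(0.0, 0.9, 15)  # F kept 15 bins

message = None
try:
    # Elementwise quotient; requires both arrays to have the same shape.
    j_values = (1.0 - g_values) / (1.0 - f_values)
except ValueError as err:
    message = str(err)

print(message)
```

The elementwise division only works when both vectors are evaluated on the same bins, which is why truncating them independently breaks the J computation.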
So, we must use the full-length `g()` or `f()`, then truncate (if requested) after the fact.
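A minimal sketch of that ordering, assuming G and F have already been evaluated on the same support. The function name `j_then_truncate` and its arguments are hypothetical and are not part of pointpats; the J formula is the textbook (1 - G) / (1 - F) definition.

```python
import numpy as np

def j_then_truncate(support, g_values, f_values, truncate=True):
    """Compute J on the full-length vectors, then optionally truncate.

    Hypothetical helper for illustration only, not the pointpats code.
    """
    support = np.asarray(support, dtype=float)
    g_values = np.asarray(g_values, dtype=float)
    f_values = np.asarray(f_values, dtype=float)
    if not (support.shape == g_values.shape == f_values.shape):
        raise ValueError("support, G and F must share one shape")

    # Division by zero yields inf or nan here; silence the warnings
    # because those values are handled explicitly below.
    with np.errstate(divide="ignore", invalid="ignore"):
        j_values = (1.0 - g_values) / (1.0 - f_values)

    if truncate:
        finite = np.isfinite(j_values)
        if not finite.all():
            # Index of the first non-finite value; keep everything before it.
            first_bad = int(np.argmin(finite))
            support = support[:first_bad]
            j_values = j_values[:first_bad]
    return support, j_values

# Made-up values: F reaches 1 one bin before G does.
d = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
g = np.array([0.0, 0.5, 0.9, 0.95, 1.0])
f = np.array([0.0, 0.4, 0.8, 1.0, 1.0])

full_support, full_j = j_then_truncate(d, g, f, truncate=False)
cut_support, cut_j = j_then_truncate(d, g, f, truncate=True)
print(len(full_j), len(cut_j))
```

Because the support and the statistic are cut at the same index, the returned pair always has matching lengths, whether or not truncation is requested.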
We truncate because it's annoying for the user to get back arrays that can't quickly be plotted. If you ask for a support of 20, but the last two values are `np.inf` (which is a valid return value for the `j()` function), then your plots are all blank. Truncating keeps the visible range of the function clear.
Hi, running ripley.j_test(points, support=20) raises a ValueError like the one below, whereas the other tests (G, F, etc.) work well:
ValueError: operands could not be broadcast together with shapes (16,) (18,)
Thanks for your consideration.