dendrograms / astrodendro

Generate a dendrogram from a dataset
https://dendrograms.readthedocs.io/
Other
37 stars 36 forks source link

Simple fix for incorrect "wraparound" catalog properties #121

Closed tomr-stargazer closed 9 years ago

tomr-stargazer commented 9 years ago

Here's a first crack at addressing #120, which noted that when a structure "wraps" around in coordinate space, its computed properties are broken.

This implementation (which is currently overly simple) checks to see if dendrogram.neighbours is anything other than the vanilla Dendrogram.neighbours. If it is, it assumes that you want to wrap in longitude space. Whenever it finds that you're computing stats on a structure that touches the "minimum" and "maximum" extents of the data, it assumes that it's a wraparound structure and shifts the bottom indices up by the length of the data. Later I'll want to make sure that the resultant computed properties (such as the x_cen, whatever) are within the bounds of the data (so e.g. I'd mod them by wrap_longitude)

For control, I've introduced an extra keyword "wrap_longitude" which percolates from ppv_catalog down to ScalarStatistic, but this wouldn't be the permanent name or form of such a variable.

Any thoughts on whether this implementation is heading in the right direction are welcome.

ChrisBeaumont commented 9 years ago

I don't think we should add the wrap_longitude kwarg for now, since it's overly specific. I'd rather special case the longitude wrapping without changing the call signatures, and see how far that gets us

tomr-stargazer commented 9 years ago

Sure thing; see my response to your inline comment, though.

tomr-stargazer commented 9 years ago

I moved the relevant logic to _make_catalog in this diff, I think it's much cleaner thanks to your suggestions. Big caveat: this is completely untested.

tomr-stargazer commented 9 years ago

I flipped the data-shifting logic, so that instead of shifting the rightmost edge to negative indices, the leftmost edge is shifted to the right, "above" the original index limits. For some reason this works better - the original solution was giving me nans in the output catalog but this one has sane outputs.

I'd be happy to try and write a test to check this behavior; otherwise, from my perspective I think this change is ready to merge, unless anyone double-checks it and sees something fishy.

tomr-stargazer commented 9 years ago

Here's extra evidence that this pull request fixes the problem -- when I recreate the plots from #120 with this new code, everything looks copacetic:

screengit

tomr-stargazer commented 9 years ago

@ChrisBeaumont any thoughts?

ChrisBeaumont commented 9 years ago

This approach looks good. We should definitely add a test for this, which would look something like:

Once that test is in place (and the above comments are addressed), this looks good to merge

tomr-stargazer commented 9 years ago

@ChrisBeaumont I wrote a test: https://github.com/dendrograms/astrodendro/pull/121/files#diff-92ef16435234722531b9cf09557e71efR404 which is passing for me. If travis passes this should be ready as far as I can tell.