dendrograms / astrodendro

Generate a dendrogram from a dataset
https://dendrograms.readthedocs.io/
Other
37 stars 38 forks source link

Example custom merge test function #10

Closed astrofrog closed 11 years ago

astrofrog commented 11 years ago

This is an example implementation of what I was suggesting in #7.

The default:

d = Dendrogram.compute(image)

is equivalent to

d = Dendrogram.compute(image, merge_test_function=lambda n, **k: False)

and another extreme example is:

d = Dendrogram.compute(image, merge_test_function=lambda n, **k: True)

which will create a tree with no structures (i.e. 'always merge'). If we decide to go ahead with this, I will add tests.

astrofrog commented 11 years ago

I guess this should also be used in the following line @bradenmacdonald was suggesting:

https://github.com/dendrograms/dendro-core/blob/master/astrodendro/dendrogram.py#L214

so this requires some more thought - but is the basic idea ok?

astrofrog commented 11 years ago

I updated to also apply the custom criteria for orphan leaves, and updated the description for this PR (using a function that always returns True results in a tree with no structures).

bradenmacdonald commented 11 years ago

@astrofrog, code-wise, this approach looks like a good start to me. I ran the benchmark with these changes and found it had no noticeable effect on computation speed, which is important.

Two comments:

  1. The name merge_test_function does not really apply to the second place it is used, which is where orphan leaves are pruned. Perhaps I was wrong to suggest using it there, or perhaps it would be better named something like check_independence() (has the opposite meaning of merge_test_function, i.e. check_independence(...) == not merge_test_function(...)).
  2. We'll want to include unit tests for these, of course.
ChrisBeaumont commented 11 years ago

What is the status of #10, #8, and #7? It sounds like they are all fundamentally the same issue -- this one has code attached. Is this ready for review?

astrofrog commented 11 years ago

This still needs tests to be considered ready to merge, but it is ready for testing. Feel free to try it out to see if the API suits what we need to do, and also whether it works!

astrofrog commented 11 years ago

I've now added a simple test for this and it seems (at first glance) to behave as expected. I've also added tests to ensure that the benchmark cases are reproduced if we use is_independent to check for min_delta and min_npix instead of using the actual keyword arguments.

@ChrisBeaumont - what else do you think we need to consider before we merge this? Should we add a benchmark test specifically for is_independent with another criterion (e.g. position-related, or other?).

astrofrog commented 11 years ago

I should say, this is pretty much ready for review - it's just a question of whether we want to add more/different tests.

ChrisBeaumont commented 11 years ago

I think this looks good, and should be general enough to satisfy most custom pruning schemes. I ran a quick sanity check trying to implement one of the builtin pruning options in the c++ code. The dendrogram I got out was slightly different, but this could well be due to some other part of the code unrelated to is_independent.

astrofrog commented 11 years ago

Thanks for checking this. I guess I should add a section about this in the documentation (which is growing fast!).

ChrisBeaumont commented 11 years ago

Ah, right.

The docs are really excellent, by the way. I'm learning a lot (both about sphinx, and about how to write good docs)

astrofrog commented 11 years ago

@ChrisBeaumont - I've now added a section to the docs:

https://dendrograms-astrofrog-v2.readthedocs.org/en/latest/advanced.html

I put this in the 'advanced' section since it's easier to show it if we make use of plotting, so I have to show it after plotting, etc. Also, it is an advanced topic :)

I'm going to be offline for a few days, so feel free to merge this if you think it's ready.