dendrograms / astrodendro

Generate a dendrogram from a dataset
https://dendrograms.readthedocs.io/
Other
37 stars 38 forks source link

Include params in HDF5 output #142

Closed e-koch closed 9 years ago

e-koch commented 9 years ago

The parameters dictionary isn't currently included when saving. This just adds that ability to the import and export functions.

e-koch commented 9 years ago

It should be mentioned that the parameters are also saved via the FITS output, despite the title of the PR.

@astrofrog @ChrisBeaumont - I think this is ready to be reviewed.

ChrisBeaumont commented 9 years ago

Looks good to me!

e-koch commented 9 years ago

I had the wrong values being passed to the wrong keywords in the header.

test_benchmark doesn't seem to test for differences in the parameters?

ChrisBeaumont commented 9 years ago

Ah good point -- perhaps this PR should also update the implementation of Dendrogram.__eq__(https://github.com/dendrograms/astrodendro/blob/master/astrodendro/dendrogram.py#L459) to compare parameter equality?

e-koch commented 9 years ago

Two issues have cropped up now that I'm not quite sure how to address properly:

e-koch commented 9 years ago

@ChrisBeaumont @astrofrog -- I've asked (and googled) around a bit about handling infs in FITS headers and have not gotten a decisive answer. Do either of you have an idea for handling this? Instead of using -np.inf, could min_val default to the minimum in the data?

astrofrog commented 9 years ago

@e-koch - as far as I know there is no standard way of representing infinity or NaN if FITS headers (but @embray can disagree if this has changed).

ChrisBeaumont commented 9 years ago

Instead of using -np.inf, could min_val default to the minimum in the data?

That sounds like a reasonable workaround to me.

test_benchmark computes the dendrograms using a custom function, so min_delta and min_npix are never set

Would it be easy in this PR to modify test_benchmark to fix this?

Sorry for letting this get stale. The code changes here look good to me so, if you can add the above workarounds to get the test suite to pass, I'm fine with merging

e-koch commented 9 years ago

@ChrisBeaumont - I altered min_val to be the minimum minus 1. Setting to the minimum causes no trunk to be created for data that is all the same value, which is completely unrealistic but it makes sense to me that it be recognized as a single structure. This cropped up in test_statistic_dimensionality.

As for test_benchmark, I'm disagreeing with my earlier comment that Dendrogram.__eq__ should test for differences in the parameters. Since used-defined functions can be given, one doesn't have to specify the parameters explicitly to get the same tree structure, so perhaps equality should only test the tree structure? min_val should be compared though.

e-koch commented 9 years ago

@ChrisBeaumont - I altered Dendrogram.__eq__ to only compare min_delta and npix if they were set (!=0, as by default). min_value is always compared.

Do you have any thoughts on this approach? It should enable equality tests in the cases that a custom function is provided (like in test_benchmark).

ChrisBeaumont commented 9 years ago

@e-koch 2 minor comments, but I'm OK with merging once they are addressed.

e-koch commented 9 years ago

@ChrisBeaumont - Thanks! Should be ready to merge now.

ChrisBeaumont commented 9 years ago

Thanks!