Open ChrisBeaumont opened 11 years ago
One reason why I like being explicit is because we want users to be able to take the examples in the docs and apply it to their cases, which they can't really do if we are hard-coding specific examples with special methods. So I would personally prefer not to have this kind of convenience provided (to me, the data shouldn't be part of the package itself, it's an example file for the docs).
On the other hand, I think we could consider allowing a filename to be passed to compute, which would already simplify a number of examples.
Ok, you can make the final call here. I like being able to %paste
code snippets in IPython, to quickly run examples -- as stupid as it is, an extra step like a data download has prevented me from trying examples before. I like how scikit-learn handles this stuff: http://scikit-learn.org/stable/auto_examples/applications/plot_outlier_detection_housing.html
and
https://github.com/scikit-learn/scikit-learn/tree/master/sklearn/datasets/data
But it's just a preference, and I defer to you. Either way, I agree compute
should do something sensible if given a fits path (i.e. load and use the first HDU)
@ChrisBeaumont - I'll look into it a bit more. In this case, the reason why I think copying and pasting verbatim isn't particularly useful is because the user won't learn much more than what's already in the docs. IMHO they are most likely to copy and paste and modify, which means they want to use it on their own files. But I want to take some time to think about it and have a look over the examples you show.
One likely use case where they won't care about using their own data is to get a handle on the data structure, the API, the impact of pruning parameters, etc. People may want to do that before they dive in with their own data. Just something to think about.
After thinking about this a bit more, I agree that something like this would be handy, though I think we need to ensure we keep some of the initial examples with the full loading, and that we also make sure we provide links from time to time to say 'if you want to load your own dataset, see X'. I can try and work on this this afternoon based on what you have here.
@ChrisBeaumont - could you rebase this, then I can open a pull request to your branch with suggestions.
ok, done
Oops, it looks like this will need rebasing again. Once you've done that, I'll open a pull request to your branch with suggestions - but in principle I now like this idea. I just think that the first time we show the example we should show the full thing and add a little paragraph explaining where the load
functions are coming from.
@ChrisBeaumont - here are a couple of ideas:
load_perseus()
, simply store the path to the example data as a string (e.g. example_perseus
) and pass that to compute
- and also modify compute
so it can read from FITS files as discussed above.from astrodendro import Dendrogram, examples
Dendrogram(examples + 'PerA_Extn2MASS_F_Gal.fits')
Then it shows more explicitly that the argument is a path. What do you think?
I don't like examples
as a path string -- that isn't obvious from the import statement, and usually you import code, not data. Plus, the PerA_Extn...
name is cumbersome.
What do you think about a function example_file('perseus') -> path_to_perseus_file
? This, combined with the ability for compute
to auto-load a fits file from a path, addresses the original motivation for this issue (make using example data easier and more readable)
Ok, that sounds like a reasonable compromise - do you want to update the examples to show this?
Summary of a chat we had at CfA with @ChrisBeaumont and @keflavich:
One option - the first time, show the full thing, then explain that in the rest of the docs, we use a convenience to replace the from astropy.io import fits
and fits.getdata
lines, then use the convenience in the other examples to avoid always repeating the I/O code.
We decided to delay this until after 0.1 release.
This is what I had in mind for #60 -- instead of asking users to download files like the Perseus extinction map, why not bundle it with the code with an easy loader function?