vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

healpix_plot & healpix_count expecting Gaia conventions #41

Open mfouesneau opened 7 years ago

mfouesneau commented 7 years ago

The current API is based on Gaia defaults, which are not obvious to anyone outside Gaia: healpix_expression='source_id/34359738368', ... nest=True, ...

These are not the defaults in healpy. For instance, healpy defaults to nest=False.
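For readers outside Gaia, the constant in the default expression may need unpacking. 34359738368 is 2**35, and in the Gaia source_id encoding the integer quotient source_id / 2**35 is the nested healpix index at level 12 (NSIDE=4096). A minimal sketch of that arithmetic in plain Python follows; the function name gaia_healpix_index and the sample source_id are made up for illustration and are not part of vaex or healpy.

```python
# Divisor used in the Gaia-style expression 'source_id/34359738368'
GAIA_LEVEL12_DIVISOR = 2**35  # == 34359738368


def gaia_healpix_index(source_id: int, level: int = 12) -> int:
    """Nested healpix index encoded in a Gaia source_id (hypothetical helper).

    Level 12 corresponds to NSIDE=4096. Each coarser level merges four
    nested pixels into one, hence the extra factor 4**(12 - level).
    """
    if not 0 <= level <= 12:
        raise ValueError("level must be between 0 and 12")
    return source_id // (GAIA_LEVEL12_DIVISOR * 4 ** (12 - level))


# Made-up source_id: level-12 pixel 5 plus some lower-order bits
example_source_id = 5 * 2**35 + 123
print(gaia_healpix_index(example_source_id))            # level 12
print(gaia_healpix_index(example_source_id, level=11))  # one level coarser
```

Data that does not carry a Gaia source_id has no reason to follow this encoding, which is why the default expression is surprising for other catalogues.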

Maybe vaex could provide a virtual column that computes the healpix index from the RA and Dec columns, similar to the existing coordinate conversions?

Perhaps by properly adapting something like:

import healpy
import numpy as np


def add_columns_healpix_data(ds, coords=('ra', 'dec'),
                             name='source_id', nest=True,
                             lonlat=True, NSIDE=4096):
    """ Vaex defaults to Gaia conventions for healpix handling

    namely it defaults to having Healpix cell numbers as
    'source_id / 34359738368'

    It also uses nested pixels (not rings)
    """
    # evaluate the coordinate expressions into arrays
    # (healpy reads them as degrees when lonlat=True)
    ra, dec = coords
    _ra = ds.evaluate(ra)[:]
    _dec = ds.evaluate(dec)[:]
    # healpix cell number of each row
    hpix = healpy.ang2pix(NSIDE, _ra, _dec,
                          nest=nest, lonlat=lonlat)
    # convention from gaia: the factor 34359738368 (2**35) matches NSIDE=4096;
    # cast to int64 explicitly, since the product does not fit in 32 bits
    ds.add_column(name, 34359738368 * hpix.astype(np.int64))
    return ds

maartenbreddels commented 7 years ago

Hi Morgan,

The default healpix expression is different in master, since it was indeed really Gaia-specific. However, I'm not sure about the default for nested versus ring. I have a preference for nested, since it is easy to go to a coarser level by simply dividing. (I don't think you can do that for ring, right?)
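The "simply dividing" property of the nested scheme can be sketched without healpy: each nested pixel p at one resolution level splits into four children numbered 4p to 4p+3 at the next finer level, so integer division by 4 recovers the parent. The helper name coarsen_nested and the sample pixel list below are illustrative assumptions, not vaex API.

```python
from collections import Counter


def coarsen_nested(pixel: int, levels: int = 1) -> int:
    """Parent index of a nested healpix pixel, `levels` levels coarser.

    Hypothetical helper: in nested ordering, dropping one level is an
    integer division by 4 (each pixel has exactly four children).
    """
    if levels < 0:
        raise ValueError("levels must be non-negative")
    return pixel // 4**levels


# Made-up nested pixel indices, one entry per object, at some fine level
fine_pixels = [0, 1, 2, 3, 4, 7, 8, 15]

# Object counts per pixel after going one level coarser
coarse_counts = Counter(coarsen_nested(p) for p in fine_pixels)
print(sorted(coarse_counts.items()))
```

Ring ordering has no equivalent shortcut, because its pixels are numbered along iso-latitude rings rather than hierarchically; going coarser there means converting indices first (healpy offers ring2nest and nest2ring for this).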

There is already a method for this; I hope it suits your needs. Thanks anyway!

cheers,

Maarten

mfouesneau commented 7 years ago

Hi Maarten, I found my way with the methods. I suggest updating the documentation and making the healpix_plot expression either more standard or a required argument.

I also prefer the nested structure, but that was not obvious to me when I made some data.

Thanks

maartenbreddels commented 7 years ago

I'm actually just playing with healpix_plot, and indeed it is still set to the Gaia values, which does not make sense. I need to change that.