astronomy-commons / hipscat

Hierarchical Progressive Survey Catalog
https://hipscat.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
14 stars 3 forks source link

Consider alternative healpix libraries #53

Open delucchi-cmu opened 1 year ago

delucchi-cmu commented 1 year ago

We're currently using healpy, which is a somewhat bloated library (also contains matplotlib, cfitsio, etc), and has installation issues on macOS, and is incompatible with Windows.

Alternatives suggested so far:

If any of these are a little behind healpy in terms of raw performance, but otherwise a good replacement, it may be worthwhile to contribute improvements to their code bases.

hombit commented 1 year ago

Performance comparison for healpy, hpgeom, cdshealpix and astropy-healpix on my laptop:

import healpy
import hpgeom
import cdshealpix
import astropy_healpix

import astropy
import numpy as np

n = 1_000_000

rng = np.random.default_rng(0)
ra = rng.uniform(0, 360, n)
dec = np.degrees(np.arccos(rng.uniform(-1, 1, n))) - 90

%timeit healpy.ang2pix(1 << 29, ra, dec, lonlat=True, nest=True)
%timeit hpgeom.angle_to_pixel(1 << 29, ra, dec, lonlat=True, degrees=True, nest=True)
%timeit cdshealpix.nested.lonlat_to_healpix(astropy.coordinates.Longitude(ra, 'deg'), astropy.coordinates.Latitude(dec, 'deg'), 29, num_threads=1)
%timeit cdshealpix.nested.lonlat_to_healpix(astropy.coordinates.Longitude(ra, 'deg'), astropy.coordinates.Latitude(dec, 'deg'), 29, num_threads=8)
%timeit astropy_healpix.lonlat_to_healpix(ra * astropy.units.deg, dec * astropy.units.deg, nside=1 << 29, order='nested')
delucchi-cmu commented 1 year ago

Keeping this for posterity. If we can get a Windows-compatible healpix library, we can enable a windows-file-system checking regression test workflow:

# This workflow will install Python dependencies and run tests in a Windows environment.
# This is intended to catch any file-system specific issues, and so runs less
# frequently than other test suites.

name: Windows unit test

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  build:

    runs-on: windows-latest
    strategy:
      matrix:
        python-version: ['3.10']

    steps:
    - uses: actions/checkout@v3
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install .
        pip install .[dev]
    - name: Run unit tests with pytest
      run: |
        python -m pytest tests
delucchi-cmu commented 1 year ago

Ugh. Another ding.

https://github.com/astronomy-commons/hipscat/pull/65

maxwest-uw commented 1 year ago

going through all the available healpix implementations and checking to see if they've ported all the healpy functions that we need to handle margin caching.

hpgeom

cdshealpix

astropy-healpix

was really rooting for cdshealpix but it unfortunately doesn't have all the necessary functions ported over. if they do end up implementing healpy.boundaries we should consider using it as it's fast and could even up allowing us to delete some code.

maxwest-uw commented 11 months ago

small update on this, since the margin cache code has been significantly changed: we're still reliant on the healpy.boundaries function, probably more so now in fact 😭

eteq commented 4 months ago

Coming back here after some out-of-band discussion at a hipscat WG meeting: am I reading it right that astropy-healpix is compatible with everything, but the main restriction is performance? (basing that on https://github.com/astronomy-commons/hipscat/issues/53#issuecomment-1472677860) And if so, any insight on where the performance hotspot is @delucchi-cmu ?

eteq commented 4 months ago

(to be clear, I also am interested in cdshealpix given the rust under-layer, since it seems to support a lot of the same user-level constructs that I like about astropy-healpix, but if I am reading the above it seems like it's out of the running without boundaries?)

hombit commented 4 months ago

@eteq @delucchi-cmu I slightly updated the original benchmarking comment https://github.com/astronomy-commons/hipscat/issues/53#issuecomment-1472677860, basically fixing typos in the code. The performance of the packages hasn't change since I tested it last time.

DaftPict commented 1 month ago

I just happened to need cdshealpix for another project and in reading the docs noticed that another package it can use (mocpy) had a get_boundaries function. As a old-school astronomer rather than a modern developer I just wondered if this would solve the missing boundaries issue mentioned earlier and allow a switch to cdshealpix? I'm stuck on windows and would love a version I can use locally.

fxpineau commented 2 weeks ago

Hi, I am the developer of cds-healpix-rust, @tboch talked to me about this issue. I think that the equivalent of boundaries in cdshealxpix is path_along_cell_edge. If needed, we can add it in the cdshealpix python wrapper (cc @bmatthieu3, @ManonMarchand).

In such cases (feature requests), do not hesitate to open issues.

ManonMarchand commented 2 weeks ago

Yes, don't hesitate to interact either on mocpy or cdshealpix repos. You're more than welcome to ask for features, point places in the docs where we're not clear, point to bugs or slow methods... or even contribute :slightly_smiling_face: If github issues are not your preferred communication canal, our emails are also in the documentation

fxpineau commented 2 weeks ago

I just checked the code. Using vertices_skycoord with step>1 corresponds to the boundaries method: when step is >1 the Rust method path_along_cell_edge is used internally. See this line of code.