sunpy / ndcube

A base package for multi-dimensional contiguous and non-contiguous coordinate-aware arrays.
http://docs.sunpy.org/projects/ndcube/
BSD 2-Clause "Simplified" License

Only compute needed dims in _generate_world_coords #767

Closed svank closed 1 month ago

svank commented 1 month ago

@Cadair pointed me to axis_world_coords for verifying the order of the polarization axis in a DKIST/ViSP data set (which is an NDCube). It works, but for a dataset of shape (4, 2, 1100, 915, 2555) (polarization, repetition, scan step, wavelength, spatial), it spends about 30 s computing world coordinates for every axis (slow because repetition, scan step and spatial are correlated), just to return the four Stokes coordinates. I dug in and took a shot at implementing the TODO: having the underlying _generate_world_coords run pixel_to_world only for the axes that are being requested, which offers a huge speedup in cases like this.

I rearranged axis_world_coords and axis_world_coords_values to first determine which world coordinate axes will be needed, and then pass those indices into _generate_world_coords, which skips processing any pixel dimension that doesn't contribute to those world dimensions.
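
As a rough illustration of that selection step (a toy sketch, not the code in this PR): given a boolean correlation matrix laid out like an APE 14 `axis_correlation_matrix`, with rows as world axes and columns as pixel axes, the pixel dimensions worth processing are the ones that touch any requested world axis. The helper name `pixel_axes_needed` and the matrix values below are hypothetical, invented only to mimic a cube with a few correlated axes.

```python
def pixel_axes_needed(correlation, world_indices):
    """Return the sorted pixel-axis indices that affect any requested world axis.

    correlation[w][p] is True when world axis w depends on pixel axis p
    (rows are world axes, columns are pixel axes).
    """
    needed = set()
    for w in world_indices:
        for p, depends in enumerate(correlation[w]):
            if depends:
                needed.add(p)
    return sorted(needed)


# Hypothetical layout, for illustration only.
# Pixel axes (columns): 0 spatial, 1 wavelength, 2 scan step, 3 repetition, 4 polarization
# World axes (rows):    0 lat, 1 lon, 2 wavelength, 3 time, 4 stokes
T, F = True, False
toy_correlation = [
    [T, F, T, T, F],  # lat
    [T, F, T, T, F],  # lon
    [F, T, F, F, F],  # wavelength
    [F, F, T, T, F],  # time
    [F, F, F, F, T],  # stokes
]

stokes_only = pixel_axes_needed(toy_correlation, [4])
time_only = pixel_axes_needed(toy_correlation, [3])
lat_and_wave = pixel_axes_needed(toy_correlation, [0, 2])
```

In this toy layout, asking only for the Stokes axis selects a single pixel axis, so the large correlated spatial, scan step and repetition grid never needs to be built.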

Does this warrant a new test? I think that would look like verifying that _generate_world_coords returns just the throwaway value for unneeded axes.

My use case is

>>> import dkist
>>> ds = dkist.load_dataset("/home/svankooten/globus/pid_2_86/ALMMM/")
>>> ds.axis_world_coords("polarization.stokes")
(StokesCoord(['I', 'Q', 'U', 'V']),)

The last line takes 3.23 ms with this PR, and 34.8 s without.

Fixes #554

Cadair commented 1 month ago

Thank you so much for just jumping in and fixing this, it's been on my list for ages.

Will review later today.

Cadair commented 1 month ago

pre-commit.ci autofix

nabobalis commented 1 month ago

SUMMON BACKPORT BOT

nabobalis commented 1 month ago

@MeeseeksDev backport to 2.2

lumberbot-app[bot] commented 1 month ago

Owee, I'm MrMeeseeks, Look at me.

There seems to be a conflict, please backport manually. Here are approximate instructions:

  1. Checkout backport branch and update it.

    git checkout 2.2
    git pull
  2. Cherry pick the first parent branch of this PR on top of the older branch:

    git cherry-pick -x -m1 84f6f624855d68c634e14565807040f2f8cb100c
  3. You will likely have some merge/cherry-pick conflicts here; fix them and commit:

    git commit -am 'Backport PR #767: Only compute needed dims in _generate_world_coords'
  4. Push to a named branch:

    git push YOURFORK 2.2:auto-backport-of-pr-767-on-2.2
  5. Create a PR against branch 2.2, I would have named this PR:

"Backport PR #767 on branch 2.2 (Only compute needed dims in _generate_world_coords)"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.