Closed thuydotm closed 1 year ago
Base: 79.90% // Head: 79.90% // No change to project coverage :thumbsup:
Coverage data is based on head (4f0e1d8) compared to base (e47c278). Patch coverage: 0.00% of modified lines in pull request are covered.
Benchmark comparison with a naive binary search applied in gpu_bin(). Performance does not seem to improve: every measured ratio falls between 0.91 and 1.10. We should investigate more to see whether we can better implement binary search for GPU, or just not use it.
$ asv compare b6c683b e47c278
All benchmarks:
ratio
[b6c683b5] [e47c2784]
<classify_binary_search_gpu> <classify_binary_search_gpu~3>
2.14±0.08ms 2.10±0.06ms 0.98 classify.EqualInterval.time_equal_interval(100, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
2.07±0.1ms 2.13±0.09ms 1.02 classify.EqualInterval.time_equal_interval(100, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
2.63±0.09ms 2.71±0.08ms 1.03 classify.EqualInterval.time_equal_interval(1000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
2.54±0.2ms 2.75±0.2ms 1.08 classify.EqualInterval.time_equal_interval(1000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
82.4±0.03ms 82.4±0.09ms 1.00 classify.EqualInterval.time_equal_interval(10000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
82.3±0.03ms 82.4±0.04ms 1.00 classify.EqualInterval.time_equal_interval(10000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
2.13±0.06ms 2.06±0.07ms 0.97 classify.EqualInterval.time_equal_interval(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
2.10±0.1ms 2.20±0.1ms 1.05 classify.EqualInterval.time_equal_interval(300, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
9.09±0.1ms 9.84±0.08ms 1.08 classify.EqualInterval.time_equal_interval(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
9.11±0.1ms 9.87±0.08ms 1.08 classify.EqualInterval.time_equal_interval(3000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed failed n/a classify.NaturalBreaks.time_natural_breaks(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed failed n/a classify.NaturalBreaks.time_natural_breaks(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
2.86±0.09ms 2.88±0.1ms 1.01 classify.Quantile.time_quantile(100, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
2.92±0.1ms 2.99±0.1ms 1.03 classify.Quantile.time_quantile(100, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
3.21±0.06ms 3.19±0.1ms 1.00 classify.Quantile.time_quantile(1000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
3.11±0.06ms 3.20±0.06ms 1.03 classify.Quantile.time_quantile(1000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
30.1±0.04ms 30.1±0.09ms 1.00 classify.Quantile.time_quantile(10000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
30.1±0.02ms 30.1±0.07ms 1.00 classify.Quantile.time_quantile(10000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
2.97±0.1ms 2.95±0.06ms 0.99 classify.Quantile.time_quantile(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
2.97±0.05ms 2.93±0.1ms 0.99 classify.Quantile.time_quantile(300, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
5.08±0.06ms 5.23±0.1ms 1.03 classify.Quantile.time_quantile(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
5.06±0.1ms 5.14±0.06ms 1.02 classify.Quantile.time_quantile(3000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(100, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
1.62±0.09ms 1.71±0.08ms 1.05 classify.Reclassify.time_reclassify(100, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(100, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
1.55±0.09ms 1.71±0.1ms 1.10 classify.Reclassify.time_reclassify(100, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(1000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
1.65±0.09ms 1.82±0.07ms ~1.10 classify.Reclassify.time_reclassify(1000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(1000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
1.71±0.08ms 1.71±0.1ms 1.00 classify.Reclassify.time_reclassify(1000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(10000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
5.75±0.02ms 5.79±0.05ms 1.01 classify.Reclassify.time_reclassify(10000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(10000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
5.76±0.03ms 5.77±0.06ms 1.00 classify.Reclassify.time_reclassify(10000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
1.60±0.05ms 1.73±0.08ms 1.09 classify.Reclassify.time_reclassify(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(300, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
1.59±0.1ms 1.70±0.07ms 1.07 classify.Reclassify.time_reclassify(300, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
2.08±0.07ms 1.90±0.04ms 0.91 classify.Reclassify.time_reclassify(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
failed n/a n/a classify.Reclassify.time_reclassify(3000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
2.00±0.06ms 1.93±0.1ms 0.97 classify.Reclassify.time_reclassify(3000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
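For context on what is being compared, here is a minimal CPU-side sketch of the two lookup strategies: a linear scan over the bin edges and a naive binary search over the same edges. It is not the code from this PR. The function names, the `nodata` parameter, and the "first bin whose upper edge is >= value" rule are assumptions made for illustration, and the real gpu_bin() runs as a GPU kernel rather than plain Python.

```python
import numpy as np


def bin_linear(value, bins, new_values, nodata=np.nan):
    """Linear scan: label of the first bin whose upper edge is >= value."""
    if np.isnan(value):
        return nodata
    for edge, label in zip(bins, new_values):
        if value <= edge:
            return label
    return nodata  # value is above the last edge


def bin_binary_search(value, bins, new_values, nodata=np.nan):
    """Same result via binary search; assumes `bins` is sorted ascending."""
    if np.isnan(value):
        return nodata
    lo, hi = 0, len(bins)
    # Invariant: every edge left of `lo` is < value,
    # every edge at or right of `hi` is >= value.
    while lo < hi:
        mid = (lo + hi) // 2
        if bins[mid] < value:
            lo = mid + 1
        else:
            hi = mid
    return new_values[lo] if lo < len(bins) else nodata


# Hypothetical bin edges and labels, chosen only for the example.
bins = np.array([1.0, 3.0, 5.0])
new_values = np.array([10.0, 20.0, 30.0])
for v in (0.5, 1.0, 2.0, 5.0, 6.0):
    print(v, bin_linear(v, bins, new_values), bin_binary_search(v, bins, new_values))
```

The binary search cuts the per-value work from O(k) to O(log k) comparisons for k bins. That saving only shows up in wall-clock time if the bin lookup dominates the run time, which the benchmark table above does not suggest for these inputs.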
@thuydotm is there any reason not to mark this as ready for review? Or, since there isn't a big performance gain, should we close this PR? Happy to take your lead on this.
> investigate more to see whether we can better implement binary search for GPU

Added an issue for this here: https://github.com/makepath/xarray-spatial/issues/767
Closing this PR as the proposed implementation does not improve performance.
Fixes #761