astrolabsoftware / spark3D

Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
https://astrolabsoftware.github.io/spark3D/
Apache License 2.0

Large API refactoring: introducing spark3D 0.3 #108

Closed JulienPeloton closed 5 years ago

JulienPeloton commented 5 years ago

This PR introduces the new spark3D API.

Previously on spark3D

The previous versions (0.1, 0.2) had several limitations:

And now...

spark3D should be viewed as an extension of the Apache Spark framework, and more specifically the Spark SQL module, focusing on the manipulation of three-dimensional data sets:

[Figure: spark3d_newapi]

The focus is now on repartitioning. The main feature so far is exact DataFrame repartitioning via df.repartitionByCol.
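To make the idea concrete, here is a minimal pure-Python sketch of the difference between hash-style placement and exact placement by column value. It does not use Spark or spark3D, and it is not the implementation behind df.repartitionByCol. It assumes that "exact" means each distinct value of the chosen column ends up in its own partition. The helper names (hash_partition, exact_partition) and the sample rows are hypothetical, and Python's built-in hash stands in for whatever hash function a real engine uses.

```python
from collections import defaultdict


def hash_partition(rows, key, num_partitions):
    """Hash-style placement: partition index = hash(value) mod num_partitions.

    Distinct values can collide in one partition, leaving others empty.
    """
    parts = defaultdict(list)
    for row in rows:
        parts[hash(row[key]) % num_partitions].append(row)
    return dict(parts)


def exact_partition(rows, key):
    """Exact placement (assumed meaning): one partition per distinct value."""
    distinct_values = sorted({row[key] for row in rows})
    index_of = {value: i for i, value in enumerate(distinct_values)}
    parts = defaultdict(list)
    for row in rows:
        parts[index_of[row[key]]].append(row)
    return dict(parts)


# Hypothetical rows: a "label" column holding a precomputed partition label.
rows = [{"id": i, "label": label} for i, label in enumerate([0, 4, 8, 4, 0, 8])]

hashed = hash_partition(rows, "label", num_partitions=4)
exact = exact_partition(rows, "label")

print("hash-style partitions used:", sorted(hashed))
print("exact partitions used:", sorted(exact))
```

With these sample labels, all rows collide in a single hash-style partition, while the exact placement yields one partition per label. Integer labels are used so that the collision is reproducible from run to run.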

More to come with this PR.

codecov-io commented 5 years ago

Codecov Report

Merging #108 into master will decrease coverage by 9.52%. The diff coverage is 84.34%.

@@            Coverage Diff             @@
##           master     #108      +/-   ##
==========================================
- Coverage   96.37%   86.84%   -9.53%     
==========================================
  Files          32       29       -3     
  Lines        1240     1140     -100     
  Branches      218      201      -17     
==========================================
- Hits         1195      990     -205     
- Misses         45      150     +105

Flag     Coverage Δ
#python  93.63% <ø> (-0.68%) ↓
#scala   84.07% <84.34%> (-13.18%) ↓

Impacted Files                                         Coverage Δ
...ark3d/spatialPartitioning/SpatialPartitioner.scala  0% <0%> (-20%) ↓
...m/spark3d/spatialPartitioning/KeyPartitioner.scala  100% <100%> (ø)
...park3d/spatialPartitioning/OctreePartitioner.scala  43.75% <100%> (-56.25%) ↓
src/main/scala/com/spark3d/utils/GridType.scala        100% <100%> (ø) ↑
...main/scala/com/spark3d/python/PythonClassTag.scala  100% <100%> (ø) ↑
src/main/scala/com/spark3d/package.scala               100% <100%> (ø)
src/main/scala/com/spark3d/Checkers.scala              100% <100%> (ø)
src/main/scala/com/spark3d/Partitioners.scala          87.03% <73.91%> (ø)
src/main/scala/com/spark3d/Repartitioning.scala        80.43% <80.43%> (ø)
...spark3d/spatialPartitioning/OnionPartitioner.scala  48.33% <92.85%> (-42.98%) ↓
... and 16 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update 2d22e66...c047d36.
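As a sanity check on the report, the coverage percentages can be re-derived from the hit and line counts it lists. The short sketch below assumes coverage is simply hits divided by lines, which matches the figures here since hits plus misses equals lines in both columns. The variable names are illustrative only.

```python
# Counts copied from the coverage diff in the report.
hits_master, lines_master = 1195, 1240
hits_pr, lines_pr = 990, 1140

# Assumption: coverage = hits / lines, expressed as a percentage.
cov_master = 100 * hits_master / lines_master
cov_pr = 100 * hits_pr / lines_pr
delta = cov_pr - cov_master

print(f"master: {cov_master:.2f}%")  # master: 96.37%
print(f"#108:   {cov_pr:.2f}%")      # #108:   86.84%
print(f"delta:  {delta:+.2f}%")      # delta:  -9.53%
```

The unrounded difference is about 9.529 points, so the -9.53% in the diff is the rounded value. The 9.52% in the summary sentence would be consistent with truncation instead of rounding, though that explanation is a guess.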

JulienPeloton commented 5 years ago

Merging this first stable interface. Version number has not been bumped yet.