Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
Refactor geometry functions to improve performance:
Remove the geos dependency and use GeomProcessor to read the SRID, avoiding repeated deserialization.
Remove the parse_to_subtype function to avoid redundant data copying when parsing geometry data.
Optimize the code structure of the geometry functions.
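To illustrate why reading the SRID does not require deserializing the whole geometry, here is a minimal sketch. It is not the code in this PR (which goes through GeomProcessor); it assumes the value is EWKB-encoded, where the SRID sits in the header right after the type word. The function name `read_ewkb_srid` and the constant name are hypothetical.

```rust
/// EWKB flag bit in the geometry type word: set when an SRID follows.
const EWKB_SRID_FLAG: u32 = 0x2000_0000;

/// Hypothetical helper (not Databend's API): reads the SRID from an EWKB
/// header without decoding any coordinates. Returns None if the buffer is
/// too short, the byte-order marker is invalid, or no SRID is present.
fn read_ewkb_srid(buf: &[u8]) -> Option<i32> {
    // Byte 0 is the byte-order marker: 0 = big-endian, 1 = little-endian.
    let little_endian = match *buf.first()? {
        0 => false,
        1 => true,
        _ => return None,
    };
    // Read a u32 at `offset`, honoring the byte order of the buffer.
    let read_u32 = |offset: usize| -> Option<u32> {
        let bytes: [u8; 4] = buf.get(offset..offset + 4)?.try_into().ok()?;
        Some(if little_endian {
            u32::from_le_bytes(bytes)
        } else {
            u32::from_be_bytes(bytes)
        })
    };
    // Bytes 1..5 hold the geometry type plus flag bits.
    let type_word = read_u32(1)?;
    if type_word & EWKB_SRID_FLAG == 0 {
        return None;
    }
    // Bytes 5..9 hold the SRID when the flag is set.
    read_u32(5).map(|srid| srid as i32)
}

fn main() {
    // Hand-built little-endian EWKB for SRID=4326;POINT(100 200).
    let mut ewkb = vec![0x01];
    ewkb.extend_from_slice(&(1u32 | EWKB_SRID_FLAG).to_le_bytes());
    ewkb.extend_from_slice(&4326u32.to_le_bytes());
    ewkb.extend_from_slice(&100f64.to_le_bytes());
    ewkb.extend_from_slice(&200f64.to_le_bytes());
    println!("{:?}", read_ewkb_srid(&ewkb)); // prints Some(4326)
}
```

Only the first nine bytes are inspected, so the cost is independent of how many coordinates the geometry holds.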
For example, the execution time of st_distance over 500,000 geometry rows dropped from 9.097 s to 5.951 s:
CREATE OR REPLACE TABLE test AS
SELECT
    number AS id,
    'SRID=4326;POINT(100 200)' AS geom,
    'SRID=4326;POINT(300 200)' AS geom2
FROM numbers(500000);
Before:
select st_distance(to_geometry(geom), to_geometry(geom2)) from test;
500000 rows read in 9.097 sec. Processed 500 thousand rows, 38.15 MiB (54.96 thousand rows/s, 4.19 MiB/s)
After:
select st_distance(to_geometry(geom), to_geometry(geom2)) from test;
500000 rows read in 5.951 sec. Processed 500 thousand rows, 38.15 MiB (84.01 thousand rows/s, 6.41 MiB/s)
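As a quick sanity check on the two timings above, the following sketch derives the speedup, the relative time reduction, and the throughput from the reported row count and wall-clock times. The helper names are made up for illustration; the inputs are the figures quoted in this description. It works out to roughly a 1.5x speedup.

```rust
/// Ratio of old to new wall-clock time (values above 1.0 mean faster).
fn speedup(old_secs: f64, new_secs: f64) -> f64 {
    old_secs / new_secs
}

/// Percentage of wall-clock time saved relative to the old run.
fn reduction_pct(old_secs: f64, new_secs: f64) -> f64 {
    (old_secs - new_secs) / old_secs * 100.0
}

/// Rows processed per second.
fn throughput(rows: u64, secs: f64) -> f64 {
    rows as f64 / secs
}

fn main() {
    // Figures taken from the query output in this description.
    let rows = 500_000u64;
    let (old_secs, new_secs) = (9.097_f64, 5.951_f64);

    println!("speedup:   {:.2}x", speedup(old_secs, new_secs));
    println!("reduction: {:.1}%", reduction_pct(old_secs, new_secs));
    println!("old rows/s: {:.0}", throughput(rows, old_secs));
    println!("new rows/s: {:.0}", throughput(rows, new_secs));
}
```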
fixes: #[Link the issue here]
Tests
[x] Unit Test
[x] Logic Test
[ ] Benchmark Test
[ ] No Test - Explain why
Type of change
[ ] Bug Fix (non-breaking change which fixes an issue)
[ ] New Feature (non-breaking change which adds functionality)
[ ] Breaking Change (fix or feature that could cause existing functionality not to work as expected)
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/