epifanio opened this issue 1 year ago (status: Open)
Hi, can you tell me what version fails? We don't have a branch called "master". We do have "main". I don't recognize this version number.
'0+untagged.1758.gc856fa0'
Sorry, I mean main. The version that fails is built from the cuspatial main branch.
I built it using Docker:
ARG MOTHER=epinux/jammy_mamba_gpu
FROM $MOTHER
# mother machine derived from nvidia/cuda:11.8.0-devel-ubuntu22.04 with mamba pre-installed
LABEL maintainer="massimods@met.no"
ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1
RUN mamba install --yes git
COPY build_cuspatial.sh /home/jovyan/build_cuspatial.sh
RUN sh /home/jovyan/build_cuspatial.sh
where build_cuspatial.sh is:
#!/bin/bash
. /opt/conda/etc/profile.d/conda.sh
conda activate base
git clone https://github.com/rapidsai/cuspatial /home/jovyan/cuspatial && cd /home/jovyan/cuspatial
export CUSPATIAL_HOME=/home/jovyan/cuspatial
conda env create --name all_cuda-118_arch-x86_64 --file conda/environments/all_cuda-118_arch-x86_64.yaml
The version '0+untagged.1758.gc856fa0' is the one I tested that does not have the aforementioned issue; it is built from a cuspatial fork where I was using a different version of GDAL.
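For reference, that string is presumably what an untagged source build reports through the package version attribute; a minimal sketch of checking it in a given environment (assuming cuspatial exposes __version__, as RAPIDS packages generally do):
# Minimal check of which cuspatial build is importable in the active
# environment. Assumes the package exposes __version__ (an assumption,
# though RAPIDS packages generally do); untagged source builds report
# strings like '0+untagged.1758.gc856fa0'.
import cuspatial

print(cuspatial.__version__)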
I am seeing a couple of different errors with point_in_polygon.
1.
import shapely
import cuspatial
polygon = shapely.Polygon(
    [
        (0, 0), (0, 1), (1, 1), (2, 0), (0, 0)
    ]
)
point = shapely.Point(0.5, 0.5)
print(cuspatial.point_in_polygon(cuspatial.GeoSeries([point]), cuspatial.GeoSeries([polygon]))[0][0])
which gives
tabulate: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
at line 84 in join.py, the call to cpp_point_in_polygon()
2.
import cuspatial
import numpy as np  # needed for the arrays below
polygon2coords = np.array([0,0,0,1,1,1,2,0,0,0]).astype('float')
polygon2 = cuspatial.GeoSeries.from_points_xy(polygon2coords)
point2coords = np.array([0.5,0.5]).astype('float')
point2 = cuspatial.GeoSeries.from_points_xy(point2coords)
result = cuspatial.point_in_polygon(point2, polygon2)
which gives
Length mismatch: expected 0 elements, got 5 elements
from line 91 in join.py. In other words, this attempt returned from cpp_point_in_polygon() without error, but with a null result. I also tried the above example with many thousands of data points, but got the same result.
Version: cuspatial 23.08, CUDA 11.8, Ubuntu 23.04, Python 3.10
Installation: conda create -n test_cuda_env -c rapidsai -c conda-forge -c nvidia cudf=23.08 cuspatial=23.08 python=3.10 cuda-version=11.8 --no-channel-priority
CUDA Information:
CUDA Device Initialized : True
CUDA Driver Version : 12.2
CUDA Runtime Version : 11.8
CUDA NVIDIA Bindings Available : True
CUDA NVIDIA Bindings In Use : False
CUDA Minor Version Compatibility Available : True
CUDA Minor Version Compatibility Needed : False
CUDA Minor Version Compatibility In Use : True
CUDA Detect Output:
Found 1 CUDA devices
id 0  b'NVIDIA GeForce GTX 1650'  [SUPPORTED]
  Compute Capability: 7.5
  PCI Device ID: 0
  PCI Bus ID: 1
  UUID: GPU-5974e587-c1f4-49b3-a957-c5edb37aebcf
  Watchdog: Enabled
  FP32/FP64 Performance Ratio: 32
Summary: 1/1 devices are supported
Note also that I had to make a small change in cuspatial.core.spatial.join.py at lines 77-80 to get the above results:
poly_offsets = as_column(polygons.polygons.part_offset)
ring_offsets = as_column(polygons.polygons.ring_offset)
# px = as_column(polygons.polygons.x)
# py = as_column(polygons.polygons.y)
px = as_column(polygons.points.x)
py = as_column(polygons.points.y)
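As an aside, a small illustrative sketch of what the .points and .polygons accessors return for the GeoSeries constructed in example 2 above (assumed behavior, inferred from the quoted join.py code and from the "expected 0 elements" error, not from documentation):
# Illustrative sketch only; the accessor behavior is an assumption based
# on the join.py snippet above. A GeoSeries built with from_points_xy
# carries its coordinates under .points, so .polygons.x is empty for it.
import numpy as np
import cuspatial

gs = cuspatial.GeoSeries.from_points_xy(np.array([0.0, 0.0, 0.0, 1.0]))
print(len(gs.points.x))    # 2 -- x values of the two points
print(len(gs.polygons.x))  # 0 -- this series contains no polygon geometry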
Hi @jeb2112, your first example does not fail for me with cuSpatial 23.10 (upcoming release, but this should be the same as 23.08). It returns True but gives a warning (that you can ignore).
True
/home/coder/.conda/envs/rapids/lib/python3.10/site-packages/numba/cuda/dispatcher.py:538: NumbaPerformanceWarning: Grid size 1 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/coder/.conda/envs/rapids/lib/python3.10/site-packages/numba/cuda/dispatcher.py:538: NumbaPerformanceWarning: Grid size 1 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
I do get the error for your second example, but your example is wrong -- it creates a GeoSeries of points rather than a GeoSeries with a single polygon. Here's a corrected version:
import cuspatial
import numpy as np
polygon2coords = np.array([0,0,0,1,1,1,2,0,0,0]).astype('float')
polygon2 = cuspatial.GeoSeries.from_polygons_xy(polygon2coords, [0, 5], [0, 1], [0, 1])
point2coords = np.array([0.5,0.5]).astype('float')
point2 = cuspatial.GeoSeries.from_points_xy(point2coords)
result = cuspatial.point_in_polygon(point2, polygon2)
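For completeness, a sketch of inspecting the corrected call's output the same way as in the first example above (assuming it succeeds and returns a boolean DataFrame with one column per input polygon, as that example's indexing suggests):
# Hypothetical follow-up check; mirrors the first example's result[0][0]
# indexing and assumes the same return shape.
print(result[0][0])  # expected: True, since (0.5, 0.5) lies inside the polygon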
OK Mark, thanks for the quick feedback. Based on that info, I concluded my conda env must be broken in some subtle way, so I went back and tried another conda installation with CUDA 11.2, this time for the entire RAPIDS base package... and that appears to have worked.
I went back over what I did and found the problem. I had started with the rapidsai installation matrix to come up with a conda command for cuda=11.8, python=3.10, cudf, cuspatial. This failed with some incompatibility errors. I then added --no-channel-priority, which had been mentioned on Stack Overflow in some different context I think, and that installed the conda env... but then my point-in-polygon call didn't work. I now understand that the --no-channel-priority option somehow overshadowed and/or overrode the conflict messages, thus permitting an incorrect install of what appears to be a broken combination of specific packages from the installation matrix.
@epifanio I can explain the original out of memory (OOM) error. This is a regression from earlier versions (as you pointed out) because we added compatibility with GeoArrow GeoSeries. The problem is that for flat point arrays, we used to be able to just take the X and Y coordinates. But a GeoSeries is a DenseUnion type, which has an array of types (one per row) and an array of offsets. So for 1B points, you have 16GiB for positions, 4GiB for offsets, and 1GiB for types. However, because cuDF does not support Arrow Fixed-size List, we have to use a regular list, which requires an additional indices buffer (identical to the DenseUnion offsets!). This adds a redundant 4GiB. So in all we have 25GiB of storage for 1B points.
On my 32GiB V100 I can create the GeoSeries, but I OOM in the point_in_polygon
call because the internal code still destructures X and Y and so creates an additional copy of the position data, which goes over 32GiB for this example.
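To make that arithmetic concrete, a rough sketch of the buffer sizes described above (the dtype choices of float64 coordinates, int32 offsets/indices, and int8 type ids are inferred from the 16/4/1/4 figures quoted here, not taken from the implementation):
# Back-of-the-envelope buffer sizes for 1 billion points, matching the
# figures quoted above. Dtypes are assumptions inferred from those figures.
n = 1_000_000_000

positions = n * 2 * 8  # x and y as float64            -> 16e9 bytes
offsets   = n * 4      # DenseUnion offsets (int32)    ->  4e9 bytes
type_ids  = n * 1      # DenseUnion type ids (int8)    ->  1e9 bytes
indices   = n * 4      # redundant list indices (int32, same as offsets)

print((positions + offsets + type_ids + indices) / 1e9)  # 25.0 -- the ~25GiB total quoted above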
CC @isVoid @trxcllnt
Version
master
On which installation method(s) does this occur?
Source
Describe the issue
Memory error running
cuspatial.point_in_polygon
- the same code returns no errors when running from cuspatial v23.02
Minimum reproducible example
Returns a memory error.
Equivalent code, except for the new API syntax:
Environment details
Other/Misc.
No response