sgoodm / python-distance-rasters

Generate distance raster given set of geometry features
BSD 3-Clause "New" or "Revised" License
18 stars 4 forks source link

Distance to value in raster instead of to shapefile #17

Closed ViriatoII closed 3 years ago

ViriatoII commented 3 years ago

Hey,

This package looks awesome! Is it possible to calculate distance to a pixel value in other raster?

For example, I have this binary raster of values 0/255. I'd like to calculate distance to 255 values:

image

It looks like it should be easy, since you transform any shapefile into a binary raster already, I just don't know what "rv_array and affine" should be then.

Cheers, Ricardo

sgoodm commented 3 years ago

Hey @ViriatoII,

Yes, you absolutely can use an existing raster. The rv_array and affine would be defined along the lines of:

import rasterio
with rasterio.open("your_raster.tif") as src:
    affine = src.transform()
    rv_array = src.read(1)

I'll add an example to the repo based on this approach for other folks to reference in the future. Thanks!

ViriatoII commented 3 years ago

That's great, thank you! 2 things:

~/.local/lib/python3.8/site-packages/distancerasters/utils.py in calc_haversine_distance(p1, p2) 243 * math.sin(delta_lon / 2) * 2 244 ) --> 245 c = 2 math.atan2(math.sqrt(a), math.sqrt(1 - a)) 246 # km 247 d = radius * c

ValueError: math domain error


Just to make sure, does this affine and rv_array look good?

Affine: | 1.00, 0.00, 0.00| | 0.00, 1.00, 0.00| | 0.00, 0.00, 1.00| rv_array: [[0 0 0 ... 0 0 0] [0 0 0 ... 0 0 0] [0 0 0 ... 0 0 0] # There are values of 255 here but they are a minority ... [0 0 0 ... 0 0 0] [0 0 0 ... 0 0 0] [0 0 0 ... 0 0 0]]



The problem must stem from my data. It's not geographical but cellular data, the coordinates vary between 0 and 15,000. I guess I should replace **calc_haversine_distance** with my own function.

Cheers.
Ricardo
sgoodm commented 3 years ago

Ah, yes the current implementation will always apply haversine distance calculations if an affine is provided and assumes it is just geographic, unprojected data. I would definitely like to improve this approach to work with other types of data and be a bit more intuitive.

A quick fix that may work for your case is to not provide an affine or output_path (e.g., just dr.DistanceRaster(rv_array)) in order to avoid the block of code using the haversine calc. You would just need to save the resulting distance array yourself using the export_raster function since the DistanceRaster class requires an affine in order to export as a raster.

ViriatoII commented 3 years ago

Thank you for the help. The function runs now but unfortunately after 2 hours it still hasn't finished. Perhaps my raster is too big (195 MB)? Using a similar function in imageJ (based in Java, I think) takes less than a minute to run.

sgoodm commented 3 years ago

What are your raster dimensions? Finer-resolution data over large areas are likely to take some time, and unfortunately will definitely be slower based on Python than Java.

I would like to produce some detailed performance/timing metrics in the future, but for now the best I can say is that I have produced a global ~1km resolution (36000x18000) distance raster which also took around a few hours.

ViriatoII commented 3 years ago

Hey @sgoodm

I was trying with a file that had 13080x15696 pixels (my analysis is at pixel resolution). Now I'm using the function on subsets of the data and it works beautifully.

I work in spatial transcriptomics in intra- and inter-cellular space. So here you have RNA and their distance to the nucleus of the cell.

image

For anyone searching for a simple solution like me without Geographical projections, I found 3 other functions that work with rasters as numpy arrays, explained in detail here:

With that I close the issue, thank you!

Ricardo