harsha2010 / magellan

Geo Spatial Data Analytics on Spark
Apache License 2.0

Obtaining a list of geohashes representing neighborhood zones #234

Open IsamAljawarneh opened 5 years ago

IsamAljawarneh commented 5 years ago

Hi Ram, very interesting project. I have a DataFrame of neighborhoods whose index column holds their Z-order curves:

var index1 = neighborhoods.withColumn("index", $"polygon" index 30)

How can I obtain the list of geohashes that represent each zone instead? This is the same issue as https://github.com/harsha2010/magellan/issues/193, where the author said he solved it using the toBase32 function but did not explain how.

Can you tell me how to use this function, for example with my index1 DataFrame above?

Thanks a lot, and all the best for the project.

harsha2010 commented 5 years ago

Hey! A ZOrderCurve has a method called toBase32() that returns the geohash string (which is nothing but a base-32 encoded string). So all you'd need to do is write a UDF, something like:

val geohashUDF = udf { (curves: Seq[ZOrderCurve]) => curves.map(_.toBase32()) }

and use it as:

df.withColumn("geohashes", geohashUDF($"index.curve"))

I haven't tested the above syntax but morally it should do the right thing...
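
To illustrate what "base-32 encoded string" means here, below is a minimal sketch in plain Scala of the standard geohash encoding, where every 5 bits of the curve map to one character. It is not Magellan's own code: the object name GeohashSketch and the bit-string input are assumptions made for illustration, and it assumes the bit length is a multiple of 5.

```scala
object GeohashSketch {
  // Standard geohash alphabet: digits plus lowercase letters, omitting a, i, l, o.
  private val Alphabet = "0123456789bcdefghjkmnpqrstuvwxyz"

  /** Encodes a string of '0'/'1' characters as base 32, five bits per character. */
  def toBase32(bits: String): String = {
    require(bits.forall(c => c == '0' || c == '1'), "bits must contain only 0 and 1")
    require(bits.length % 5 == 0, "bit length must be a multiple of 5")
    // Each 5-bit group is a number in 0..31 that selects one alphabet character.
    bits.grouped(5).map(chunk => Alphabet.charAt(Integer.parseInt(chunk, 2))).mkString
  }
}
```

For example, GeohashSketch.toBase32("0110111111110000010000010") yields "ezs42". Under this encoding a 30-bit curve, such as one built with index 30, would correspond to a 6-character geohash.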

IsamAljawarneh commented 5 years ago

Thanks, Ram. I will test it as soon as possible and let you know.