Turbo87 / utm

Bidirectional UTM-WGS84 converter for python
http://pypi.python.org/pypi/utm
MIT License

Wrong results when using utm.from_latlon(lats, lons) on numpy arrays. #104

Open m4mlo opened 1 year ago

m4mlo commented 1 year ago

Hello all, hi @Turbo87, @bartvanandel, @tibnor,

I have around 20M latitude/longitude pairs to convert to UTM. While the performance of utm.from_latlon() is stunning, I noticed inconsistent results when calling utm.from_latlon(lats, lons) on numpy arrays compared to element-wise conversion. The code below should give the same values for result[0] and test_utmx, but it does not (you can run the code by replacing df['Latitude'] and df['Longitude'] with randomly generated lat/lon values). For 100k lat/lon pairs, around 100 conversions are wrong when using utm.from_latlon(lats, lons) on a numpy array. Versions: numpy 1.21.5, utm 0.7.0.

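import numpy as np
import matplotlib.pyplot as plt
import utm

# df is the DataFrame holding the 'Latitude' and 'Longitude' columns
nx = 100000
lats = np.array(df['Latitude'])
lons = np.array(df['Longitude'])

# Vectorized conversion of the whole array in one call
result = utm.from_latlon(lats, lons)

# Element-wise conversion of the first nx points, for comparison
test_utmx = np.empty(nx)
test_utmy = np.empty(nx)
for ii in range(nx):
    test_utmx[ii], test_utmy[ii] = utm.from_latlon(lats[ii], lons[ii])[:2]

# Normalized eastings: points off the grey diagonal differ between the two methods
plt.plot(np.arange(0, 1.1, 0.1), np.arange(0, 1.1, 0.1), '--', c='grey')
plt.plot(test_utmx / np.max(test_utmx), result[0][:nx] / np.max(test_utmx), 'x')

# Indices where the two eastings differ by more than 1 m
np.where(np.abs(test_utmx - result[0][:nx]) > 1)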

Any feedback on this issue would be appreciated.

bartvanandel commented 1 year ago

Are all your points in the same zone? If not, that may be the reason.

See for example https://github.com/Turbo87/utm/blob/master/utm/conversion.py#L291
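
A quick way to check the "same zone" question is to count how many points fall into each zone. The sketch below is only an approximation: `utm_zone_counts` is a hypothetical helper (not part of utm), and it uses the standard 6-degree zone formula, so the special zones around Norway and Svalbard are ignored.

```python
import numpy as np

def utm_zone_counts(lons):
    """Count points per standard 6-degree UTM zone number.

    Hypothetical helper, not part of the utm package. The Norway and
    Svalbard zone exceptions are deliberately ignored here.
    """
    lons = np.asarray(lons, dtype=float)
    # Standard zone number: zones are 6 degrees wide, zone 1 starts at -180
    zones = np.floor((lons + 180.0) / 6.0).astype(int) % 60 + 1
    numbers, counts = np.unique(zones, return_counts=True)
    return dict(zip(numbers.tolist(), counts.tolist()))

# Made-up sample longitudes: Berlin, Munich, Sydney
print(utm_zone_counts([13.4, 11.6, 151.2]))  # {32: 1, 33: 1, 56: 1}
```

If the returned dict has more than one key, the array spans several zones, and a single array call is unlikely to match the element-wise results.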

m4mlo commented 1 year ago

Hello @bartvanandel,

You are right, different zones might be the reason why the results differ between the array operation and the element-wise calculation. Some of my locations are in Australia, some are in Europe. How can I overcome the limitation of having latitude/longitude pairs from different zones in the same array? Is it necessary to first split the locations by zone? Thanks for your help on this.

bartvanandel commented 1 year ago

At this point, you'll have to split to get accurate results, I'm afraid. Or loop manually.

Note that there are valid reasons for the existing approach; performance is likely one of them. Also, I've only contributed marginally to this project and didn't personally write the code I referred to.

If you think there's a better way that should be accommodated by this package, you can always create a pull request with your proposed changes or additions.
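
For anyone landing here, the "split by zone" suggestion could look roughly like the sketch below. It is not part of the package: `from_latlon_by_zone` and `zone_numbers` are hypothetical names, and the zone number uses the standard 6-degree formula (no Norway/Svalbard exceptions), so points in those special zones may still differ from the element-wise results. It also assumes that utm.from_latlon accepts array input together with force_zone_number. Points are grouped by zone number and hemisphere, and each group is converted with one array call.

```python
import numpy as np

def zone_numbers(lons):
    """Standard 6-degree UTM zone number (Norway/Svalbard exceptions ignored)."""
    lons = np.asarray(lons, dtype=float)
    return np.floor((lons + 180.0) / 6.0).astype(int) % 60 + 1

def from_latlon_by_zone(lats, lons, convert=None):
    """Convert mixed-zone points with one array call per (zone, hemisphere) group.

    Hypothetical helper. `convert` defaults to utm.from_latlon and is assumed
    to accept arrays plus a `force_zone_number` keyword.
    Returns (easting, northing, zone_number, is_south) as arrays.
    """
    if convert is None:
        import utm  # imported lazily so the grouping logic can be tested alone
        convert = utm.from_latlon

    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    zones = zone_numbers(lons)
    south = lats < 0

    # One signed key per group: negative means southern hemisphere
    keys = np.where(south, -zones, zones)

    easting = np.empty(lats.shape)
    northing = np.empty(lats.shape)
    for key in np.unique(keys):
        mask = keys == key
        e, n, _, _ = convert(lats[mask], lons[mask],
                             force_zone_number=abs(int(key)))
        easting[mask] = e
        northing[mask] = n
    return easting, northing, zones, south
```

Usage would be something like `easting, northing, zone, south = from_latlon_by_zone(lats, lons)`. The zone and hemisphere arrays are returned on purpose: an easting/northing pair is only meaningful together with its zone. The Python loop runs once per group rather than once per point, so it should stay reasonably fast for millions of points.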