urschrei / convertbng

Fast, accurate WGS84 ⬅️➡️ OSGB36 (OSTN15) conversion, using Python and Rust
https://pypi.python.org/pypi/convertbng

Ubuntu 16.04 64bit: trap invalid opcode ... in liblonlat_bng-783571af.so #2

Open chongyangshi opened 7 years ago

chongyangshi commented 7 years ago

Hi there,

I have a Flask application that uses convertbng to convert OSGB36 eastings and northings returned from a SQLite query into WGS84 coordinates. As an experiment, I have just redeployed the production code, which runs fine on Ubuntu 14.04 64-bit (E3-1225 v2), onto an Ubuntu 16.04 64-bit environment (i5-3570S).

The version of Python is 2.7.12, the version of convertbng is 0.5.5. The kernel is Linux 4.4.0-64-generic #85-Ubuntu.

Everything works fine, except this conversion:

    Feb 26 22:28:28 (hostname) kernel: [180221.207469] traps: uwsgi[22644] trap invalid opcode ip:7f32a27968f2 sp:7fff2f6a6bc8 error:0 in liblonlat_bng-783571af.so[7f32a272d000+1d2a000]

This makes Python die with Illegal instruction (core dumped), which in turn makes NGINX, running as a reverse proxy in front of the Flask application, report that the remote host closed the connection prematurely.

While I don't speak Rust, it appears that this is something to do with the Rust binary called by the library?
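
The kernel line above carries enough information to locate the faulting instruction inside the shared library: subtracting the mapping base (the first number in the square brackets) from the instruction pointer gives an offset into liblonlat_bng-783571af.so. A minimal sketch in Python 3, assuming the log format shown in this report; the `fault_offset` helper and its regular expression are illustrative and not part of convertbng:

```python
import re

# Pattern for the kernel "traps:" line as it appears in this report.
# The field layout is an assumption based on that single line.
TRAP_RE = re.compile(
    r"trap (?P<reason>.+?) ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) "
    r"error:(?P<error>[0-9a-f]+) in "
    r"(?P<module>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(log_line):
    """Return (module, offset) for the faulting instruction, or None if no match."""
    match = TRAP_RE.search(log_line)
    if match is None:
        return None
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    # Offset of the faulting instruction relative to the start of the mapping.
    return match.group("module"), ip - base


line = ("traps: uwsgi[22644] trap invalid opcode ip:7f32a27968f2 "
        "sp:7fff2f6a6bc8 error:0 in "
        "liblonlat_bng-783571af.so[7f32a272d000+1d2a000]")
module, offset = fault_offset(line)
print(module, hex(offset))
```

For the line in this report the offset is 0x698f2. Assuming the mapping starts at the beginning of the file, that offset could be looked up in a disassembly of the library (for example with objdump) to see which instruction the CPU rejected.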

For reference, the Python code that can reproduce this error is:

from convertbng.util import convert_lonlat
from numpy import isnan

def bng_to_longlat(bng):
    """ Given a pair of BNG coordinates return its long and lat coordinates. """

    try:
        easting = [int(bng[0])]
        northing = [int(bng[1])]
        coordinates = convert_lonlat(easting, northing)
        coordinate_long = coordinates[0][0]
        coordinate_lat = coordinates[1][0]

        if isnan(coordinate_long) or isnan(coordinate_lat):
            return False

        return (coordinate_long, coordinate_lat)

    except ValueError:
        return False

bng_to_longlat((20000,20000))

Thanks.
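
Because an illegal instruction kills the whole interpreter (and here the uwsgi worker), it can help to run the suspect call in a child process and inspect how it ended. The sketch below assumes Python 3 on a POSIX system; `run_isolated` is a hypothetical helper, not part of convertbng, and the demonstration uses a stand-in snippet that raises SIGILL against itself so it runs without the library installed:

```python
import signal
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a child interpreter and report how it ended."""
    proc = subprocess.run([sys.executable, "-c", snippet])
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exited with code %d" % proc.returncode


# Hypothetical usage with the call from the report above:
#   run_isolated("from convertbng.util import convert_lonlat; "
#                "print(convert_lonlat([20000], [20000]))")

# Stand-in snippet: the child sends SIGILL to itself to mimic the crash.
print(run_isolated("import os, signal; os.kill(os.getpid(), signal.SIGILL)"))
```

On the affected machine, a result of "killed by SIGILL" for the real call would confirm that the crash comes from the conversion itself rather than from uwsgi or Flask.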

urschrei commented 7 years ago

Just to confirm:

chongyangshi commented 7 years ago

Just checked:

With this:

from convertbng.util import convert_lonlat, convert_bng
from numpy import isnan

def bng_to_longlat(bng):
    """ Given a pair of BNG coordinates return its long and lat coordinates. """

    try:
        easting = [int(bng[0])]
        northing = [int(bng[1])]
        coordinates = convert_lonlat(easting, northing)
        coordinate_long = coordinates[0][0]
        coordinate_lat = coordinates[1][0]

        if isnan(coordinate_long) or isnan(coordinate_lat):
            return False

        return (coordinate_long, coordinate_lat)

    except ValueError:
        return False

#print bng_to_longlat((20000,20000))
#print bng_to_longlat((315877,781709))
print convert_bng([-7.30034511], [49.95888852])

urschrei commented 7 years ago

Hmmm that's extremely confusing. If you change your import to
from convertbng.cutil import convert_lonlat
is the error triggered?

chongyangshi commented 7 years ago

With convertbng.cutil, convert_lonlat still causes Illegal Instruction while convert_bng still seems to work.

As the library was simply installed with pip install convertbng inside a virtualenv created from the distribution's preinstalled Python 2.7, could this be an issue with rustc and this particular CPU?

I think I'll try deploying it in a 16.04 VM at some time, just in case it's the CPU.

urschrei commented 7 years ago

If it's an x86 CPU, there's no reason it shouldn't work, although a dump of the system info might help. I haven't had a chance to try to reproduce this using 16.04 yet.
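
One way to produce such a dump on Linux is to collect the platform details and the CPU feature flags from /proc/cpuinfo. This is only a sketch in Python 3: the `cpu_flags` and `system_report` helpers are hypothetical names, and whether the flags have anything to do with this crash is an open question:

```python
import platform


def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def system_report(cpuinfo_path="/proc/cpuinfo"):
    """Collect basic platform details plus CPU flags (the flags are Linux-only)."""
    try:
        with open(cpuinfo_path) as handle:
            flags = cpu_flags(handle.read())
    except OSError:
        # No /proc/cpuinfo (e.g. not Linux): report an empty flag list.
        flags = set()
    return {
        "machine": platform.machine(),
        "kernel": platform.release(),
        "python": platform.python_version(),
        "cpu_flags": sorted(flags),
    }


if __name__ == "__main__":
    for key, value in system_report().items():
        print(key, value)
```

Running this on both a working and a failing machine and comparing the two `cpu_flags` lists would show which instruction set extensions differ between them.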

chongyangshi commented 7 years ago

Just tested the same code in a KVM VPS running 16.04 64-bit, and it worked without issue. So it is highly likely that something does not sit well between the i5-3570S and the Rust binary.

I'll find some time to look into this weird thing, thanks for your time.

chongyangshi commented 7 years ago

Okay, thanks for patching up the other repo. I built a new liblonlat_bng.so from lonlat_bng, replaced the Rust binary that pip installed ({virtualenv environment}/lib/python2.7/site-packages/convertbng/.libs/liblonlat_bng-783571af.so) with the one generated at lonlat_bng/target/release/liblonlat_bng.so, and ran the Python script again. It worked.
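
For anyone repeating this workaround, the swap can be scripted so that the original binary from the wheel is kept as a backup. A minimal sketch in Python 3; `replace_library` is a hypothetical helper, and the real paths are the ones quoted above, with the virtualenv location filled in for the machine at hand:

```python
import os
import shutil


def replace_library(installed_path, rebuilt_path):
    """Back up the installed shared library, then copy the rebuilt one over it."""
    backup_path = installed_path + ".orig"
    if not os.path.exists(backup_path):
        # Keep only the first backup, so reruns never overwrite the original binary.
        shutil.copy2(installed_path, backup_path)
    shutil.copy2(rebuilt_path, installed_path)
    return backup_path
```

Restoring the PyPI binary is then a matter of copying the `.orig` file back over the installed path.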

As the liblonlat_bng-783571af.so binary from PyPI version 0.5.5 is still broken for this processor, would it be suitable to rebuild the binary and push it to PyPI?

Thanks a lot for the assistance.

urschrei commented 7 years ago

The problem is that the x86_64 binary is built in a manylinux1 VM on Travis in order to be widely compatible, so I can't easily build one that's going to work on this specific processor. I'm still curious as to what's going wrong, but I'm unsure what my next port of call is. Could you put the working .so somewhere I can grab it?

-- steph


chongyangshi commented 7 years ago

I totally understand. For something as strange as this, a build tool is unlikely to guarantee a binary that works on this processor. I am, however, happy to test any binary coming out of manylinux.

The working binary can be found at https://mega.nz/#!0U9kQAab!2QYUjK6yRyZ837EMzD_N4xM1s9W4ZrLRSN4hYikrgWw for your inspection.

Cheers.