UDST / pandana

Pandas Network Analysis by UrbanSim: fast accessibility metrics and shortest paths, using contraction hierarchies :world_map:
http://udst.github.io/pandana
GNU Affero General Public License v3.0
385 stars 84 forks source link

Resolve Windows dtype error #154

Closed smmaurer closed 3 years ago

smmaurer commented 3 years ago

This PR documents and partially resolves a "buffer dtype mismatch" error that can show up in Windows.

Issue

When I set up the Pandana unit tests to run in a windows-latest GitHub Actions environment, I got the following error:

pandana\tests\test_cyaccess.py:35: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>   def __cinit__(
E   ValueError: Buffer dtype mismatch, expected 'long' but got 'long long'

Here's the full log: https://github.com/UDST/pandana/runs/1465674701

Diagnosis

The cyaccess() constructor raising the error takes NumPy data and passes it to C++ functions. A NumPy int is normally passed as a C++ long, but there seem to be inconsistencies in Windows and in this case it's being passed as a long long instead.

(My understanding is that the inconsistency is on the C++ side -- a NumPy int and a C++ long are usually both 64 bits, but in an environment where a long is only 32 bits, the int is treated as a long long.)

Solution

This PR fixes the tests without changing anything in Pandana. In the problematic unit test, we now cast the data to np.int_, which is a primitive dtype that's treated as a long on all platforms. (Read more here.) This approach should work in other downstream code as well.

It looks like a more thorough solution would be to change the C++ dtypes in Pandana from long to int64_t, which will always align with a 64-bit NumPy int. (See prior link and examples here and here). But that might require changes in a lot of places, so I'd rather not do it unless there's evidence that this is a widespread issue.