geopandas / pyogrio

Vectorized vector I/O using OGR
https://pyogrio.readthedocs.io
MIT License

Encoding issues when installed from conda-forge #11

Open brendan-ward opened 3 years ago

brendan-ward commented 3 years ago
>>> df = read_dataframe('/vsizip//.../pyogrio/pyogrio/tests/fixtures/test_fgdb.gdb.zip', layer='test_areas')
Warning 1: Recode from CP437 to UTF-8 failed with the error: "Invalid argument".
(repeated many times)

This happens in a clean conda environment on macOS using pyogrio 0.2.

The same call does not produce these warnings when pyogrio is built from source on macOS.
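One possible source of the CP437 recode, not confirmed in this thread, is the handling of ZIP entry names: entries that do not set the UTF-8 flag (general-purpose bit 11) are conventionally treated as CP437 and need converting to UTF-8. The sketch below uses only the Python standard library to show which entries of an archive fall into that category. The helper names `zip_name_encodings` and `demo_archive` are hypothetical, the demo archive is built in memory rather than taken from the test fixture, and none of this is pyogrio or GDAL code.

```python
import io
import zipfile

# General-purpose bit 11 of a ZIP entry: when set, the entry name is UTF-8.
UTF8_FLAG = 0x800


def zip_name_encodings(archive):
    """Return (entry name, assumed name encoding) for each entry in a ZIP.

    `archive` is a filesystem path or a binary file-like object.
    Entries without the UTF-8 flag are reported as "cp437", the
    conventional fallback for ZIP entry names.
    """
    with zipfile.ZipFile(archive) as zf:
        return [
            (info.filename, "utf-8" if info.flag_bits & UTF8_FLAG else "cp437")
            for info in zf.infolist()
        ]


def demo_archive():
    """Build a small in-memory ZIP with one ASCII and one non-ASCII name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # ASCII-only name: Python's zipfile leaves the UTF-8 flag unset.
        zf.writestr("test.gdb/a00000001.gdbtable", b"")
        # Non-ASCII name: Python's zipfile stores it as UTF-8 and sets the flag.
        zf.writestr("donn\u00e9es.txt", b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name, encoding in zip_name_encodings(demo_archive()):
        print(f"{encoding:6} {name}")
```

Running `zip_name_encodings` on the path to `test_fgdb.gdb.zip` in both the conda-forge and the source-built environment would show whether the archive contains entries without the UTF-8 flag. If it does, the difference between the two environments is more likely in the character-conversion support available to each build than in the file itself.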

jorisvandenbossche commented 3 years ago

I just created a clean env on Linux (Ubuntu 20.04), and there I don't get any such warnings or errors:

In [7]: pyogrio.read_info('/vsizip///home/joris/scipy/repos/pyogrio/pyogrio/tests/fixtures/test_fgdb.gdb.zip')
Out[7]: 
{'crs': None,
 'encoding': 'UTF-8',
 'fields': array(['OBJECTID_1', 'FC_SEGMENT_ID', 'FC_SYSTEM_ID', 'ORG_ID',
        'PAL_STATUS_ID', 'SEGMENT_NAME', 'CONST_DATE_START',
        'CONST_DATE_END', 'NON_FED_IEI_DATE', 'DESIGN_FLOW',
        'DESIGN_FREQUENCY', 'FREEBOARD', 'FEMA_ACCREDITATION_DATE',
        'ENG_CERTIFICATION_DATE', 'COMMENTS', 'SUBMISSION_ID',
        'DISTRICT_ID', 'FIRM_PROTECTION_PROVIDED_IND', 'CERTIFICATION_ID',
        'RIPSTAT_ID', 'POTHAZARD_ID', 'Potential_Haz_Class_d',
        'Segment_Certification_d', 'FIRM_Protection_Provided_d'],
       dtype=object),
 'geometry_type': None,
 'features': 3}

In [9]: pyogrio.read_dataframe('/vsizip///home/joris/scipy/repos/pyogrio/pyogrio/tests/fixtures/test_fgdb.gdb.zip')
Out[9]: 
   OBJECTID_1  FC_SEGMENT_ID  FC_SYSTEM_ID  ORG_ID PAL_STATUS_ID  ... RIPSTAT_ID POTHAZARD_ID Potential_Haz_Class_d Segment_Certification_d FIRM_Protection_Provided_d
0         6.0   3.704000e+09           NaN     NaN          None  ...        1.0            1                   1.0                     2.0                        1.0
1         4.0   5.604100e+09           NaN     NaN           333  ...        1.0         None                   NaN                     2.0                        2.0
2         NaN            NaN           NaN     NaN          None  ...        NaN         None                   NaN                     NaN                        NaN

[3 rows x 24 columns]