Esri / arcgis-python-api

Documentation and samples for ArcGIS API for Python
https://developers.arcgis.com/python/
Apache License 2.0

GeoAccessor.from_table Returns TypeError on SHORT Field with NULL and skip_nulls=False #829

Closed bixb0012 closed 2 years ago

bixb0012 commented 4 years ago
>>> import arcpy
>>> import arcgis
>>> import pandas as pd
>>> from arcgis.features import GeoAccessor, GeoSeriesAccessor
>>> 
>>> arcgis.__version__
'1.8.3'
>>> 
>>> tbl =  # path to FGDB table
>>> arcpy.management.GetCount(tbl)
<Result '4'>
>>> 
>>> # call from_table using implied all-fields wildcard
>>> pd.DataFrame.spatial.from_table(tbl)
Empty DataFrame
Columns: [OBJECTID, Name1, Code, float_field, EventDate, Name2, Int_Field, Combo_text, Name, dbl_field, API_Text, space_field]
Index: []
>>> 
>>> # call from_table using explicit all-fields wildcard
>>> pd.DataFrame.spatial.from_table(tbl, fields="*")
Empty DataFrame
Columns: [OBJECTID, Name1, Code, float_field, EventDate, Name2, Int_Field, Combo_text, Name, dbl_field, API_Text, space_field]
Index: []
>>> 
>>> # call from_table using specific fields in list
>>> pd.DataFrame.spatial.from_table(tbl, fields=['OBJECTID', 'Name1', 'Code', 'float_field'])
   OBJECTID               Name1  Code  float_field
0         1      Delaware\r\nme     1  2016.000000
1         2          Delaware's     2     7.000000
2         3  Delaware\nsay what     0    16.799999
3         4            Del'ware     0     0.000000
>>> 

UPDATE: The root cause appears to be a field of data type SHORT in the table. If I remove every field of that data type, everything works as expected. If I add a SHORT field to the table, from_table returns an empty data frame.

achapkowski commented 3 years ago

@bixb0012 Thanks for reporting this, I'll take a look.

achapkowski commented 3 years ago

@bixb0012 What version of ArcGIS Pro are you using?

bixb0012 commented 3 years ago

I am using Pro 2.7 beta 1, which ships with ArcGIS API for Python 1.8.3.

It has to do with NULLs in SHORT fields. If I isolate a SHORT field and leave skip_nulls=True, I get the expected results. If I isolate a SHORT field and set skip_nulls=False, I get the following:

>>> pd.DataFrame.spatial.from_table(tbl, fields="shosrt_field", skip_nulls=False)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py", line 2422, in from_table
    return from_table(filename, **kwargs)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_io\fileops.py", line 257, in from_table
    null_value=null_value))
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

achapkowski commented 3 years ago

OK, thanks, this is good information. The root cause is actually arcpy.da.TableToNumPyArray; I'll see what I can do.
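
For readers without arcpy at hand, here is a minimal sketch of the likely mechanism, using plain NumPy only. It assumes that arcpy.da.TableToNumPyArray fills a structured array whose SHORT column is a 16-bit integer; the helper `build_array` and the sentinel `-9999` are hypothetical names and values chosen for illustration, not part of arcpy or the ArcGIS API for Python.

```python
import numpy as np

# Hypothetical stand-in for the array a SHORT field would be loaded into.
# Assumption: the SHORT column maps to a 16-bit integer ("i2").
DTYPE = [("TXT_FLD", "U10"), ("SHORT_FLD", "i2")]


def build_array(rows, null_value=None):
    """Build a structured array, optionally swapping None in the integer
    column for a sentinel before NumPy sees it."""
    cleaned = [
        (txt, null_value if short is None else short)
        for txt, short in rows
    ]
    return np.array(cleaned, dtype=DTYPE)


rows = [("Line 1", 1), ("Line 5", None)]

# A fixed-width integer column has no representation for None,
# so the conversion fails with a TypeError.
try:
    build_array(rows)
except TypeError as exc:
    print("conversion failed:", exc)

# With a sentinel substituted for None, the same rows load cleanly.
arr = build_array(rows, null_value=-9999)
print(arr["SHORT_FLD"])
```

The exact wording of the error message depends on the Python version. The traceback in this thread shows from_table forwarding a null_value keyword, so passing a sentinel there may avoid the error, though that is not confirmed anywhere in this thread.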

achapkowski commented 3 years ago

Any chance you can share your table?

bixb0012 commented 3 years ago

Regarding the original issue, the empty data frame (I have since updated the title to reflect the actual, reproducible issue), I can't reproduce it today starting from scratch. There must have been something I missed that was causing the empty data frame when using "all fields." Honestly, it may have been a matrix of NULLs that simply resulted in no records being returned when skip_nulls=True.

Setting the original issue aside, I can easily reproduce the TypeError that results from a SHORT field containing a NULL. Here is some basic code to reproduce the issue:

>>> import os
>>> import pandas as pd
>>> import arcpy
>>> from arcgis.features import GeoAccessor, GeoSeriesAccessor
>>>
>>> recs = iter([
...     ["Line 1", 1],
...     ["Line 2", 2],
...     ["Line 3", 3],
...     [None, 4],
...     ["Line 5", None]
... ])
>>>
>>> # create table
>>> sgdb = arcpy.env.scratchGDB
>>> tbl = arcpy.CreateTable_management(
...     *os.path.split(arcpy.CreateScratchName(workspace=sgdb))
... )
>>> res = arcpy.AddField_management(tbl, "TXT_FLD", "TEXT")
>>> res = arcpy.AddField_management(tbl, "SHORT_FLD", "SHORT")
>>>
>>> # populate with 3 non-null records, check GeoAccessor.from_table results
>>> with arcpy.da.InsertCursor(tbl, ["TXT_FLD", "SHORT_FLD"]) as cur:
...     for _ in range(3):
...         res = cur.insertRow(next(recs))
...
>>> pd.DataFrame.spatial.from_table(tbl, skip_nulls=False)
   OBJECTID TXT_FLD  SHORT_FLD
0         1  Line 1          1
1         2  Line 2          2
2         3  Line 3          3
>>>
>>> # populate with 1 record with NULL text field, check GeoAccessor.from_table results
>>> with arcpy.da.InsertCursor(tbl, ["TXT_FLD", "SHORT_FLD"]) as cur:
...     res = cur.insertRow(next(recs))
...
>>> pd.DataFrame.spatial.from_table(tbl, skip_nulls=False)
   OBJECTID TXT_FLD  SHORT_FLD
0         1  Line 1          1
1         2  Line 2          2
2         3  Line 3          3
3         4    None          4
>>>
>>> # populate with 1 record with NULL short field, check GeoAccessor.from_table results
>>> with arcpy.da.InsertCursor(tbl, ["TXT_FLD", "SHORT_FLD"]) as cur:
...     res = cur.insertRow(next(recs))
...
>>> pd.DataFrame.spatial.from_table(tbl, skip_nulls=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py", line 2422, in from_table
    return from_table(filename, **kwargs)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_io\fileops.py", line 257, in from_table
    null_value=null_value))
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
>>>
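
Until the conversion handles NULLs, one possible workaround is to skip the NumPy step and build the data frame from plain Python rows, casting integer columns to a pandas nullable integer dtype. The following is only a sketch under that assumption, not the library's fix. `rows_to_frame` is a hypothetical helper, and the rows are typed in by hand here; with arcpy available they could come from an arcpy.da.SearchCursor instead.

```python
import pandas as pd


def rows_to_frame(rows, columns, int_columns):
    """Build a DataFrame from row sequences and store the named integer
    columns as nullable Int16, so a missing value becomes <NA>."""
    df = pd.DataFrame(list(rows), columns=columns)
    for name in int_columns:
        # "Int16" (capital I) is the pandas nullable integer dtype.
        df[name] = df[name].astype("Int16")
    return df


# Same shape as the reproduction table: one text and one SHORT column.
rows = [
    ["Line 1", 1],
    ["Line 2", 2],
    ["Line 3", 3],
    [None, 4],
    ["Line 5", None],
]

df = rows_to_frame(rows, ["TXT_FLD", "SHORT_FLD"], ["SHORT_FLD"])
print(df)
print(df.dtypes)
```

The nullable dtype keeps the column as integers rather than forcing a sentinel value or a cast to float, at the cost of bypassing from_table and anything else it does.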