Table query fails with "TypeError: Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead." #1777

Closed knoopum closed 1 day ago

knoopum commented 6 months ago

Describe the bug

Querying a table unexpectedly throws the following error when the query returns no results and editor tracking is enabled for the table:

TypeError: Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.

To Reproduce

  1. Create an empty table in a hosted feature layer (e.g., from the New Item in ArcGIS Online Content)
  2. Add a field to hold the values you will query on.
  3. Add some data to the table.
  4. Enable editor tracking on the hosted feature layer.
  5. Use query from arcgis.features.Table to query for a value you did not enter, and the following error message will be returned:
TypeError                                 Traceback (most recent call last)
<timed exec> in <module>

/opt/conda/lib/python3.9/site-packages/arcgis/features/ in query(self, where, out_fields, time_filter, return_count_only, return_ids_only, return_distinct_values, group_by_fields_for_statistics, statistic_filter, result_offset, result_record_count, object_ids, gdb_version, order_by_fields, out_statistics, return_all_records, historic_moment, sql_format, return_exceeded_limit_features, as_df, having, **kwargs)
   4241             ):
   4242                 columns["SHAPE"] = object
-> 4243             df = pd.DataFrame([], columns=columns.keys()).astype(columns, True)
   4244             if "SHAPE" in df.columns:
   4245                 df["SHAPE"] = GeoArray([])

/opt/conda/lib/python3.9/site-packages/pandas/core/ in astype(self, dtype, copy, errors)
   6303                 else:
   6304                     try:
-> 6305                         res_col = col.astype(dtype=cdt, copy=copy, errors=errors)
   6306                     except ValueError as ex:
   6307                         ex.args = (

/opt/conda/lib/python3.9/site-packages/pandas/core/ in astype(self, dtype, copy, errors)
   6322         else:
   6323             # else, only a single dtype is given
-> 6324             new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
   6325             return self._constructor(new_data).__finalize__(self, method="astype")

/opt/conda/lib/python3.9/site-packages/pandas/core/internals/ in astype(self, dtype, copy, errors)
    449             copy = False
--> 451         return self.apply(
    452             "astype",
    453             dtype=dtype,

/opt/conda/lib/python3.9/site-packages/pandas/core/internals/ in apply(self, f, align_keys, **kwargs)
    350                 applied = b.apply(f, **kwargs)
    351             else:
--> 352                 applied = getattr(b, f)(**kwargs)
    353             result_blocks = extend_blocks(applied, result_blocks)

/opt/conda/lib/python3.9/site-packages/pandas/core/internals/ in astype(self, dtype, copy, errors, using_cow)
    509         values = self.values
--> 511         new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
    513         new_values = maybe_coerce_values(new_values)

/opt/conda/lib/python3.9/site-packages/pandas/core/dtypes/ in astype_array_safe(values, dtype, copy, errors)
    241     try:
--> 242         new_values = astype_array(values, dtype, copy=copy)
    243     except (ValueError, TypeError):
    244         # e.g. _astype_nansafe can fail on object-dtype of strings

/opt/conda/lib/python3.9/site-packages/pandas/core/dtypes/ in astype_array(values, dtype, copy)
    186     else:
--> 187         values = _astype_nansafe(values, dtype, copy=copy)
    189     # in pandas we don't store numpy str dtypes, so convert to object

/opt/conda/lib/python3.9/site-packages/pandas/core/dtypes/ in _astype_nansafe(arr, dtype, copy, skipna)
    114             dti = to_datetime(arr.ravel())
    115             dta = dti._data.reshape(arr.shape)
--> 116             return dta.astype(dtype, copy=False)._ndarray
    118         elif is_timedelta64_dtype(dtype):

/opt/conda/lib/python3.9/site-packages/pandas/core/arrays/ in astype(self, dtype, copy)
    692             and is_unitless(dtype)
    693         ):
--> 694             raise TypeError(
    695                 "Casting to unit-less dtype 'datetime64' is not supported. "
    696                 "Pass e.g. 'datetime64[ns]' instead."

TypeError: Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.
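
The failing call at the top of the traceback, `pd.DataFrame([], columns=columns.keys()).astype(columns, True)`, can probably be reproduced without ArcGIS. The following is a minimal sketch, assuming pandas 2.x; the column names and the helper `try_cast_empty` are hypothetical, and the real dtype mapping is built inside the arcgis `query` method.

```python
import pandas as pd

# Hypothetical column names; the real mapping is built inside arcgis' query().
COLUMNS = ["OBJECTID", "EditDate"]


def try_cast_empty(date_dtype):
    """Cast the EditDate column of an empty frame.

    Returns (frame, None) on success or (None, error message) on failure.
    """
    empty = pd.DataFrame([], columns=COLUMNS)
    try:
        return empty.astype({"EditDate": date_dtype}), None
    except (TypeError, ValueError) as exc:
        # pandas 2.x rejects the unit-less spelling, as in the traceback above.
        return None, str(exc)


unitless_df, unitless_err = try_cast_empty("datetime64")
explicit_df, explicit_err = try_cast_empty("datetime64[ns]")

print("unit-less cast error:", unitless_err)
print("explicit cast dtype:", explicit_df["EditDate"].dtype)
```

Older pandas releases accepted the unit-less spelling, so `unitless_err` may be `None` there; the explicit `datetime64[ns]` cast should work either way.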

Expected behavior

An empty result should be returned when there is no match.

Platform (please complete the following information):

ArcGIS Online Notebooks (ArcGIS Notebook Python 3 Standard - 9.0)

nanaeaubry commented 6 months ago

@knoopum What version of the Python API are you using?

knoopum commented 6 months ago

@nanaeaubry whichever version Esri has packaged in ArcGIS Online Notebooks (ArcGIS Notebook Python 3 Standard - 9.0)

tyhranac commented 5 months ago

@knoopum I followed the steps to reproduce using the ArcGIS Notebook Python 3 Standard - 9.0 runtime in ArcGIS Online Notebooks without issue. Can you post a code snippet?

Sidenote: You can check the arcgis version in your ArcGIS Online Notebook by running

import arcgis
arcgis.__version__

or by navigating to Info > Runtime in the Notebook UI (screenshot: notebookinfo). It should be

knoopum commented 5 months ago

@tyhranac Sorry! I left off an important part of the instructions to reproduce. In step 5, when you query the table, set as_df = True.

Querying for all the records works: (screenshot)

Querying for a value that exists in the data also works: (screenshot)

Querying for a value that does not exist in the data generates the error: (screenshot)

VFRoderick commented 4 months ago

I've been running into this issue too. I have a feature layer collection with no data: querying with as_df works against Feature Layer types, but the same query against Tables results in the same error. This is on arcgis.gis version ''.

I think the issue is that the "Fields" array, which details the fields and their data types, is included in the returned featureset for Feature Layer types but not for Tables. So when a table has date fields, they are automatically given a unit-less dtype.

Interestingly, if you query for a featureset and then use the sdf property on it to get a dataframe, it works, but it returns essentially an empty dataframe with no columns or dtypes.
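
The observation above (the `as_df=True` path raises, while querying for a FeatureSet and reading `sdf` succeeds) suggests a simple fallback. This is only a sketch: the helper name `query_as_dataframe` is hypothetical, `table` is assumed to be an `arcgis.features.Table`, and the fallback frame may come back without columns, as noted above.

```python
def query_as_dataframe(table, where="1=1", out_fields="*"):
    """Return a DataFrame for the query, falling back to FeatureSet.sdf on TypeError."""
    try:
        return table.query(where=where, out_fields=out_fields, as_df=True)
    except TypeError:
        # Reported in this thread: empty Table results raise on the as_df path,
        # while the FeatureSet path succeeds (possibly with no columns).
        return table.query(where=where, out_fields=out_fields).sdf
```

Catching only `TypeError` keeps unrelated failures (bad where clause, connection errors) visible instead of silently swallowing them.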

My workaround is basically to capture the "Fields" array from the table's properties, build a new dictionary that mimics the table featureset structure, and inject the Fields array into it. When I then create a DataFrame from this updated featureset, it understands that the datetime fields should be 'datetime64[ns]'.

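from arcgis.features import FeatureSet

# WF_Data_All_Item is an existing Item for the feature layer collection.
flc_master = WF_Data_All_Item.layers[0].container
wf_data = {}
for lyr in flc_master.layers + flc_master.tables:
    # The right-hand sides of _id, _name, _type and fields are missing from
    # the original post; "..." marks each gap.
    _obj = {}
    _id = ...
    _name = ...
    _recCount = lyr.query(where='1=1', out_fields='*', return_count_only=True)
    _type = ...  # compared below against 'Feature Layer' and 'Table'
    _obj['id'] = _id
    _obj['name'] = _name
    _obj['record count'] = _recCount
    _obj['lyr'] = lyr
    _obj['type'] = _type
    if _recCount == 0:
        if _type == 'Feature Layer':
            df = lyr.query(where='1=1', out_fields='*', as_df=True)
            _obj['df'] = df
        if _type == 'Table':
            # First attempt: query straight to a DataFrame.
            try:
                df = lyr.query(where='1=1', out_fields='*', as_df=True)
                print("Query as_df worked")
            except Exception as e:
                print(e)
                # Second attempt: query a FeatureSet and convert it.
                try:
                    df = lyr.query(where='1=1', out_fields='*').sdf
                    print("FeatureSet as sdf worked")
                except Exception as e:
                    print(e)
            # Per the description above: the "Fields" array from the table's properties.
            fields = ...
            fs = lyr.query(where='1=1', out_fields='*').to_dict()
            # Copy the featureset dict, swapping in the table's own field definitions.
            new_dict = {}
            for key, value in fs.items():
                if key == 'fields':
                    new_dict[key] = fields
                else:
                    new_dict[key] = value
            fs_new = FeatureSet.from_dict(new_dict)
            df = fs_new.sdf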

This outputs:

Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 0 entries
Empty DataFrame
Empty DataFrame
Columns: []
Index: []
FeatureSet as sdf worked
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 0 entries
Data columns (total 12 columns):
 #   Column         Non-Null Count  Dtype         
---  ------         --------------  -----         
 0   name           0 non-null      string        
 1   contactnumber  0 non-null      string        
 2   userid         0 non-null      string        
 3   GlobalID       0 non-null      string        
 4   wfprivileges   0 non-null      string        
 5   CreationDate   0 non-null      datetime64[us]
 6   Creator        0 non-null      string        
 7   EditDate       0 non-null      datetime64[us]
 8   Editor         0 non-null      string        
 9   Project        0 non-null      string        
 10  OG             0 non-null      string        
 11  OBJECTID       0 non-null      Int64         
dtypes: Int64(1), datetime64[us](2), string(9)
memory usage: 124.0 bytes
Empty DataFrame
Columns: [name, contactnumber, userid, GlobalID, wfprivileges, CreationDate, Creator, EditDate, Editor, Project, OG, OBJECTID]
Index: []
Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 0 entries
Empty DataFrame
Empty DataFrame
Columns: []
Index: []
FeatureSet as sdf worked
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 0 entries
Data columns (total 10 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   description   0 non-null      string        
 1   GlobalID      0 non-null      string        
 2   CreationDate  0 non-null      datetime64[us]
 3   Creator       0 non-null      string        
 4   EditDate      0 non-null      datetime64[us]
 5   Editor        0 non-null      string        
 6   Sequence      0 non-null      Int32         
 7   Project       0 non-null      string        
 8   OG            0 non-null      string        
 9   OBJECTID      0 non-null      Int64         
dtypes: Int32(1), Int64(1), datetime64[us](2), string(6)
memory usage: 124.0 bytes
Empty DataFrame
Columns: [description, GlobalID, CreationDate, Creator, EditDate, Editor, Sequence, Project, OG, OBJECTID]
Index: []
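
The idea behind the workaround above (give every column an explicit dtype before building the empty frame) can be sketched in plain pandas. Everything here is illustrative: the mapping `ESRI_TO_PANDAS` and the helper `empty_frame_from_fields` are assumptions, not the mapping the arcgis package actually uses, and the sample field definitions are made up to resemble a service's "fields" array.

```python
import pandas as pd

# Assumed mapping from Esri field types to pandas dtypes; illustrative only.
ESRI_TO_PANDAS = {
    "esriFieldTypeOID": "Int64",
    "esriFieldTypeInteger": "Int32",
    "esriFieldTypeString": "string",
    "esriFieldTypeGlobalID": "string",
    "esriFieldTypeDate": "datetime64[ns]",  # explicit unit, never bare 'datetime64'
}


def empty_frame_from_fields(fields):
    """Build a zero-row DataFrame whose columns carry explicit dtypes."""
    return pd.DataFrame(
        {
            f["name"]: pd.Series(dtype=ESRI_TO_PANDAS.get(f["type"], "object"))
            for f in fields
        }
    )


# Hypothetical field definitions shaped like a service's "fields" array.
fields = [
    {"name": "OBJECTID", "type": "esriFieldTypeOID"},
    {"name": "name", "type": "esriFieldTypeString"},
    {"name": "EditDate", "type": "esriFieldTypeDate"},
]
df = empty_frame_from_fields(fields)
print(df.dtypes)
```

Because each column starts as an empty `Series` with its final dtype, no later `astype` call is needed, which sidesteps the unit-less cast entirely.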

achapkowski commented 4 months ago

@knoopum Can you please export your table to a FGDB and share it, or share the URL?

nanaeaubry commented 1 day ago

We put in a fix that has been released. Closing since this has been addressed. Feel free to comment or reopen if needed.