Esri / arcgis-python-api

Documentation and samples for ArcGIS API for Python
https://developers.arcgis.com/python/
Apache License 2.0
1.89k stars 1.1k forks

TypeError: float() argument must be a string or a number, not 'dict' when calling sdf.spatial.to_featurelayer() #1672

Closed opike closed 1 year ago

opike commented 1 year ago

I have a python script that runs fine as a toolbox script in Desktop Pro, however when I try to run the same script in ArcGIS Online as a notebook, I get the following error:

TypeError: float() argument must be a string or a number, not 'dict'

for this line of code:

new_feature_layer = sdf.spatial.to_featurelayer(
    title=f"Polys Centroids {tmp_uuid}",
    gis=gis,
    tags=["centroids", "outputs"],
    folder=TMP_FOLDER,
    sanitize_columns=False,
    service_name=f"polys_centroids_{tmp_uuid}",
)

Here's the full stack trace for the error:

Stack trace

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_59/1392681058.py in ()
   2422 param7
   2423 )
-> 2424 main(_args)

/tmp/ipykernel_59/1392681058.py in main(_args)
     64 print("PROCESSING FOR POLYS")
     65 _args.set_is_polys(True)
---> 66 run_steps(_args)
     67
     68

/tmp/ipykernel_59/1392681058.py in run_steps(_args)
     70 if not _args.is_test_mode:
     71 print("Starting download step")
---> 72 download(_args)
     73 print("Completed download step")
     74 print("Starting join step...")

/tmp/ipykernel_59/1392681058.py in download(_args)
    133 layer_item = gis.content.get(layer_id)
    134 if _args.is_polys:
--> 135 polys_feature_layer = generate_points(_args)
    136 output_file_centroids = polys_feature_layer.export(
    137 title="Polys_with_centroids",

/tmp/ipykernel_59/1392681058.py in generate_points(_args)
    113 )
    114 print(tmp_uuid)
--> 115 new_feature_layer = sdf.spatial.to_featurelayer(
    116 title=f"Polys Centroids {tmp_uuid}",
    117 gis=gis,

/opt/conda/lib/python3.9/site-packages/arcgis/features/geo/_accessor.py in to_featurelayer(self, title, gis, tags, folder, sanitize_columns, service_name, **kwargs)
   2844 )
   2845
-> 2846 result = content.import_data(
   2847 self._data,
   2848 folder=folder,

/opt/conda/lib/python3.9/site-packages/arcgis/gis/__init__.py in import_data(self, df, address_fields, folder, item_id, **kwargs)
   7562 )
   7563
-> 7564 ds = df.spatial.to_featureclass(
   7565 location=os.path.join(temp_dir, name),
   7566 sanitize_columns=sanitize_columns,

/opt/conda/lib/python3.9/site-packages/arcgis/features/geo/_accessor.py in to_featureclass(self, location, overwrite, has_z, has_m, sanitize_columns)
   2635 origin_columns = self._data.columns.tolist()
   2636 origin_index = copy.deepcopy(self._data.index)
-> 2637 result = to_featureclass(
   2638 self,
   2639 location=location,

/opt/conda/lib/python3.9/site-packages/arcgis/features/geo/_io/fileops.py in to_featureclass(geo, location, overwrite, validate, sanitize_columns, has_m, has_z)
   1106 return res
   1107 else:
-> 1108 res = _pyshp2(df=df, out_path=out_location, out_name=fc_name)
   1109 df.set_index(old_idx)
   1110 return res

/opt/conda/lib/python3.9/site-packages/arcgis/features/geo/_io/fileops.py in _pyshp2(df, out_path, out_name)
   1334 if value is np.nan:
   1335 row[idx] = None
-> 1336 shpfile.record(*row)
   1337 del idx
   1338 del row

/opt/conda/lib/python3.9/site-packages/shapefile.py in record(self, *recordList, **recordDict)
   1835 # Blank fields for empty record
   1836 record = ["" for field in self.fields if field[0] != 'DeletionFlag']
-> 1837 self.__dbfRecord(record)
   1838
   1839 def __dbfRecord(self, record):

/opt/conda/lib/python3.9/site-packages/shapefile.py in __dbfRecord(self, record)
   1870 value = format(value, "d")[:size].rjust(size) # caps the size if exceeds the field size
   1871 else:
-> 1872 value = float(value)
   1873 value = format(value, ".%sf"%deci)[:size].rjust(size) # caps the size if exceeds the field size
   1874 elif fieldType == "D":

TypeError: float() argument must be a string or a number, not 'dict'

And this is the surrounding code in the method where the error occurs:

def generate_points(_args):
    gis = _args.gis
    poly_feature_layer = gis.content.get("7da90f1a777c4d6e9d59e3314e70fbe5")
    print(poly_feature_layer["url"])
    poly_feature_layer2 = FeatureLayer(poly_feature_layer["url"] + "/0")
    # EPSG 4326 is a coordinate system for lat/long that makes use of the WGS84
    # spheroid
    sr = 4326
    df = poly_feature_layer2.query(
        as_df=True, return_centroid=True, return_geometry=False, out_sr=sr
    )
    print("Here 61")
    print(df.iloc[0])
    # Expand the centroid dicts into separate x and y columns
    sdf = GeoAccessor.from_xy(
        df.join(pd.DataFrame(df.loc[:, "centroid"].to_dict()).T),
        "x",
        "y",
        sr=sr,
    )
    print("Here 62")
    print(sdf.iloc[0])
    sdf = sdf.drop("Correct_Area_SqM", axis=1)
    print("Here 63")
    print(sdf.iloc[0])
    # Create a shortened uuid to avoid name clashes.
    # We may want to add cleanup for all temporary layers.
    tmp_uuid = "".join(
        random.choice(string.ascii_letters + string.digits) for _ in range(6)
    )
    print(tmp_uuid)
    new_feature_layer = sdf.spatial.to_featurelayer(
        title=f"Polys Centroids {tmp_uuid}",
        gis=gis,
        tags=["centroids", "outputs"],
        folder=TMP_FOLDER,
        sanitize_columns=False,
        service_name=f"polys_centroids_{tmp_uuid}",
    )
    return new_feature_layer


nanaeaubry commented 1 year ago

@opike The Python API version used by Online and by Pro can differ at times.

Try:

import arcgis
arcgis.__version__

to see which version is being used in each environment.
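Because the comparison later in this thread ends up covering more than the `arcgis` version, a slightly fuller check can be run in both environments. This is a minimal sketch using only the standard library. The helper name `environment_report` and the default package list are illustrative assumptions, and a package that is not installed is reported as `None` rather than raising.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("arcgis", "pandas", "pyshp", "shapely")):
    """Collect interpreter, OS, and installed package versions."""
    report = {
        "python": sys.version.split()[0],
        "os": platform.system(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Running the same cell in the Online Notebook and in the Pro Python window gives two reports that can be compared line by line.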

nanaeaubry commented 1 year ago

Just for information: with the latest version 2.2.0 I tested your code on a SeDF with geometry type Polygon and the feature layer was successfully created.

opike commented 1 year ago

Ok, it's odd. I thought that maybe the issue was with version 2.1.0.2 and that it was fixed in a later version. But I just confirmed that both the Desktop Pro instance I'm using (where the code works) and the Online Notebook feature (where it's broken) are using 2.1.0.2.

Are you able to run your test using Online Notebooks?

opike commented 1 year ago

I took a look at the Python version numbers and both are using 3.9.16, although Online Notebooks shows:

3.9.16 (main, Jan 11 2023, 16:05:54)

and the Desktop Pro Toolbox shows:

3.9.16 [MSC v.1931 64 bit (AMD64)]

nanaeaubry commented 1 year ago

This might be an issue with your data. I am able to run the code in Online Notebooks:

image

opike commented 1 year ago

That's what's perplexing, it's the same source data when running on Desktop Pro as in the Notebook.

Something that I hadn't mentioned yet which now I'm starting to think is related is that in Notebook I also get this warning:

/opt/conda/lib/python3.9/site-packages/arcgis/features/geo/_io/fileops.py:1108: UserWarning: Discarding nonzero nanoseconds in conversion.
  res = _pyshp2(df=df, out_path=out_location, out_name=fc_name)

Screenshot on 2023-09-25 at 08-47-42

nanaeaubry commented 1 year ago

Does your Online Notebook environment contain arcpy? You can test this by simply running: import arcpy

Otherwise the best thing in this case is to provide a subset of your data that we can test with because I am still unable to reproduce the error

opike commented 1 year ago

It isn't using arcpy in the Online Notebook, although it is used in the equivalent Toolbox script in Desktop Pro. There are basically 2 differences between the code in the Online Notebook and the Toolbox script:

  1. The Toolbox script uses arcpy.AddMessage() and the Notebook uses print().
  2. The Toolbox script reads in its parameters from Properties->Parameters, whereas the Notebook has the parameters embedded in the script itself.

Other than those two things, the scripts are using the same code and reading in the same source data from the same feature layers. I've checked the versions of arcgis, python and pandas and they are consistent between the 2 environments. However, one difference I've found between the environments is that the OS for Online Notebooks is Linux and for Desktop Pro is Windows.

The field that the error message in Notebook is complaining about with the "must be a string or a number, not dict" error is the centroid field. If I drop that field, the error message goes away - however that's not a permanent solution since I need that field in the output layer.
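The failure can be reproduced without any layer data. The sketch below mimics the numeric-field branch shown at the bottom of the stack trace (`float(value)` followed by fixed-width formatting). `format_numeric_field` is a hypothetical stand-in written for illustration, not pyshp's actual function, and the `size` and `deci` defaults are arbitrary.

```python
def format_numeric_field(value, size=19, deci=11):
    """Format a value the way the traced numeric branch does."""
    value = float(value)  # raises TypeError when value is a dict
    return format(value, ".%sf" % deci)[:size].rjust(size)


# A plain number is formatted into a fixed-width string.
print(repr(format_numeric_field(33.75937225785404)))

# A dict-valued cell, like the centroid column, cannot be converted.
centroid = {"x": -111.87445842127796, "y": 33.75937225785404}
try:
    format_numeric_field(centroid)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

This suggests the error comes from a dict-valued cell reaching a numeric field in the shapefile writer, rather than from the geometry itself.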

The centroid field is indeed a dict. When I add the following print statements:

print(sdf.iloc[0].centroid)
print(type(sdf.iloc[0].centroid))

I get the output:

{'x': -111.87445842127796, 'y': 33.75937225785404}
<class 'dict'>
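One possible workaround, sketched here under assumptions, is to replace each dict-valued field with scalar fields before export, so the values are kept without passing a dict to the writer. The example works on plain row dicts (for instance from `DataFrame.to_dict("records")`). `flatten_dict_fields`, the `OBJECTID` key, and the `centroid_x`-style names are hypothetical, and whether this fits the output layer's schema would need checking.

```python
def flatten_dict_fields(record, sep="_"):
    """Return a copy of record with dict values expanded into scalar fields."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # e.g. centroid={'x': ..., 'y': ...} -> centroid_x, centroid_y
            for sub_key, sub_value in value.items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


# Hypothetical row shaped like the printed centroid value.
row = {
    "OBJECTID": 1,
    "centroid": {"x": -111.87445842127796, "y": 33.75937225785404},
}
print(flatten_dict_fields(row))
```

The names `centroid_x` and `centroid_y` are exactly 10 characters long, which is the maximum field-name length in a shapefile's DBF table.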

So this leads to the question, why does Desktop Pro allow that field to be a dict and Online Notebook doesn't?

I'd be willing to share the actual Notebook with you if that would help. Is there a good way for me to share it without making it fully public?

nanaeaubry commented 1 year ago

@opike

This might be an issue you have to bring to support as a case. I tried replicating your issue in different environments but cannot reproduce what you are getting in any of my cases. As you can see below I am using an ArcGIS Online Notebook and I have a spatially enabled dataframe where I added the centroid column with the same dictionary you sent.

Sorry I cannot help much more.

image

opike commented 1 year ago

ok, I'll go ahead and open a case. BTW, I came across this thread in the community forums where a similar issue was reported with a difference in behavior with a spatial join between a conda (notebook) environment and ArcGIS Pro: https://community.esri.com/t5/arcgis-api-for-python-questions/does-a-geoaccessor-sedf-spatial-join-require-arcpy/td-p/1168447

Unfortunately no solution was ultimately provided.

opike commented 1 year ago

I also came across this other github comment which looks like it might possibly be related to this issue: https://github.com/Esri/arcgis-python-api/issues/1068#issuecomment-911939332

It mentions a difference in geometry engines. Is a different geometry engine being used in Online Notebooks than in Desktop Pro?

achapkowski commented 1 year ago

yes, the geometry engines work like this:

  1. ArcPy - if present, this is used over all other engines. Full geometry engine.
  2. Shapely - this is used when ArcPy isn't present. Shapely supports only a subset of the methods because it doesn't offer everything arcpy can do.
  3. Nothing - you do not have a geometry engine.

opike commented 1 year ago

By using the code provided in this section of the docs: https://developers.arcgis.com/python/guide/part1-introduction-to-sedf/#geometry-engines

I was able to determine that Online Notebooks don't have a geometry engine defined by default. Is there a way that one (either shapely or arcpy, but ideally arcpy) can be defined in the Notebook code?
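Independently of the docs snippet, the precedence described above can be approximated with the standard library alone. This is a rough sketch that only checks whether the `arcpy` and `shapely` modules are importable. `detect_geometry_engine` is a hypothetical helper, and it does not necessarily match how the `arcgis` package decides internally.

```python
from importlib.util import find_spec


def detect_geometry_engine(candidates=("arcpy", "shapely")):
    """Return the first importable engine name, or None if none is found."""
    for name in candidates:
        if find_spec(name) is not None:
            return name
    return None


print(f"Geometry engine: {detect_geometry_engine()}")
```

A result of `None` corresponds to the third case in the list above, where no geometry engine is available.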

achapkowski commented 1 year ago

You need to remap the notebook to an advanced runtime.

opike commented 1 year ago

It took me a little while to find where to change the runtime, but once I did, it fixed this error and some others that I was seeing. Thanks for your help!