PermafrostDiscoveryGateway / viz-staging

PDG Visualization staging pipeline
Apache License 2.0

Update label_duplicates to no longer rely on chained assignment #43

Open westminsterabi opened 3 months ago

westminsterabi commented 3 months ago

When running the workflow on the lake change data, I got the following warning:

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  super().__setitem__(key, value)
  File "/Users/abihunter/projects/viz-info/helpful-code/workflow.py", line 167, in <module>
    stager.stage_all()
  File "/Users/abihunter/projects/viz-staging/pdgstaging/TileStager.py", line 111, in stage_all
    self.stage(path)
  File "/Users/abihunter/projects/viz-staging/pdgstaging/TileStager.py", line 145, in stage
    self.save_tiles(gdf)
  File "/Users/abihunter/projects/viz-staging/pdgstaging/TileStager.py", line 476, in save_tiles
    data = self.combine_and_deduplicate(data, tile_path)
  File "/Users/abihunter/projects/viz-staging/pdgstaging/TileStager.py", line 687, in combine_and_deduplicate
    gdf = dedup_method(gdf, **dedup_config)
  File "/Users/abihunter/projects/viz-staging/pdgstaging/Deduplicator.py", line 421, in deduplicate_neighbors
    to_return = label_duplicates(to_return, prop_duplicated)
  File "/Users/abihunter/projects/viz-staging/pdgstaging/Deduplicator.py", line 682, in label_duplicates
    duplicates[prop_duplicated] = True
  File "/Users/abihunter/projects/viz-info/env/lib/python3.11/site-packages/geopandas/geodataframe.py", line 1525, in __setitem__
    super().__setitem__(key, value)
  File "/Users/abihunter/projects/viz-info/env/lib/python3.11/site-packages/pandas/core/frame.py", line 3980, in __setitem__
    self._set_item(key, value)
  File "/Users/abihunter/projects/viz-info/env/lib/python3.11/site-packages/pandas/core/frame.py", line 4187, in _set_item
    self._set_item_mgr(key, value)
  File "/Users/abihunter/projects/viz-info/env/lib/python3.11/site-packages/pandas/core/frame.py", line 4152, in _set_item_mgr
    self._check_setitem_copy()
  File "/Users/abihunter/projects/viz-info/env/lib/python3.11/site-packages/pandas/core/generic.py", line 4213, in _check_setitem_copy
    warnings.warn(t, SettingWithCopyWarning, stacklevel=find_stack_level())
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/warnings.py", line 109, in _showwarnmsg
    sw(msg.message, msg.category, msg.filename, msg.lineno,
  File "/Users/abihunter/projects/viz-info/helpful-code/workflow.py", line 39, in warn_with_traceback
    traceback.print_stack(file=log)
/Users/abihunter/projects/viz-info/env/lib/python3.11/site-packages/geopandas/geodataframe.py:1525: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

We should double-check that this code does what we expect, and update it so that the warning is no longer raised.
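For reference, here is a minimal sketch of the pattern the warning recommends. It is not the actual `label_duplicates` code: the function name `label_rows`, the boolean mask argument, and the column name `staging_duplicated` are all assumptions made for illustration. The idea is to write the flag into the full frame with a single `.loc` call, rather than assigning into a subset (`duplicates[prop_duplicated] = True`), which pandas may treat as a copy.

```python
import pandas as pd


def label_rows(df, is_duplicate, prop_duplicated="staging_duplicated"):
    """Hypothetical helper: flag the rows selected by a boolean mask.

    Works on an explicit copy so the caller's frame is left untouched,
    and sets values with .loc instead of assigning into a slice.
    """
    labeled = df.copy()
    # Default every row to "not a duplicate".
    labeled[prop_duplicated] = False
    # Single indexing operation on the full frame: no chained assignment.
    labeled.loc[is_duplicate, prop_duplicated] = True
    return labeled


# Toy data standing in for a GeoDataFrame of polygons.
df = pd.DataFrame({"id": [1, 2, 3], "area": [10.0, 10.0, 5.0]})

# Assumed rule for the example only: a repeated area counts as a duplicate.
mask = df["area"].duplicated(keep="first")

labeled = label_rows(df, mask)
print(labeled)
```

If the function really does need to split the frame into duplicate and non-duplicate subsets first, calling `.copy()` on each subset before assigning to it should also avoid the warning, at the cost of an extra copy.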

westminsterabi commented 3 months ago

This article has more info: https://www.dataquest.io/blog/settingwithcopywarning/