Open dinosg opened 3 years ago
See #41. I think we should make the error message more useful, but this is likely a problem with your source or target geometries containing overlaps. If these are indeed Census blocks/vtds, then you likely have duplicates. Let me know if this helps!
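To illustrate why overlapping targets could lead to a "duplicate axis" error, here is a minimal standard-library sketch. It is not maup's implementation: the shapes are hypothetical axis-aligned boxes, and `covers`, `sources` and `targets` are made-up names. It only shows that when two targets overlap, a source sitting in the overlap is covered by both, so it shows up twice once the per-target results are combined.

```python
from collections import Counter

# Hypothetical stand-ins for real geometries: each shape is an
# axis-aligned box given as (minx, miny, maxx, maxy).
def covers(outer, inner):
    """Return True if box `outer` fully contains box `inner`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

sources = {
    "block_1": (0, 0, 1, 1),
    "block_2": (2, 0, 3, 1),
    "block_3": (4, 0, 5, 1),
}

# vtd_A and vtd_B overlap for x between 2 and 3, exactly where block_2 sits.
targets = {
    "vtd_A": (0, 0, 3, 1),
    "vtd_B": (2, 0, 5, 1),
}

# Collect one (source, target) pair for every target that covers a source.
pairs = [
    (source_id, target_id)
    for target_id, target_box in targets.items()
    for source_id, source_box in sources.items()
    if covers(target_box, source_box)
]

# A source that appears more than once would become a duplicate index label.
counts = Counter(source_id for source_id, _ in pairs)
duplicated = sorted(s for s, n in counts.items() if n > 1)
print(duplicated)
```

If the combined result is later reindexed by source ID, a label that appears twice is the kind of thing pandas refuses to reindex from.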
These ARE census blocks being mapped to VTDs. However, the VTDs (for Texas, from the MGGG state archive for 2010) were 'buffered' to avoid a point defect that prevented a Graph from being built. Using the unmodified MGGG VTD archive for TX might be a workaround; we'll see.
The idea is then to map the 2010 census blocks to the 2020 census VTDs, so that I have one database with both the 2010 AND 2020 population stats and can do interesting population-change comparisons.
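For what it's worth, once a clean block-to-VTD assignment exists, the aggregation step itself is simple. Below is a rough sketch in plain Python, assuming a hypothetical `assignment` mapping and made-up population numbers (none of these values come from real data). With real GeoDataFrames the same idea would normally be a groupby on the assignment.

```python
from collections import defaultdict

# Hypothetical block -> VTD assignment (in practice, the result of an
# assignment step once the geometries are clean).
assignment = {"block_1": "vtd_A", "block_2": "vtd_A", "block_3": "vtd_B"}

# Made-up 2010 block populations and 2020 VTD populations, for illustration only.
pop2010_by_block = {"block_1": 120, "block_2": 80, "block_3": 200}
pop2020_by_vtd = {"vtd_A": 230, "vtd_B": 190}

def aggregate(values, assignment):
    """Sum per-block values into per-VTD totals, skipping unassigned blocks."""
    totals = defaultdict(int)
    for block_id, value in values.items():
        target_id = assignment.get(block_id)
        if target_id is not None:
            totals[target_id] += value
    return dict(totals)

pop2010_by_vtd = aggregate(pop2010_by_block, assignment)

# Population change per VTD between the two censuses.
change = {
    vtd: pop2020_by_vtd[vtd] - pop2010_by_vtd.get(vtd, 0)
    for vtd in pop2020_by_vtd
}
print(pop2010_by_vtd)
print(change)
```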
VEST already did this, I believe.
Do you have a link for that repo? What I see at the general link https://dataverse.harvard.edu/file.xhtml?fileId=5007853&version=17.0 is 2020 election results, but nothing that obviously combines 2010 and 2020 demographics. PA is missing, incidentally. I just got the comprehensive precinct results from the PA Secretary of State. Is there anyone I should send those to so they can be integrated with other datasets?
There is also their repo with "crosswalks" at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/T9VMJO but it has not been updated with 2020 census population data; it has just the block shapes and 2019 ACS data.
Sorry for the delay -- I thought that VEST had this data prepared, but maybe not.
I had this same issue. I checked the axes of my dataframes and no duplicates exist.
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_3132/626298941.py in
~\anaconda3\lib\site-packages\maup\crs.py in wrapped(*args, **kwargs) 12 ) 13 ) ---> 14 return f(*args, **kwargs) 15 16 return wrapped
~\anaconda3\lib\site-packages\maup\assign.py in assign(sources, targets) 10 target that covers the most of its area. 11 """ ---> 12 assignment = assign_by_covering(sources, targets) 13 unassigned = sources[assignment.isna()] 14 assignments_by_area = assign_by_area(unassigned, targets)
~\anaconda3\lib\site-packages\maup\assign.py in assign_by_covering(sources, targets) 20 def assign_by_covering(sources, targets): 21 indexed_sources = IndexedGeometries(sources) ---> 22 return indexed_sources.assign(targets) 23 24
~\anaconda3\lib\site-packages\maup\indexed_geometries.py in assign(self, targets) 46 ) 47 ] ---> 48 assignment = pandas.concat(groups).reindex(self.index) 49 return assignment 50
~\anaconda3\lib\site-packages\pandas\core\series.py in reindex(self, index, **kwargs) 4578 ) 4579 def reindex(self, index=None, **kwargs): -> 4580 return super().reindex(index=index, **kwargs) 4581 4582 @deprecate_nonkeyword_arguments(version=None, allowed_args=["self", "labels"])
~\anaconda3\lib\site-packages\pandas\core\generic.py in reindex(self, *args, **kwargs) 4816 4817 # perform the reindex on the axes -> 4818 return self._reindex_axes( 4819 axes, level, limit, tolerance, method, fill_value, copy 4820 ).__finalize__(self, method="reindex")
~\anaconda3\lib\site-packages\pandas\core\generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy) 4837 4838 axis = self._get_axis_number(a) -> 4839 obj = obj._reindex_with_indexers( 4840 {axis: [new_index, indexer]}, 4841 fill_value=fill_value,
~\anaconda3\lib\site-packages\pandas\core\generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups) 4881 4882 # TODO: speed up on homogeneous DataFrame objects -> 4883 new_data = new_data.reindex_indexer( 4884 index, 4885 indexer,
~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy, consolidate, only_slice) 668 # some axes don't allow reindexing with dups 669 if not allow_dups: --> 670 self.axes[axis]._validate_can_reindex(indexer) 671 672 if axis >= self.ndim:
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in _validate_can_reindex(self, indexer) 3783 # trying to reindex on an axis with duplicates 3784 if not self._index_as_unique and len(indexer): -> 3785 raise ValueError("cannot reindex from a duplicate axis") 3786 3787 def reindex(
ValueError: cannot reindex from a duplicate axis
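Unique index labels do not by themselves rule out overlapping shapes, which is the cause suggested earlier in this thread. As a rough way to picture that second check, here is a standard-library sketch that looks for pairs of targets with a positive-area overlap. The boxes and the `overlap_area` helper are hypothetical stand-ins; real VTD polygons would need a geometry library to do the same comparison.

```python
from itertools import combinations

def overlap_area(a, b):
    """Area shared by two axis-aligned boxes (minx, miny, maxx, maxy)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)

# Hypothetical targets: the IDs are all unique, yet vtd_A and vtd_B overlap.
# vtd_B and vtd_C only share an edge, which has zero area.
targets = {
    "vtd_A": (0, 0, 3, 1),
    "vtd_B": (2, 0, 5, 1),
    "vtd_C": (5, 0, 6, 1),
}

# Every pair of distinct targets that shares a region of positive area.
overlapping = [
    (name_a, name_b)
    for (name_a, box_a), (name_b, box_b) in combinations(targets.items(), 2)
    if overlap_area(box_a, box_b) > 0
]
print(overlapping)
```

An empty list here would suggest the targets are at least pairwise non-overlapping; a non-empty one points at the shapes worth inspecting.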
the shapefiles I used were at: https://github.com/mggg-states/TX-shapefiles
Then maup.assign just crashes after spending a while getting through the assignments. Example:
In [10]: assign1 = maup.assign(blocks20, vtds10)
100%|██████████| 8941/8941 [11:36<00:00, 12.85it/s]
Traceback (most recent call last):
  File "", line 1, in
    assign1 = maup.assign(blocks20, vtds10)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/maup/crs.py", line 14, in wrapped
    return f(*args, **kwargs)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/maup/assign.py", line 12, in assign
    assignment = assign_by_covering(sources, targets)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/maup/assign.py", line 22, in assign_by_covering
    return indexed_sources.assign(targets)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/maup/indexed_geometries.py", line 42, in assign
    assignment = pandas.concat(groups).reindex(self.index)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/series.py", line 4579, in reindex
    return super().reindex(index=index, **kwargs)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py", line 4810, in reindex
    axes, level, limit, tolerance, method, fill_value, copy
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py", line 4834, in _reindex_axes
    allow_dups=False,
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py", line 4880, in _reindex_with_indexers
    copy=copy,
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 663, in reindex_indexer
    self.axes[axis]._validate_can_reindex(indexer)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3785, in _validate_can_reindex
    raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis