Closed jGaboardi closed 1 week ago
How come the sum of value counts above is different?
How come the sum of value counts above is different?
In [1]: (549 + 125 + 95) == (549 + 209 + 11)
Out[1]: True
In [2]: (549 + 125 + 95)
Out[2]: 769
In [3]: (549 + 209 + 11)
Out[3]: 769
They are equivalent, no?
I can't count....
So if x
is passed in as a list
then the condition can potentially evaluate as True
(otherwise the condition will always evaluate to False
no matter if 'new'
is present). But after combing through the code base I am not finding anywhere we are doing that. @martinfleis Any more thoughts off the top of your head here?
While this does not have a direct impact on geometries being generated, it does have a meaningful impact on geometry labeling, which then could affect the algorithm (from my understanding).
I'd say we need to figure this out before I continue with testing/refactor and transfer functionality to simplification.core
.
No idea...
But after combing through the code base I am not finding anywhere we are doing that.
Confirmed programmatically there are no instances of passing in a list
So I guess we need to determine if our logic of "new" vs. "changed" is sound. Currently:
def _status(x: pd.Series) -> str:
"""Determine the status of edge line(s)."""
if len(x) == 1:
return x.iloc[0]
if "new" in x:
# This logic is here just to be safe. It will be hit if we create a new line
# and in a subsequent step extend it which is not what normally happens.
# All the new bits are caught likely by the first ``if``.
return "new"
return "changed"
... a linestring geometry is considered "new" only if len(x) == 1
and "new" is that value. Since the second conditional is flawed.
So:
@martinfleis @anastassiavybornova Do yall concur with this logic?
I think that's right, yes.
Following #83 (and @martinfleis's explanation there) I want to triple check the correctness of the
'new' in ...
condition innodes._status()
. Currently this will never evaluate toTrue
since it is passed in as apandas.Series
:For the small test case of Apalachicola, FL the results of
'_status'
change as follows:current
main
--'new' in x
:updated condition --
'new' in x.values
:This of course affects the full scale FUA testing for
'_status'
labeling.xref:
92
The plot here highlights the difference where:
It does appear that the results from the original condition are the most correct where lines that are extended are not marked as entirely new. So I am pretty sure our current implementation is accurate, but we should be absolutely sure.
Data: