pysal / libpysal

Core components of Python Spatial Analysis Library
http://pysal.org/libpysal
Other
256 stars 78 forks source link

Edit island neighbors values in spatial Wobject Pysal #303

Closed MYMahfouz closed 4 years ago

MYMahfouz commented 6 years ago

I am working on clustering a map with some islands in it. These islands have no neighbors in the w object and so I can't run the maxp clustering function. I get the command ''No initial solution found''. What I tried to do is to calculate the neighbors using the distance spatial weights, then substituting the island value in the congenital spatial weight. The new W object still causes the same problem I used to get before, I am not sure where the problem comes from. Here is the piece of code

''

shape=ps.open('file.dbf') shape=shape.by_col_array('Solar') shape=shape.astype(float) hape[np.isnan(shape)]=0 scaler=MinMaxScaler() shape1=scaler.fit_transform(shape) w=ps.weights.Queen.from_shapefile('file.shp') knn5=ps.knnW_from_shapefile('file.shp') aa=w.islands mtrx,idx=w.full() for indx,val in enumerate(aa):

    w.neighbors[aa[indx]]=[knn5.neighbors[aa[indx]][0]]
    w.neighbors[knn5.neighbors[aa[indx]][0]]=w.neighbors[knn5.neighbors[aa[indx]][0]]+[aa[indx]]

w1=ps.weights.weights.W(w.neighbors,id_order=w.id_order,ids=w.id_order) thr=0.1*sum(shape1[:,0]) np.random.seed(1234) r=ps.region.maxp.Maxp(w1,shape1,floor=thr,floor_variable=shape1[:,0],initial=4000)

Then I get the command ''No initial solution found''

'' Any help please?

ljwolf commented 6 years ago

I'd recommend using the methods in pysal.weights.Wsets.

for this case, it might make sense to use ps.weights.Wsets.w_union(knn, w) to merge the KNN and queen weights.

MYMahfouz commented 6 years ago

@ljwolf thanks for the suggestion but I already tried the union and still the same problem I also tried editing the weight matrix itself and then using it to create a new w object with ps.weights.util.full2W, but still the same problem I don't get where this comes from

ljwolf commented 6 years ago

When you do use w_union, are there still islands?

ljwolf commented 6 years ago

If you'd like to double, check that with scipy, you can also use:

import scipy.sparse.csgraph as cg

cg.connected_components(W.sparse)

that returns a tuple and a set of labels telling you which observation falls in which component.

MYMahfouz commented 6 years ago

Yes, I still get the islands. I believe this is because when I use the union, the islands ID get neighbors, but then the islands ID are not added to the neighbors set of the other polygons, as the islands are not the closest centroid to the other polygons.

ljwolf commented 6 years ago

There should be no islands with a KNN(1).

Can you check if there are islands in the KNN1?

ljwolf commented 6 years ago

If you'd like to symmetrize the KNN, you can use KNN.symmetrize(). That + the union should ensure mutual linkage.

MYMahfouz commented 6 years ago

Sorry for the confusion. The union shows no islands anymore but my problem with Maxp, is still the same. It shows ''No initial solution found''

MYMahfouz commented 6 years ago

These are the neighbors after union. 0: [16, 38, 24, 74, 88, 95], 1: [62, 79], 2: [32, 51, 21], 3: [35, 69, 39, 94], 4: [56, 77, 23], 5: [90, 75, 60, 63], 6: [99, 53], 7: [48, 66, 72, 58], 8: [82, 19, 36, 41, 30, 47], 9: [80, 50, 85, 44], 10: [67], 11: [62, 31], 12: [42, 68], 13: [64, 96, 36, 38, 57, 28], 14: [34, 84], 15: [48, 18, 72, 74, 59, 61], 16: [0, 83, 54, 71, 88, 92, 95], 17: [88, 43, 37], 18: [48, 70, 25, 58, 61, 15], 19: [36, 71, 8, 73, 57, 41], 20: [82, 87, 45, 30], 21: [2, 51, 68, 42], 22: [81, 61, 69, 70, 93], 23: [97, 98, 4, 56], 24: [0, 37, 88, 89, 74, 59], 25: [18, 70, 39, 86, 58], 26: [54, 90, 92, 60], 27: [76, 31], 28: [64, 36, 13, 47], 29: [81, 65, 50, 85], 30: [82, 20, 8, 41, 45], 31: [67, 11, 27, 76], 32: [2, 91], 33: [78], 34: [52, 14, 84], 35: [3, 86, 39], 36: [19, 8, 57, 28, 13, 47], 37: [80, 17, 24, 89], 38: [0, 96, 72, 57, 74, 13, 95], 39: [3, 35, 69, 70, 86, 25], 40: [99, 84, 6], 41: [19, 8, 73, 75, 30], 42: [12, 21, 68], 43: [88, 17, 83], 44: [80, 89, 50, 9, 93], 45: [20, 87, 91, 30], 46: [11], 47: [82, 36, 8, 28], 48: [18, 7, 72, 58, 15], 49: [63], 50: [81, 93, 85, 9, 44, 29], 51: [2, 68, 21], 52: [34, 84], 53: [99, 6, 55], 54: [16, 83, 26, 92], 55: [4, 53], 56: [97, 4, 23, 77], 57: [19, 36, 38, 71, 13, 95], 58: [48, 18, 7, 25], 59: [24, 89, 74, 61, 15], 60: [26, 90, 5], 61: [18, 70, 22, 89, 59, 93, 15], 62: [1, 11], 63: [49, 75, 5], 64: [96, 66, 28, 13], 65: [81, 29, 94], 66: [64, 96, 7, 72], 67: [10, 76, 31], 68: [51, 21, 42, 12], 69: [81, 3, 70, 39, 22, 94], 70: [18, 69, 22, 39, 25, 61], 71: [16, 19, 73, 92, 95, 57], 72: [96, 48, 66, 38, 7, 74, 15], 73: [19, 71, 41, 90, 75, 92], 74: [0, 38, 72, 24, 59, 15], 75: [5, 73, 90, 63, 41], 76: [67, 27, 31], 77: [56, 4], 78: [33], 79: [1], 80: [37, 89, 44, 9], 81: [65, 50, 29, 69, 22, 93, 94], 82: [20, 8, 30, 47], 83: [16, 54, 88, 43], 84: [34, 52, 40, 14], 85: [9, 50, 29], 86: [25, 35, 39], 87: [91, 20, 45], 88: [0, 16, 17, 83, 24, 43], 89: [80, 61, 37, 24, 59, 44, 93], 90: [5, 73, 26, 75, 92, 60], 91: [32, 45, 87], 92: [16, 26, 54, 71, 73, 90], 93: [81, 50, 22, 89, 44, 61], 94: [81, 65, 3, 69], 95: [0, 16, 38, 71, 57], 96: [64, 66, 38, 72, 13], 97: [56, 23], 98: [23], 99: [40, 53, 6]}

As you can see at ID_Variable 46 (which was the island) has ID_Variable 11 now as neighbor. But ID_Variable 11 doesn't have 46 as a neighbor. That's what I meant earlier when I said the union doesn't solve the issue.

ljwolf commented 6 years ago

Gotcha. Try KNN.symmetrize() to make sure that the KNN weights are symmetric.

MYMahfouz commented 6 years ago

I can't symmetrize I get an error 'KNN' object has no attribute 'symmetrize'

ljwolf commented 6 years ago

OK. sorry about that, might be development version only then.

try:

knn1 = ps.weights.KNN.from_shapefile('solar.shp')
queen = ps.weights.Queen.from_shapefile('solar.shp')

symm_knn = ps.weights.WSP((knn1.sparse + knn1.sparse.T)>0).to_W()

fully_connected = ps.weights.w_union(symm_knn, queen)
MYMahfouz commented 6 years ago

The neighbors now are fixed, but I still get the same output command from the Maxp "No initial solution found"

weikang9009 commented 6 years ago

Maybe try the new function attach_islands in libpysal? It will ensure the returned spatial weight symmetric and exempt from islands. It is available in the released version 3.0.6 of libpysal.

MYMahfouz commented 6 years ago

I just tried it it also didn't work, same Maxp problem.

I have a comment though, there is something missing in the example mentioned for attach_islands function. It should be like this:

import libpysal.api as ps from libpysal import weights w=ps.rook_from_shapefile(libpysal.examples.get_path('10740.shp')) w_knn1=ps.knnW_from_shapefile(libpysal.examples.get_path('10740.shp'),k=1) w_attach=weights.util.attach_islands(w, w_knn1)

MYMahfouz commented 6 years ago

Hi @ljwolf and @weikang9009 I totally removed the island polygon from the shape file and I still get the same error so I believe the W object is not the problem. What reasons could cause Maxp to give the command ''No initial solution''? I went through the code source and found that this command comes out when the MAX_ATTEMPTS are over, would increasing the attempts solve my problem? Maybe if I understand where the error comes from I can also manipulate my data in a way to avoid the error.

knaaptime commented 6 years ago

would it be possible to post a map of your dataset? Max-p will fail if there are disconnected components in the connectivity matrix.... It's not your W per se (there are no individual islands, as each observation has at least one neighbor), but it's possible that there are small groups of observations that touch each other but are separated from everything else.

so for instance, Max-p will fail in the example below because the two groups of census tracts circled near the bottom are disjoint from the larger region

image

MYMahfouz commented 6 years ago

@knaaptime Thanks a lot, that is my problem then. I didnt know that before. Is there an attribute in the W object to identify these isolated observation, without drawing the map?

knaaptime commented 6 years ago

Maybe we should leave this open?

In cases like this it would be useful if weights objects had a fully_connected attribute or similar. That way, e.g., maxp would have an easy check to raise a more informative error (and maybe an option to connect the graph for users who know what they're doing)

I think i had a more detailed conversation with @ljwolf about this but i cant seem to find it. I think spenc handles all of this internally?

MYMahfouz commented 6 years ago

@knaaptime I closed it because I figured out I have another problem pysal/libpysal#302 mentioned here which also causes the present of more disconnected attributes. And I did a simple code to get rid disconnected areas but it doesn't work on my big shapefile attached in the issue mentioned. It would be really great if I could get help on figuring out a solution for figuring out existing neighbor before separated areas, so any help/ideas would be great.

MYMahfouz commented 6 years ago

This is the piece of code I am using now to manipulate both the lack of neighbors because they simply reall don't exist or because they exist but pysal can't figure it out. It works fine but I would like any updates or ideas from your side @ljwolf @knaaptime


w = ps.weights.Rook.from_shapefile('./Output/Datavalid.shp')
[tt, A]=cg.connected_components(w.sparse)
aa=w.islands
if tt>1:
    for yy in range(0,tt):
        B,b=np.unique(A, return_counts=True)
        y=B[np.where(b==max(b))]
        if yy==y[0]:
            continue
        z=pd.DataFrame(columns=['z'])
        print(yy)
        for o in range(0,len(A)):
            if A[o]==yy:
                z.loc[o]=o
        if len(z)==1:
            knn1 = ps.weights.KNN.from_shapefile('./Output/Datavalid.shp',k=1)
            for indx,val in enumerate(aa):
                w.neighbors[aa[indx]]=[knn1.neighbors[aa[indx]][0]]+w.neighbors[aa[indx]]
                w.neighbors[knn1.neighbors[aa[indx]][0]]=w.neighbors[knn1.neighbors[aa[indx]][0]]+[aa[indx]]
        else:
            knn2 = ps.weights.KNN.from_shapefile('./Output/Datavalid.shp',k=len(z))
            bb=pd.DataFrame(columns=['bb'])
            for e in range(len(w.neighbors)):
                if len(w.neighbors[e])==(len(z)-1):
                    bb.loc[e]=e
            counter=0
            for cc in range(len(bb)-1):
                counter=counter+1
                for dd in range(counter,len(bb)):
                    if np.all(w.neighbors[bb.index[cc]]==bb.index[dd]):
                        w.neighbors[bb.index[cc]]=knn2.neighbors[bb.index[cc]]
                        w.neighbors[knn2.neighbors[bb.index[cc]][(len(z)-1)]]=w.neighbors[knn2.neighbors[bb.index[cc]][(len(z)-1)]]+[bb.index[cc]]      

The main problem with this now is comands like[tt, A]=cg.connected_components(w.sparse) , don't show the updates done in w.neighbors. same for w.islands .

MYMahfouz commented 6 years ago

This code totally solves the problem in all conditions @ljwolf @knaaptime please correct me if I am wrong.

import pysal as ps
import pandas as gpd
import scipy.sparse.csgraph as cg
import libpysal
import pandas as pd

result = gpd.read_file('result.shp')
w = ps.weights.Queen.from_shapefile("result.shp")
knn1 = ps.weights.KNN.from_shapefile("result.shp",k=1)
w = libpysal.weights.util.attach_islands(w, knn1)
[tt, A]=cg.connected_components(w.sparse)
for gg in range(tt):
    ss=[uu for uu, x in enumerate(A==gg) if x]
    dd=result.loc[ss]
    dd['F']=1
    dd=dd.dissolve(by='F')
    dd.index=[len(result)]
    Dissolve=result.drop(ss)
    Dissolve=Dissolve.append(dd)
    knn1 = ps.weights.KNN.from_dataframe(Dissolve,k=1)
    for cc in range(1,len(result)):
        countern=0
        knn = ps.weights.KNN.from_dataframe(result,k=cc)
        for s in range(len(ss)):
            if knn.neighbors[ss[s]][cc-1]==knn1.neighbors[len(result)][0]:
                w.neighbors[ss[s]]=w.neighbors[ss[s]]+knn1.neighbors[len(result)]
                w.neighbors[knn1.neighbors[len(result)][0]]=w.neighbors[knn1.neighbors[len(result)][0]]+[ss[s]]
                countern=countern+1
                continue
        if countern>0:
            break