Open ljwolf opened 9 months ago
What about a keyword in symmetrize controlling if the symmetrization happens using addition or removal of asymmetric edges? It can be then exposed in the KNN builder via a keyword (feels better than another builder).
That's a good idea imho!
😆 i have an example in my new graph materials
(note we still dont have symmetrize in the Graph yet)
We probably want to think about the API, then... I would prefer something like the following, implemented in the utils.py module, amenable for any weight type:
def make_symmetric(w, drop=False):
    if drop:
        # keep only mutual neighbors
        ...
    else:
        # add all reversed links to edge set
        ...
Also, people often do (W+W.T)/2 to symmetrize weight values (in either inclusive or exclusive cases). This is not possible to induce via a standardisation trick, and idk how we might want to implement that...?
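To make the two edge-set options and the averaging step concrete, here is a minimal sketch on a dense NumPy array. It is only an illustration under stated assumptions: `make_symmetric` is the hypothetical helper proposed above (not an existing libpysal function), zero is assumed to mean "no edge", and a real implementation would presumably work on the Graph's own sparse storage.

```python
import numpy as np


def make_symmetric(w, drop=False):
    """Symmetrize the edge set of a dense weights array (hypothetical helper).

    Assumes w[i, j] != 0 means there is an edge from i to j.
    """
    w = np.asarray(w, dtype=float)
    has_edge = w != 0          # i -> j exists
    has_reverse = has_edge.T   # j -> i exists
    if drop:
        # keep only mutual neighbors: i -> j survives only if j -> i exists too
        return np.where(has_edge & has_reverse, w, 0.0)
    # add all reversed links to the edge set, reusing the weight of the
    # existing direction for each newly added link
    return np.where(has_edge, w, w.T)


# Toy graph: 0 -> 1 (weight 1), 1 -> 0 (weight 3), 1 -> 2 (weight 2)
W = np.array([
    [0.0, 1.0, 0.0],
    [3.0, 0.0, 2.0],
    [0.0, 0.0, 0.0],
])

mutual = make_symmetric(W, drop=True)      # exclusive: the 1 -> 2 link is dropped
inclusive = make_symmetric(W, drop=False)  # inclusive: a 2 -> 1 link is added

# Averaging symmetrizes the weight *values* as well, in either case.
averaged_inclusive = (W + W.T) / 2
averaged_mutual = (mutual + mutual.T) / 2
```

Note that `inclusive` has a symmetric edge set but still carries asymmetric values (1 vs. 3 between nodes 0 and 1), which is why the `(W+W.T)/2` step is a separate question from adding or removing edges.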
Are we talking about implementing it in weights or graph? If the latter, then I'd like to have it as a method.
definitely graph. I'd like to avoid implementing things in weights.
@knaaptime completely off topic, but since I've noticed it in your code above: getting focals out of the Graph via neighbors is going to be super inefficient, especially compared to .unique_ids, which gives you nearly the same thing. g_knn10.unique_ids.to_series() gives you the same output, but on a graph I have loaded in memory right now with 13890804 edges, the neighbors path takes 20.1s while unique_ids takes 186ms.
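A rough sketch of why the two paths differ so much, using plain pandas as a stand-in. The layout here (a weight Series indexed by a `(focal, neighbor)` MultiIndex) and all variable names are assumptions for illustration, not a statement about how Graph computes `neighbors` or `unique_ids` internally.

```python
import pandas as pd

# Toy edge list, assumed layout: one row per edge, indexed by (focal, neighbor).
adjacency = pd.Series(
    [1.0, 1.0, 1.0, 1.0],
    index=pd.MultiIndex.from_tuples(
        [("a", "b"), ("a", "c"), ("b", "a"), ("c", "a")],
        names=["focal", "neighbor"],
    ),
    name="weight",
)

# Slow path: materialise every neighbor list, then keep only the keys.
# The work grows with the number of edges, and all the lists are thrown away.
neighbors = {
    focal: tuple(group.index.get_level_values("neighbor"))
    for focal, group in adjacency.groupby(level="focal")
}
focals_from_neighbors = pd.Index(list(neighbors), name="focal")

# Fast path: read the distinct focal ids straight from the index.
focals_from_index = adjacency.index.get_level_values("focal").unique()
```

Both paths end up with the same ids here; the difference is that the first one builds a Python object per focal just to discard it.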
lol i noticed that right after i posted that photo and updated it to unique ids. thanks for the close eye!
For a different causal inference project, I'm writing a few spatial/feature matching algorithms.
I think we may want to offer a Mutual_KNN() constructor in Graph, and also bring a Symmetric_KNN()? Or have symmetric/mutual options in a knn constructor?

This is like the current .symmetrize() function, but instead of adding edges to the KNN graph to induce symmetry, it removes edges from the KNN graph that are not mutually k-near. This could also be implemented as a separate function for arbitrary graphs after weighting/kerneling, since it's just based on the edge set: