GuyAllard / markov_clustering

markov clustering in python
MIT License

how can i get the before node name? #8

Open hurun opened 6 years ago

hurun commented 6 years ago

When I use the commands below, I want to get the node names from before clustering. How should I program this?

result = mc.run_mcl(matrix)
clusters = mc.get_clusters(result)

Moonire commented 6 years ago

Please specify your inputs and the output you want, because I'm not sure exactly what you mean.

Mataivic commented 6 years ago

I think I have the same issue.

I previously built a network with networkx from a pandas adjacency table:

import networkx as nx
import pandas as pd
Matrice = pd.read_csv('adj_matrix.csv', sep=',', index_col=0, header=0)
network = nx.from_pandas_adjacency(df=Matrice, create_using=nx.Graph())

My labels are formatted as follows: 'OTUxx', where xx is a number of one or more digits.

      | OTU85 | OTU58 | OTU95 | ...
______________________________________
OTU85 | 
OTU58 |          the adjacency table
OTU95 | 
...

When I apply the MCL algorithm and draw the clustered graph (with random positions) with node labels:

import random
import markov_clustering as mc
matrix = nx.to_scipy_sparse_matrix(network)
result = mc.run_mcl(matrix)
clusters = mc.get_clusters(result)
positions_random = {i:(random.random() * 2 - 1, random.random() * 2 - 1) for i in range(network.number_of_nodes())}
mc.draw_graph(matrix, clusters, pos=positions_random, node_size=25, with_labels=True, edge_color="silver")

The labels aren't the same anymore: nodes are only labelled by numbers, starting at 0. I don't know whether the change happens in nx.to_scipy_sparse_matrix(), mc.run_mcl(), or mc.get_clusters().

How can I map my initial labels to the final labels?

Moonire commented 6 years ago

I see what the problem is, thank you @Mataivic. I'll look into it in the coming days. This issue will stay open in the meantime.

Feel free to submit a workaround if you find one, or a PR if you can add it as an option.

GuyAllard commented 6 years ago

The scipy matrices used by the library don't support row and column names, so that information is not carried through. The order of rows and columns is unchanged, so the index of every node stays consistent. I think the issue can be solved by allowing a list of node labels to be passed to mc.draw_graph.
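
Because the indices stay consistent, the clusters can also be translated back to labels outside the library. A minimal sketch, assuming the label list is taken in the same order as the matrix rows (for a networkx graph converted without a nodelist argument, that would be list(network.nodes())). The helper name label_clusters and the sample data are hypothetical and not part of markov_clustering:

```python
def label_clusters(clusters, labels):
    """Translate clusters of row indices into clusters of node labels.

    clusters: iterable of tuples of integer indices, in the shape
              returned by mc.get_clusters(result)
    labels:   sequence of node names in the same order as the matrix rows
    """
    return [tuple(labels[i] for i in cluster) for cluster in clusters]


# Hypothetical stand-ins for real data. With a networkx graph, labels
# could come from list(network.nodes()), assuming the matrix was built
# with the default node order.
labels = ["OTU85", "OTU58", "OTU95", "OTU12"]
clusters = [(0, 2), (1, 3)]

named_clusters = label_clusters(clusters, labels)
print(named_clusters)
```

This only relabels the output after clustering; it does not change what run_mcl or get_clusters do.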

hkarakurt commented 4 years ago

Hello, has this issue been solved? I also need node labels or names instead of indices in the results. I cannot draw the graph because my network is really big (~17000 nodes).
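
For a network too large to draw, the index-to-label lookup can produce a plain table instead of a plot. A rough sketch under the same assumption that the label list follows the matrix row order. The names cluster_membership and write_membership are hypothetical helpers, and the sample data stands in for real output of mc.get_clusters:

```python
import csv
import io


def cluster_membership(clusters, labels):
    """Map each node label to the ids of the clusters that contain it.

    A list of ids is kept per node so that a node appearing in more
    than one cluster is not silently overwritten.
    """
    membership = {}
    for cluster_id, cluster in enumerate(clusters):
        for index in cluster:
            membership.setdefault(labels[index], []).append(cluster_id)
    return membership


def write_membership(membership, handle):
    """Write one 'node,cluster' row per membership to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["node", "cluster"])
    for label, cluster_ids in membership.items():
        for cluster_id in cluster_ids:
            writer.writerow([label, cluster_id])


# Hypothetical stand-ins for real data.
labels = ["OTU85", "OTU58", "OTU95", "OTU12"]
clusters = [(0, 2), (1, 3)]

membership = cluster_membership(clusters, labels)

# An in-memory buffer keeps the sketch self-contained; an opened file
# such as open("clusters.csv", "w", newline="") would work the same way.
buffer = io.StringIO()
write_membership(membership, buffer)
print(buffer.getvalue())
```

The work is a single pass over the clusters, so it should scale to tens of thousands of nodes without needing draw_graph at all.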