hpi-epic / gpucsl

Constraint-based Causal Structure Learning on GPUs.
MIT License

Nodes are not labeled in output directed_graph #19

Closed dalhoomist closed 1 year ago

dalhoomist commented 1 year ago

Hello,

Thank you for putting together this package. I am testing it out and it is very fast.

My question is about the data input. I tried passing in a pandas DataFrame, and the algorithm runs fine with the GaussianPC function. However, I get a directed_graph with no node names. The output of directed_graph.nodes() is a series of numbers whose count equals the number of columns in my input. Do the column names in the input correspond directly to the node numbers in the directed_graph?

Thank you

ChristopherSchmidt89 commented 1 year ago

You are right, we omit node names and use the columns' indexes (zero-indexed) instead. Hence, the node numbers in the directed_graph correspond to the columns in the order provided in the data input: node 0 is the first column, node 1 the second, and so on.
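
For anyone who wants the column names back on the graph, here is a minimal sketch of the mapping described above. It assumes only that directed_graph is a networkx DiGraph whose nodes are the zero-based column indexes; the graph below is a hand-built stand-in rather than actual package output, and the column names are made up for illustration.

```python
import networkx as nx
import pandas as pd

# Hypothetical input data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "age": [31.0, 45.0, 52.0, 60.0],
        "bmi": [22.1, 27.4, 25.0, 30.2],
        "glucose": [5.1, 6.3, 5.8, 7.0],
    }
)

# Stand-in for the directed_graph returned by the package:
# nodes are the zero-based column indexes of the input.
directed_graph = nx.DiGraph()
directed_graph.add_nodes_from(range(df.shape[1]))
directed_graph.add_edges_from([(0, 1), (2, 1)])

# Map index -> column name, in the order the columns were provided.
index_to_name = dict(enumerate(df.columns))

# relabel_nodes with copy=True returns a new graph and leaves the original untouched.
labeled_graph = nx.relabel_nodes(directed_graph, index_to_name, copy=True)

print(list(labeled_graph.nodes()))
print(list(labeled_graph.edges()))
```

The mapping is only valid if the column order of the DataFrame is the same as when the data was passed to the algorithm.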

dalhoomist commented 1 year ago

Thank you for the clarification

dalhoomist commented 1 year ago

Another question. The network that results from running GaussianPC is a directed graph, but even though I provide a correlation matrix as part of the input, it is unweighted. Is there a way to generate a weighted graph? Or is it acceptable to assign the correlation values as weights manually? Finally, is there a way to export the graph as a weighted directed graph that can be edited/previewed/analyzed in Cytoscape?

ChristopherSchmidt89 commented 1 year ago

The package does not support generating a weighted graph. It only returns the estimated CPDAG, similar to the R pcalg package.

The directed graph is of type DiGraph from the networkx package. Hence, you can add the weights that you require manually. networkx also supports converting the graph data into a format for Cytoscape, e.g., see the cytoscape_data function of networkx.
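
A rough sketch of both steps, adding weights manually and converting with cytoscape_data, might look like the following. The graph and data are hand-built stand-ins, not package output, and using the pairwise correlation as the edge weight is just one possible choice of annotation (an assumption here, not something the package prescribes).

```python
import json

import networkx as nx
import pandas as pd

# Hypothetical input data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "age": [31.0, 45.0, 52.0, 60.0],
        "bmi": [22.1, 27.4, 25.0, 30.2],
        "glucose": [5.1, 6.3, 5.8, 7.0],
    }
)
corr = df.corr()  # pairwise Pearson correlations, indexed by column name

# Stand-in for the (already relabeled) directed graph.
graph = nx.DiGraph()
graph.add_edges_from([("age", "bmi"), ("glucose", "bmi")])

# Attach the correlation of each edge's endpoints as a "weight" attribute.
for u, v in graph.edges():
    graph[u][v]["weight"] = float(corr.loc[u, v])

# Convert to the Cytoscape JSON structure (a plain dict).
cyjs = nx.cytoscape_data(graph)

# Serialize; the resulting text can be saved to a file for import into Cytoscape.
cyjs_text = json.dumps(cyjs, indent=2)
print(cyjs_text)
```

The edge attributes, including weight, are carried over into the data entry of each edge in the exported structure.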

dalhoomist commented 1 year ago

Thanks very much for the information and guidance.