maxitg / SetReplace

C++/Wolfram Language package for exploring set and graph rewriting systems
MIT License
217 stars 44 forks source link

WolframModelFeatureExtractor #459

Open maxitg opened 3 years ago

maxitg commented 3 years ago

The problem

We need to understand how to use machine learning to classify, find anomalies, etc. for Wolfram model rules and evolutions. Ultimately, we want to be able to feed Wolfram model states/evolutions to neural nets, so we probably need to look into graph neural networks.

But first, we can implement a simpler FeatureExtractor based on graph properties such as average vertex degree, etc.

Given that some features could be computationally expensive, it would make sense to have multiple FeatureExtractorss with different computational complexity. (If we want to do universe filtering, for example, we can then use them in multiple steps.)

Possible solution

It seems the best approach would be to create a separate function (which should go to the Kernel directory).

Perhaps it would have DownValues with the default parameters (e.g., PerformanceGoal) so that one can pass it directly into, say, Predict.

It would be useful for it to also have an operator form so that one can use it as

Predict[..., WolframModelFeatureExtractor[opts]]
mrektor commented 3 years ago

This is a proof of concept of how one could think about implementing such an extractor, it is very simple:

simpleLabeledExtractor[g_Graph] :=
 <|
  "VertexCount" -> VertexCount[g],
  "EdgeCount" -> EdgeCount[g],
  "Radius" -> GraphRadius[g],
  "Diameter" -> GraphDiameter[g],
  "VertexConnectivity" -> VertexConnectivity[g],
  "EdgeConnectivity" -> EdgeConnectivity[g],
  "VertexDegreesQuantiles" ->  Quantile[VertexDegree[g], {0, 0.25, 0.50, 0.75, 1}]
  |>
simpleExtractor[g_Graph] := Flatten@Values[simpleLabeledExtractor[g]]

I like the idea of Associations because we can then in the future always easily go back to the feature meaning if we needed to.

From this form, we can then decide to add more or less expensive algorithms (Note that already Radius and Diameter might be expensive to compute for large graphs, even though approximations exists)

maxitg commented 3 years ago

@mrektor, keep in mind, Wolfram model states in general are hypergraphs, not graphs, so we need hypergraph versions of these functions.

We can also do this for causal graphs/expressions-events graphs instead.