kbastani closed this issue 9 years ago
This feature depends on completion of this project: https://github.com/kbastani/neo4j-mazerunner
No longer relevant. See https://github.com/Graphify/graphify/issues/19 for the new milestone functional specification.
The classification accuracy in the 1.0.0 build maxes out at 70% for sentiment analysis on movie reviews in the Cornell dataset.
The following feature enhancement is proposed for increasing the accuracy to over 75%.
Add a `HAS_AFFINITY` relationship to the Neo4j property graph between Pattern nodes. The `weight` property is incremented each time two patterns are matched within the same input. Using this new data model it is possible to run a PageRank calculation on the subgraph of features/patterns matched on an input.
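The idea above can be sketched in plain Python, outside of Neo4j. This is only an illustration under stated assumptions, not the Graphify implementation: the helper names `record_affinity` and `pagerank`, the pattern ids `p1` to `p3`, the damping factor, and the choice to treat `HAS_AFFINITY` as an undirected weighted edge are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def record_affinity(weights, matched_patterns):
    """Increment the affinity weight for every pair of patterns matched in one input."""
    for a, b in combinations(sorted(set(matched_patterns)), 2):
        weights[(a, b)] += 1


def pagerank(nodes, weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over the affinity subgraph of `nodes`."""
    nodes = sorted(set(nodes))
    if not nodes:
        return {}
    # Assumption: affinity is symmetric, so each pair becomes an edge in both directions.
    adjacency = {n: {} for n in nodes}
    for (a, b), w in weights.items():
        if a in adjacency and b in adjacency:
            adjacency[a][b] = w
            adjacency[b][a] = w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without any edge is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not adjacency[n])
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for n in nodes:
            total = sum(adjacency[n].values())
            for m, w in adjacency[n].items():
                # Each node passes its rank to neighbours in proportion to edge weight.
                new_rank[m] += damping * rank[n] * w / total
        rank = new_rank
    return rank


# Two hypothetical inputs: the first matches three patterns, the second matches two.
weights = defaultdict(int)
record_affinity(weights, ["p1", "p2", "p3"])
record_affinity(weights, ["p1", "p2"])
scores = pagerank(["p1", "p2", "p3"], weights)
```

In this sketch `p1` and `p2` co-occur twice, so their shared edge carries a weight of 2 and both rank above `p3`. In the proposed data model, the same counts would live on the `weight` property of `HAS_AFFINITY` relationships rather than in a Python dictionary.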
When extracting features from the following input:
The last word in a sentence is interesting
The following JSON map describes the `frequency` (number of matches on the input), `variance` (statistical variance of distribution to all training labels), and `affinity` (the result of PageRank on affinity relationships in the subgraph).