Closed: yalibian closed this issue 7 years ago
entity weighting scheme
Make a new JS file to store the entity weighting scheme.
model.js
The adaptation of the analytic models over time is intended to approximate the evolution of user interest and the stages of analytical reasoning. When inferring analytical reasoning in the form of a weighting scheme, a choice can be made as to how to translate an interaction into a set of dimensions and weight values. This learning can happen through cleanly inverting the mathematical model, through creating a new heuristic by which to learn weights, or through some combination of the two.
ForceSPIRE uses a more flexible model for learning weights, and thus does not adhere strictly to the inverted model (a force-directed model, in this case). For example, while performing an observation-level interaction in ForceSPIRE results in an emphasis of the characteristics shared by the documents moved closer together, the remaining weights are equally reduced to normalize the global weight. As such, there is no direct inversion of the force-directed model; instead, the model is used to calculate the set of characteristics that correspond to the similarity, and the amount of emphasis those characteristics receive (i.e., the increase of the weight of those entities) is determined by a constant.
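The update rule described above can be sketched roughly as follows. The function name `updateWeights`, the constant `delta`, and the exact equal-reduction step are assumptions drawn from the description, not ForceSPIRE's actual code:

```javascript
// Sketch of the weight update described above (names are hypothetical).
// When documents are moved closer together, the weights of their shared
// entities are increased by a constant, and the remaining weights are
// equally reduced so the global (total) weight stays normalized.
function updateWeights(weights, sharedEntities, delta) {
  const updated = { ...weights };
  const shared = new Set(sharedEntities.filter(e => e in updated));

  // Emphasize the shared entities by a constant amount.
  for (const e of shared) updated[e] += delta;

  // Equally reduce all other weights so the total is unchanged.
  const others = Object.keys(updated).filter(e => !shared.has(e));
  const decrease = (delta * shared.size) / others.length;
  for (const e of others) updated[e] -= decrease;

  return updated;
}
```

Subtracting equally (rather than rescaling) keeps the total weight constant, which matches the "equally reduced for normalization of the global weight" behavior described above.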
Users also have the ability to pin documents to specific locations. These documents serve as spatial landmarks, in that they persist at that location, and the force-directed model treats them as layout constraints, organizing the remaining documents around them. Additionally, pinning allows ForceSPIRE to distinguish between exploratory and expressive movements. Dragging a document near a pinned document will briefly color both documents pink to alert the user of the expressive movement (if the user releases the document at this location). Thus, all other movements in the space are exploratory movements.
There are several kinds of nodes, rather than only doc-nodes with two levels of visual detail (icon level and document level):

- The search node
- The document node (ICON or DOCUMENT level)
- The entity node
With model.js we fetch the document source contents and initialize the values first, or simply call each function as interactions are loaded.
Make a closure of the model function, which has the following d3-style APIs:
```js
model.nodes()            // return nodes
model.links()            // return links
model.documentMovement()
model.textHighlighting()
model.searchTerms()
model.annotation()
model.undo()
model.update(weight)     // updates the node mass M in each node, and the spring force K
```
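A possible skeleton for this closure, following the d3 getter-setter convention (the internal state and the handler signatures here are placeholders, not a finished design):

```javascript
// Sketch of a d3-style closure for the model. Getter-setter methods return
// `model` when called with an argument, enabling chained configuration.
function createModel() {
  let nodes = [];
  let links = [];

  function model() {}

  model.nodes = function (_) {
    if (!arguments.length) return nodes;
    nodes = _;
    return model;
  };

  model.links = function (_) {
    if (!arguments.length) return links;
    links = _;
    return model;
  };

  // Interaction handlers that adjust the entity weighting scheme
  // (bodies left as placeholders).
  model.documentMovement = function (sourceDoc, targetDoc) { return model; };
  model.textHighlighting = function (doc, text) { return model; };
  model.searchTerms = function (terms) { return model; };
  model.annotation = function (doc, note) { return model; };
  model.undo = function () { return model; };

  // Recompute node mass M and spring force K from the weight vector.
  model.update = function (weight) { return model; };

  return model;
}
```

Usage would mirror d3 layouts, e.g. `createModel().nodes(docs.nodes).links(docs.links).update(weights)`.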
The model will help represent node similarity and mass, which will be stored in the nodes and links.
The docs object we get from the backend should include several things, so that updating document similarity runs efficiently:
```js
docs.nodes    // array of nodes (id, text, html, mass, entities())
docs.links    // array of links (source, target, similarity, entities)
docs.entities // dictionary of entities (name, weight, alias)
```
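An illustrative shape for this object (all field values are made up; entities are shown here as plain arrays, though the list above suggests they may be accessor functions on nodes):

```javascript
// Hypothetical example of the docs object returned by the backend.
const docs = {
  nodes: [
    { id: 'd1', text: 'raw text', html: '<p>raw text</p>', mass: 1.0,
      entities: ['alice', 'berlin'] },
    { id: 'd2', text: 'other text', html: '<p>other text</p>', mass: 1.0,
      entities: ['berlin'] }
  ],
  links: [
    // One link per document pair, carrying the similarity and shared entities.
    { source: 'd1', target: 'd2', similarity: 0.42, entities: ['berlin'] }
  ],
  entities: {
    // Keyed by name for O(1) weight lookups during updates.
    alice:  { name: 'alice',  weight: 0.5, alias: [] },
    berlin: { name: 'berlin', weight: 0.5, alias: [] }
  }
};
```

Keeping `entities` as a dictionary (rather than an array) is what makes repeated weight lookups cheap when similarity is recomputed.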
Right now, similarity is based on TF-IDF, but we will update it to use shared entities.
After the talk with Mai, and countless re-readings of Semantic Interaction, I realized the difference between the entity weight vector and the document TF-IDF attributes. We need to use both to calculate the cosine similarity.
How to combine the entity weighting vector with the document TF-IDF attributes:

- Weighted sum model
- Min-Hash
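One way the weighted-sum idea could look: compute cosine similarity over the TF-IDF vectors, with each term scaled by its entity weight. The function name, the default weight of 1 for non-entity terms, and the multiplicative scaling are all assumptions for illustration:

```javascript
// Sketch: cosine similarity over TF-IDF vectors, where each term is scaled
// by its entity weight (terms that are not entities default to weight 1).
function weightedCosine(tfidfA, tfidfB, entityWeights) {
  let dot = 0, normA = 0, normB = 0;
  const terms = new Set([...Object.keys(tfidfA), ...Object.keys(tfidfB)]);
  for (const t of terms) {
    const w = entityWeights[t] !== undefined ? entityWeights[t] : 1;
    const a = w * (tfidfA[t] || 0);
    const b = w * (tfidfB[t] || 0);
    dot += a * b;
    normA += a * a;
    normB += b * b;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Raising an entity's weight makes documents sharing that entity look more similar, which is exactly the lever the semantic-interaction updates need to pull.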
Count a group of overlapped documents instead of updating the weights again and again!
```js
overlappedDocuments = [
  [node1.id, node2.id, node3.id],
  [node4.id, node5.id]
]
```
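A grouping like the one above could be built by clustering nodes whose positions overlap, so a single weight update is applied per group. This is a simple single-link sketch; the function name and the `radius` threshold are assumptions:

```javascript
// Sketch: group node ids whose positions are within `radius` of each other
// (transitively), so one weight update can cover a whole pile of documents.
function groupOverlapping(nodes, radius) {
  const groups = [];
  const assigned = new Array(nodes.length).fill(false);
  for (let i = 0; i < nodes.length; i++) {
    if (assigned[i]) continue;
    const group = [i];
    assigned[i] = true;
    // Grow the group transitively: anything overlapping a member joins it.
    for (let g = 0; g < group.length; g++) {
      const a = nodes[group[g]];
      for (let j = 0; j < nodes.length; j++) {
        if (assigned[j]) continue;
        const dx = a.x - nodes[j].x, dy = a.y - nodes[j].y;
        if (Math.sqrt(dx * dx + dy * dy) <= radius) {
          group.push(j);
          assigned[j] = true;
        }
      }
    }
    groups.push(group.map(k => nodes[k].id));
  }
  return groups;
}
```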
Decide whether we should normalize the TF-IDF of the entity attributes.
Use the soft cosine measure to make use of the "connect the dots" results (see: soft cosine measure).
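For reference, the soft cosine measure generalizes cosine similarity with a term-term similarity function, so related-but-different terms (e.g. entities linked by "connect the dots") still contribute. A minimal sketch, assuming a user-supplied `sim(ti, tj)` that returns 1 for identical terms:

```javascript
// Sketch of the soft cosine measure. `sim(ti, tj)` gives the similarity
// between two terms; with sim = exact match, this reduces to plain cosine.
function softCosine(vecA, vecB, sim) {
  const terms = [...new Set([...Object.keys(vecA), ...Object.keys(vecB)])];
  let num = 0, normA = 0, normB = 0;
  for (const ti of terms) {
    for (const tj of terms) {
      const s = sim(ti, tj);
      num += s * (vecA[ti] || 0) * (vecB[tj] || 0);
      normA += s * (vecA[ti] || 0) * (vecA[tj] || 0);
      normB += s * (vecB[ti] || 0) * (vecB[tj] || 0);
    }
  }
  if (normA === 0 || normB === 0) return 0;
  return num / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

With plain cosine, documents sharing no terms score 0; here, two documents mentioning *related* entities can still score above 0, which is what lets "connect the dots" links influence the layout.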
Implement the underlying analytic models (right now, only the basic text-analysis method, which only helps change the weighting vector).
Maybe a more complex ML model after the basic one.