Implement the background update model of semantic interaction

yalibian commented 7 years ago

Implement the underlying analytic models. For now this is the basic text analysis method, which only changes the weighting vector.

Maybe a more complex ML model after the basic one.

yalibian commented 7 years ago

entity weighting scheme

yalibian commented 7 years ago

Make a new JS file to store the entity weighting scheme.

model.js

The adaptation of the analytic models over time are intended to approximate the evolution of user interest and stages of analytical reasoning. When inferring analytical reasoning in the form of a weighting scheme, a choice can be made as to how to translate an interaction to a set of dimensions and weight values. is learning can happen through cleanly inverting the mathematical model, creating a new heuristic by which to learn weights or some combination of the two.

yalibian commented 7 years ago

ForceSPIRE uses a model for learning weights that is more flexible, and thus does not adhere strictly to the inverted model (a force-directed model, in this case). For example, while performing an observation-level interaction in ForceSPIRE results in an emphasis of the similar characteristics between documents moved closer, the remaining weights are equally reduced for normalization of the global weight. As such, there is no direct inversion of the force-directed model, but instead the model is used to calculate the set of characteristics that correspond to the similarity, and the amount of emphasis those characteristics get (i.e., the increase of the weight of those entities) is via a constant.
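The update described above can be sketched as follows. This is one reading of the passage, not ForceSPIRE's actual code: the function name `emphasizeSharedEntities`, the `boost` constant of 0.1, and the flat `{ entityName: weight }` layout are all assumptions made for illustration.

```javascript
// Hypothetical helper, not taken from ForceSPIRE.
// weights: plain object { entityName: weight }, assumed to sum to 1.
// sharedEntities: entities common to the documents the user moved closer.
// boost: constant added to each shared entity (0.1 is an assumed value).
function emphasizeSharedEntities(weights, sharedEntities, boost = 0.1) {
  const updated = { ...weights };
  const has = (name) => Object.prototype.hasOwnProperty.call(updated, name);
  const shared = new Set(sharedEntities.filter(has));
  const rest = Object.keys(updated).filter((name) => !shared.has(name));

  // Emphasize the shared characteristics by a constant amount.
  for (const name of shared) updated[name] += boost;

  // Reduce every remaining weight by the same amount, never below zero.
  if (rest.length > 0) {
    const cut = (boost * shared.size) / rest.length;
    for (const name of rest) updated[name] = Math.max(0, updated[name] - cut);
  }

  // Renormalize so the global weight still sums to 1 (covers the clamped case).
  const total = Object.values(updated).reduce((sum, w) => sum + w, 0);
  if (total > 0) {
    for (const name of Object.keys(updated)) updated[name] /= total;
  }
  return updated;
}
```

Subtracting an equal amount from the remaining weights follows the "equally reduced" wording; a proportional reduction would be an equally plausible choice.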

yalibian commented 7 years ago

Users also have the ability to pin documents to specific locations. These documents serve as spatial landmarks, in that they persist at that location, and the force-directed model treats them as layout constraints, organizing the remaining documents around them. Additionally, pinning allows ForceSPIRE to distinguish between exploratory and expressive movements. Dragging a document near a pinned document will briefly color both documents pink to alert the user of the expressive movement (if the user releases the document at this location). Thus, all other movements in the space are exploratory movements.

yalibian commented 7 years ago

There are several kinds of nodes, instead of only doc-nodes with two levels of visual detail (icon level and document level):

The search node
The document node (ICON, DOCUMENT)
The entity node

yalibian commented 7 years ago

With model.js we first load the document source contents and initialize the values. Alternatively, we just call each function once the interactions are loaded.

yalibian commented 7 years ago

Make the model function a closure that exposes the following d3-style APIs:

model.nodes() // return nodes
model.links()
model.documentMovement()
model.textHighlighting()
model.searchTerms()
model.annotation()
model.undo()
model.update(weight) // Updates the node mass M in each node, and the spring force K.

The model represents node similarity and mass, which are stored in the nodes and links.
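A minimal sketch of such a closure, assuming the usual d3 getter/setter convention (no argument returns the value, an argument sets it and returns the model for chaining). Only `nodes`, `links`, `update`, and `undo` are shown; the interaction handlers are omitted. The name `createModel`, the field `link.k`, and the rule that mass and spring force are sums of entity weights are assumptions, not decisions made in this thread.

```javascript
// Hypothetical factory; field names and the mass/k formulas are assumptions.
function createModel() {
  let nodes = [];
  let links = [];
  let weights = {};   // current entity weighting vector
  const history = []; // earlier weighting vectors, used by undo()

  // Sum of the current weights of the given entity names.
  const sumWeights = (entities) =>
    (entities || []).reduce((sum, name) => sum + (weights[name] || 0), 0);

  // Recompute node mass M and spring force K from a weighting vector.
  function apply(newWeights) {
    weights = { ...newWeights };
    for (const node of nodes) node.mass = sumWeights(node.entities);
    for (const link of links) link.k = sumWeights(link.entities);
  }

  const model = {};

  model.nodes = function (value) {
    if (!arguments.length) return nodes;
    nodes = value;
    return model;
  };

  model.links = function (value) {
    if (!arguments.length) return links;
    links = value;
    return model;
  };

  model.update = function (newWeights) {
    history.push({ ...weights }); // remember the previous state for undo()
    apply(newWeights);
    return model;
  };

  model.undo = function () {
    if (history.length > 0) apply(history.pop());
    return model;
  };

  return model;
}
```

Keeping the weights and history inside the closure means callers can only change them through `update` and `undo`, which makes each interaction handler a thin wrapper around `update`.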

yalibian commented 7 years ago

The docs object we get from the backend should include the following, so that document similarity can be updated efficiently:

docs.nodes // Array of nodes (id, text, html, mass, entities)
docs.links // Array of links (source, target, similarity, entities)
docs.entities // Dictionary of entities (name, weight, alias)

yalibian commented 7 years ago

Right now, similarity is based on TF-IDF.

But we will update it to use shared entities.

yalibian commented 7 years ago

After talking with Mai and rereading the Semantic Interaction paper many times, I realized the difference between the entity weight vector and the document TF-IDF attributes. We need to use both to calculate the cosine similarity.
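One plausible way to use both, sketched below, is a weighted cosine similarity in which each entity's TF-IDF contribution is scaled by its weight. This is an assumption about how the two could be combined, not a settled design; the function name `weightedCosine` and the sparse `{ entityName: tfidf }` document layout are hypothetical.

```javascript
// Hypothetical sketch: cosine similarity with per-entity weights.
// docA, docB: sparse TF-IDF vectors as plain objects { entityName: tfidf }.
// weights: entity weighting vector { entityName: weight }; missing entries count as 0.
function weightedCosine(docA, docB, weights) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const weightOf = (name) => (has(weights, name) ? weights[name] : 0);

  let dot = 0;
  let normA = 0;
  let normB = 0;

  for (const [name, x] of Object.entries(docA)) {
    const w = weightOf(name);
    normA += w * x * x;
    if (has(docB, name)) dot += w * x * docB[name];
  }
  for (const [name, y] of Object.entries(docB)) {
    normB += weightOf(name) * y * y;
  }

  // Documents with no weighted entities are treated as dissimilar.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}
```

With all weights equal, this reduces to the ordinary cosine similarity over TF-IDF; raising the weight of an entity two documents share pulls their similarity up.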

yalibian commented 7 years ago

How to combine the entity weighting vector with the document TF-IDF attributes:

Weighted sum model
Min-Hash

yalibian commented 7 years ago

Count a group of overlapped documents once, instead of increasing the weights again and again!

overlappedDocuments = [[node1.id, node2.id, node3.id], [node4.id, node5.id]]

yalibian commented 7 years ago

Open question: should we normalize the TF-IDF values of the entity attributes?

yalibian commented 7 years ago

Use the soft cosine measure to take advantage of the "connect the dots" results.
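For reference, the soft cosine measure generalizes cosine similarity with a feature-to-feature similarity matrix S: the dot product a·b becomes the sum over i and j of S[i][j]·a[i]·b[j], and the norms change the same way. The sketch below assumes dense vectors over a shared entity index and assumes that S would be derived from the "connect the dots" results; how that matrix is built is not specified here, and the name `softCosine` is hypothetical.

```javascript
// Hypothetical sketch of the soft cosine measure.
// a, b: dense feature vectors of equal length n (e.g. weighted TF-IDF per entity).
// S: n x n matrix where S[i][j] is the similarity between features i and j
//    (S[i][i] is normally 1). With S as the identity this is plain cosine.
function softCosine(a, b, S) {
  // Bilinear form: sum over i, j of S[i][j] * x[i] * y[j].
  const form = (x, y) => {
    let sum = 0;
    for (let i = 0; i < x.length; i++) {
      for (let j = 0; j < y.length; j++) {
        sum += S[i][j] * x[i] * y[j];
      }
    }
    return sum;
  };

  const denom = Math.sqrt(form(a, a)) * Math.sqrt(form(b, b));
  return denom === 0 ? 0 : form(a, b) / denom;
}
```

Two documents that mention different but related entities get a non-zero score here, whereas the ordinary cosine would score them as 0.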

yalibian / CrowdSPIRE

Implement the background update model of semantic interaction #7