Mango-information-systems / twitto_be

twitto_be is a real-time tweets analytics dashboard
https://twitto.be

tags / mentions graph + realtime effect #147

Closed: mef closed this issue 4 years ago

mef commented 5 years ago

Experiment with the implementation of the following solution:

mef commented 5 years ago

hashtags + mentions graph

Design notes - work in progress.

graphology may be used to manage the graph data.

ETL changes (tweetStream)

When receiving a tweet from the API:

subgraph generation

The subgraph sent to the client must be limited to the top n nodes (perhaps ranked by their HITS authority measure). This subgraph does not need to be maintained in real time; one update per minute is probably sufficient.

Todo: validate that HITS can be executed with a decent performance on the full graph. Alternative: use node degree...

Update 2020-01-07: the top n nodes are determined using the degree measure. HITS is not needed, since the graph is undirected.
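
A minimal sketch of the degree-based selection described above. It does not use graphology: the graph is a plain adjacency map standing in for it, and the names `addEdge` and `topDegreeSubgraph` are hypothetical, not taken from the twitto_be code base.

```javascript
// Hypothetical stand-in for the graph: an undirected adjacency map
// (node -> Map of neighbor -> edge weight). Not the graphology API.
function addEdge(adjacency, a, b, weight = 1) {
  if (a === b) return; // ignore self-loops in this sketch
  for (const [from, to] of [[a, b], [b, a]]) {
    if (!adjacency.has(from)) adjacency.set(from, new Map());
    const neighbors = adjacency.get(from);
    neighbors.set(to, (neighbors.get(to) || 0) + weight);
  }
}

// Return the subgraph induced by the n nodes with the highest degree.
function topDegreeSubgraph(adjacency, n) {
  const top = [...adjacency.entries()]
    .sort((x, y) => y[1].size - x[1].size) // degree = number of neighbors
    .slice(0, n)
    .map(([node]) => node);
  const kept = new Set(top);
  const edges = [];
  for (const source of top) {
    for (const [target, weight] of adjacency.get(source)) {
      // Emit each undirected edge once, and only between kept nodes.
      if (kept.has(target) && source < target) {
        edges.push({ source, target, weight });
      }
    }
  }
  const nodes = top.map((id) => ({ id, degree: adjacency.get(id).size }));
  return { nodes, edges };
}
```

Since the subgraph only needs refreshing about once per minute, a function like this could be called from a timer rather than on every incoming tweet.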

data expiration

Nodes must have a last-update timestamp attribute, updated each time mergeNode is called. This attribute is used to clear records older than 24h. The good news is that dropNode also removes "all its attached edges from the graph", so there is no need to iterate over the edges in addition to the nodes.

Update 2020-01-07: the last-update timestamp has been removed from the node attributes. Cleanup is now based on the timestamp stored in the tweets array, which makes it possible to decrement edge weights in addition to updating node counts.
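
The tweet-based cleanup could look roughly like the sketch below. It is an illustration under assumptions, not the project's implementation: the store layout, the function names (`createStore`, `addTweet`, `expireTweets`) and the edge-key format are all hypothetical, plain Maps stand in for the graphology graph, and each tweet is assumed to carry a de-duplicated list of hashtags and mentions.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical in-memory store: node counts, edge weights, and the tweets
// that produced them (kept so their contribution can be undone later).
function createStore() {
  return { tweets: [], nodeCounts: new Map(), edgeWeights: new Map() };
}

// Undirected edge key: both entity names, sorted, joined with '|'.
const edgeKey = (a, b) => [a, b].sort().join('|');

function bump(map, key, delta) {
  const next = (map.get(key) || 0) + delta;
  if (next > 0) map.set(key, next);
  else map.delete(key); // drop entries whose count reaches zero
}

// Apply a tweet's entities with delta = +1 (ingest) or -1 (expire).
function applyTweet(store, entities, delta) {
  entities.forEach((a, i) => {
    bump(store.nodeCounts, a, delta);
    for (const b of entities.slice(i + 1)) {
      bump(store.edgeWeights, edgeKey(a, b), delta);
    }
  });
}

function addTweet(store, entities, timestamp) {
  store.tweets.push({ entities, timestamp });
  applyTweet(store, entities, +1);
}

// Remove tweets older than maxAge and undo their contribution.
function expireTweets(store, now, maxAge = DAY_MS) {
  // Tweets are appended in arrival order, so the oldest sit at the front.
  while (store.tweets.length && now - store.tweets[0].timestamp > maxAge) {
    const { entities } = store.tweets.shift();
    applyTweet(store, entities, -1);
  }
}
```

Keeping the timestamp on the tweet rather than on the node means an expired tweet lowers exactly the counts and weights it raised, which a per-node timestamp could not express.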

mef commented 4 years ago

Remaining work for the graph display:

mef commented 4 years ago

implemented in v4.0.0