jkomoros / card-web

The web app behind thecompendium.cards
Apache License 2.0

Some way to create clusters of similar cards #572

Open jkomoros opened 2 years ago

jkomoros commented 2 years ago

Cluster cards into cliques based not just on explicit links but also high semantic overlap.

"K-means clustering"? Something like that?

Likely somewhat expensive, and likely needs some kind of tunable threshold.

Ideally this would be an explicit, configurable filter. Related to #570 in terms of use case.
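A rough sketch of what a threshold-based version could look like, as an alternative to k-means (which needs the number of clusters up front). It treats any pair of cards whose similarity is at or above the threshold as linked and returns the connected components. The `Card` shape, the Jaccard word overlap standing in for "semantic overlap", and the `clusterCards` name are all assumptions for illustration, not the app's actual fingerprint code.

```typescript
// Hypothetical card shape; the real card-web types are not shown in this thread.
type Card = { id: string; body: string };

// Lowercased set of words in a card body.
const tokens = (text: string): Set<string> =>
  new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);

// Jaccard overlap of two token sets, in [0, 1]. A stand-in for real semantic similarity.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Groups cards into connected components of the "similarity >= threshold" graph.
// Compares every pair, so it is O(n^2): this is the expensive part.
function clusterCards(cards: Card[], threshold: number): string[][] {
  // Union-find forest: parent[i] === i means i is the root of its group.
  const parent = cards.map((_, i) => i);
  const find = (i: number): number => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path halving
      i = parent[i];
    }
    return i;
  };
  const sets = cards.map((c) => tokens(c.body));
  for (let i = 0; i < cards.length; i++) {
    for (let j = i + 1; j < cards.length; j++) {
      if (jaccard(sets[i], sets[j]) >= threshold) parent[find(i)] = find(j);
    }
  }
  const groups = new Map<number, string[]>();
  cards.forEach((c, i) => {
    const root = find(i);
    const group = groups.get(root) ?? [];
    group.push(c.id);
    groups.set(root, group);
  });
  return Array.from(groups.values());
}
```

Raising the threshold splits clusters apart and lowering it merges them, which is the tunable knob. Connected components can chain loosely related cards together, so a stricter clique definition might be needed in practice.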

jkomoros commented 2 years ago

Ideally we'd find pairs of cards that share a very rare bigram (but only once) and boost that. The normal fingerprints boost the overlapping explicit concepts, but I also want to find the two cards that both mention "meta plans".
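One possible way to score that, sketched under assumptions: count how many cards each bigram appears in, then give a pair of cards a boost of log(N / df) for every bigram they share, so a bigram found in only two cards counts far more than a common one. The names `bigrams`, `bigramDocFrequency`, and `rareBigramBoost` and the `Card` shape are hypothetical, and counting each bigram once per card is just one reading of "only once".

```typescript
// Hypothetical card shape; the real card-web types are not shown in this thread.
type Card = { id: string; body: string };

// Set of adjacent word pairs in a card body, e.g. "meta plans".
// Using a Set means a bigram repeated within one card is counted once.
function bigrams(text: string): Set<string> {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  const result = new Set<string>();
  for (let i = 0; i + 1 < words.length; i++) {
    result.add(`${words[i]} ${words[i + 1]}`);
  }
  return result;
}

// Number of cards each bigram appears in (document frequency).
function bigramDocFrequency(cards: Card[]): Map<string, number> {
  const df = new Map<string, number>();
  cards.forEach((card) => {
    bigrams(card.body).forEach((bg) => df.set(bg, (df.get(bg) ?? 0) + 1));
  });
  return df;
}

// Boost for a pair of cards: sum of log(N / df) over the bigrams they share.
// Rarer shared bigrams contribute more; a bigram in every card contributes 0.
function rareBigramBoost(
  a: Card,
  b: Card,
  df: Map<string, number>,
  totalCards: number
): number {
  const bigramsB = bigrams(b.body);
  let boost = 0;
  bigrams(a.body).forEach((bg) => {
    if (bigramsB.has(bg)) {
      boost += Math.log(totalCards / (df.get(bg) ?? totalCards));
    }
  });
  return boost;
}
```

The boost would presumably be added on top of the normal fingerprint overlap rather than replacing it.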

jkomoros commented 2 years ago

The output of this would perhaps be a list where each clique is named by an identifier (maybe the ID of its most central card?). The result list would be sorted by that identifier, and each card would be labeled with its cluster ID, so the UI makes clear which card belongs to which cluster.

And then you could have another filter, or a version of the same filter, that takes a card ID and returns only the cluster that specific card is in.
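A minimal sketch of that output shape, assuming clusters arrive as arrays of card IDs and some symmetric similarity function exists. "Most central" is taken here to mean the member with the highest total similarity to the rest of its cluster, which is only one possible definition. `Similarity`, `LabeledCard`, `mostCentralCard`, `labelClusters`, and `clusterContaining` are hypothetical names.

```typescript
// Hypothetical similarity function over card IDs; any symmetric score would do.
type Similarity = (a: string, b: string) => number;
type LabeledCard = { cardID: string; clusterID: string };

// Picks the member with the highest total similarity to the other members.
// Ties go to the earliest member in the array.
function mostCentralCard(cluster: string[], sim: Similarity): string {
  let best = cluster[0];
  let bestScore = -Infinity;
  for (const id of cluster) {
    let score = 0;
    for (const other of cluster) {
      if (other !== id) score += sim(id, other);
    }
    if (score > bestScore) {
      best = id;
      bestScore = score;
    }
  }
  return best;
}

// Flattens clusters into one list sorted by cluster ID, each card carrying its label.
function labelClusters(clusters: string[][], sim: Similarity): LabeledCard[] {
  const labeled: LabeledCard[] = [];
  for (const cluster of clusters) {
    if (cluster.length === 0) continue;
    const clusterID = mostCentralCard(cluster, sim);
    for (const cardID of cluster) labeled.push({ cardID, clusterID });
  }
  return labeled.sort((x, y) =>
    x.clusterID < y.clusterID ? -1 : x.clusterID > y.clusterID ? 1 : 0
  );
}

// Variant of the filter: keeps only the cluster that contains the given card.
function clusterContaining(labeled: LabeledCard[], cardID: string): LabeledCard[] {
  const match = labeled.find((l) => l.cardID === cardID);
  return match ? labeled.filter((l) => l.clusterID === match.clusterID) : [];
}
```

Because the cluster ID is itself a card ID, the UI label could link straight to the representative card.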

jkomoros commented 2 years ago

This might require a force-directed graph layout based on similarity, to give each card a Euclidean (x, y) coordinate, followed by graph partitioning.
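A very rough sketch of the layout half only, assuming a simple spring relaxation rather than any particular force-layout library: each pair of cards is joined by a spring whose rest length is 1 minus their similarity, so similar cards settle close together. `springLayout`, the index-based `Similarity` type, and the default iteration count and step size are assumptions. The partitioning step (for example k-means over the resulting coordinates) is left out.

```typescript
type Point = { x: number; y: number };
// Hypothetical similarity in [0, 1] between cards at indices i and j.
type Similarity = (i: number, j: number) => number;

// Relaxes n points so that pairwise distances approach (1 - similarity).
function springLayout(
  n: number,
  sim: Similarity,
  iterations = 300,
  step = 0.1
): Point[] {
  // Deterministic start: points evenly spaced on the unit circle.
  const pos: Point[] = Array.from({ length: n }, (_, i) => ({
    x: Math.cos((2 * Math.PI * i) / n),
    y: Math.sin((2 * Math.PI * i) / n),
  }));
  for (let it = 0; it < iterations; it++) {
    for (let i = 0; i < n; i++) {
      for (let j = i + 1; j < n; j++) {
        const dx = pos[j].x - pos[i].x;
        const dy = pos[j].y - pos[i].y;
        // Guard against division by zero when two points coincide.
        const dist = Math.max(Math.hypot(dx, dy), 1e-9);
        // Spring rest length: similar cards want to sit close together.
        const target = 1 - sim(i, j);
        // Positive pull moves the pair together, negative pushes them apart.
        const pull = (step * (dist - target)) / dist;
        pos[i].x += pull * dx;
        pos[i].y += pull * dy;
        pos[j].x -= pull * dx;
        pos[j].y -= pull * dy;
      }
    }
  }
  return pos;
}
```

Every pair is visited on every iteration, so this is O(n^2) per pass and would likely need sampling or a nearest-neighbor cutoff for a large card collection.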