tgxn / lemmy-explorer

Instance and Community Explorer for Lemmy
https://lemmyverse.net/

Sorting/Scoring System For Instances #77

Closed tgxn closed 1 year ago

tgxn commented 1 year ago

Discussed in https://github.com/tgxn/lemmy-explorer/discussions/23

Originally posted by **tgxn** June 14, 2023

Because we need to determine whether an instance is "good", there needs to be a way to score each instance based on the data we have about it.

Currently, my thinking/implementation looks at the lists of federated sites and scores each instance based on the number of other instances that refer to it _(in the linked, allowed and blocked lists)_.

Scoring is applied by the following rules:

## Instances

```js
let score = 0;
// +1 for each instance that lists this one as linked
if (linkedFederation[siteBaseUrl]) {
  score += linkedFederation[siteBaseUrl];
}
// +2 for each instance that lists this one as allowed
if (allowedFederation[siteBaseUrl]) {
  score += allowedFederation[siteBaseUrl] * 2;
}
// -10 for each instance that lists this one as blocked
if (blockedFederation[siteBaseUrl]) {
  score -= blockedFederation[siteBaseUrl] * 10;
}
```

## Communities

Uses the same base score as instances, and then adjusts based on a posts per subscriber metric.

```js
let score = 0;
if (linkedFederation[siteBaseUrl]) {
  score += linkedFederation[siteBaseUrl];
}
if (allowedFederation[siteBaseUrl]) {
  score += allowedFederation[siteBaseUrl] * 2;
}
if (blockedFederation[siteBaseUrl]) {
  score -= blockedFederation[siteBaseUrl] * 10;
}
// also scale the score by the community's subscriber count
score = score * community.counts.subscribers;
```

These rules are obviously not ideal; I'd need to run some more analysis to determine whether they are tuned correctly.

I'm also thinking that it might be worthwhile to log an "uptime" or "first seen" score, to determine whether an instance has been around/up for a while.

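For anyone who wants to experiment with the weights, the rules above can be wrapped into a small self-contained sketch. The sample hostnames, the `WEIGHTS` object and the function names `instanceScore` / `communityScore` are made up for illustration and are not taken from the repository. The three federation maps are assumed to be plain objects mapping a base URL to the number of instances that list it.

```js
// Hypothetical sample data: base URL -> number of instances listing it.
const linkedFederation = { "lemmy.example": 40, "spam.example": 5 };
const allowedFederation = { "lemmy.example": 3 };
const blockedFederation = { "spam.example": 12 };

// Weights copied from the rules in the post; kept in one place for tuning.
const WEIGHTS = { linked: 1, allowed: 2, blocked: -10 };

// Base score for an instance (illustrative helper, not the project's code).
function instanceScore(siteBaseUrl) {
  const linked = linkedFederation[siteBaseUrl] ?? 0;
  const allowed = allowedFederation[siteBaseUrl] ?? 0;
  const blocked = blockedFederation[siteBaseUrl] ?? 0;
  return (
    linked * WEIGHTS.linked +
    allowed * WEIGHTS.allowed +
    blocked * WEIGHTS.blocked
  );
}

// Community score: the instance's base score multiplied by subscriber count.
function communityScore(siteBaseUrl, community) {
  return instanceScore(siteBaseUrl) * community.counts.subscribers;
}

console.log(instanceScore("lemmy.example")); // 40 + 3 * 2 = 46
console.log(instanceScore("spam.example")); // 5 - 12 * 10 = -115
console.log(communityScore("lemmy.example", { counts: { subscribers: 100 } })); // 4600
```

One thing this makes visible: because the community score is a plain product, a community on an instance with a negative base score sinks further the more subscribers it has.
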
jfryton commented 1 year ago

I think active users per week would be the best default sorting. This avoids over-emphasizing communities that might be older but not as active.
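A rough sketch of what that default could look like, assuming each record carries a `counts.users_active_week` number (the field name follows Lemmy's aggregate counts as far as I know, but treat it as an assumption). The sample communities and the `sortByWeeklyActive` helper are hypothetical.

```js
// Hypothetical records; only the fields needed for sorting are shown.
const communities = [
  { name: "old_but_quiet", counts: { subscribers: 9000, users_active_week: 12 } },
  { name: "new_and_busy", counts: { subscribers: 800, users_active_week: 350 } },
  { name: "mid_sized", counts: { subscribers: 3000, users_active_week: 90 } },
];

// Return a new array sorted by weekly active users, highest first.
// Records missing the field are treated as having 0 active users.
function sortByWeeklyActive(items) {
  return [...items].sort(
    (a, b) => (b.counts.users_active_week ?? 0) - (a.counts.users_active_week ?? 0)
  );
}

console.log(sortByWeeklyActive(communities).map((c) => c.name));
// [ 'new_and_busy', 'mid_sized', 'old_but_quiet' ]
```

Here the community with the most subscribers ends up last, since almost nobody was active in it during the week.
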

tgxn commented 1 year ago

Upgraded scoring is now deployed, along with additional sorting options: https://github.com/tgxn/lemmy-explorer/pull/143