carverauto / threadr

🌎 OSS Real-time AI Data Analysis with GraphDB integration. 🔍
Apache License 2.0
17 stars 1 fork

bbd: Big Brother Database #53

Open mfreeman451 opened 7 months ago

mfreeman451 commented 7 months ago

This feature is something I outlined in the project README.md, and I think it's something we might want to develop sooner rather than later.

BBD, or 'big brother database', is really an idea completely borrowed (stolen) from Emacs, where the package is called BBDB. It keeps track of the users it comes across and stores information about them. In the Emacs world, that means things you can associate with an email address, typically learned from examining email messages: who you converse with, how often, and so on. I'm not sure how they structure that information internally, but we will probably build our version of BBD right into our property graph instead of reinventing the wheel. We're already doing the relationship inference and network building, but we are not yet collecting any additional information about each person.

https://www.emacswiki.org/emacs/BbdbMode
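
As a rough illustration of keeping BBD-style attributes directly on graph vertices, here is a minimal in-memory sketch. `Person`, `ContactGraph`, `observe_message`, and `annotate` are hypothetical names made up for this example, not existing threadr code, and a real version would write to the actual property graph rather than Python dicts.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Person:
    """One vertex per user; extra BBD-style facts live in `properties`."""
    username: str
    properties: dict = field(default_factory=dict)  # e.g. email, social links
    message_count: int = 0


class ContactGraph:
    """Tiny in-memory stand-in for the property graph (assumption)."""

    def __init__(self):
        self.people = {}
        # (sender, recipient) -> number of messages observed on that edge
        self.talks_to = defaultdict(int)

    def person(self, username):
        # Create the vertex the first time we come across this user.
        if username not in self.people:
            self.people[username] = Person(username)
        return self.people[username]

    def observe_message(self, sender, recipient):
        # Track who talks to whom, and how often.
        self.person(sender).message_count += 1
        self.person(recipient)
        self.talks_to[(sender, recipient)] += 1

    def annotate(self, username, key, value):
        # Attach an additional fact to an existing (or new) vertex.
        self.person(username).properties[key] = value


graph = ContactGraph()
graph.observe_message("alice", "bob")
graph.observe_message("alice", "bob")
graph.observe_message("bob", "alice")
graph.annotate("alice", "github", "alice-example")
```

The edge counter doubles as a weight, so the "how often" part of BBD falls out of the same structure we already use for relationship inference.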

Things that could be useful to know about a person:

With social media links you could extend the graph with new connections, or enhance existing vertices and edges with relevant information/weights.

It's entirely possible that we would never learn a user's e-mail address from chat logs alone, but there are several ways around that:

Expert Systems

Agentic workflows could reasonably help identify users, especially users who have re-used usernames across different platforms. For instance, if the user 'mfreeman451' on a Discord server talks about hockey, and a web search also reveals a user with the same login on hockeyforums.com, where we can view their profile and find an e-mail address, that would be a strong link between that username and that e-mail address. More concrete examples would be GitHub usernames, Twitter handles, and so on. Personas often re-use usernames across platforms; it is their brand, their identity.
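
The username re-use idea could be sketched roughly like this: collect sightings of a handle from different platforms, normalize the handle, and group everything that matches. `Sighting`, `normalize`, and `link_identities` are hypothetical names, and the e-mail address is a placeholder. A matching username is only a candidate link, so a real workflow would want more evidence before merging vertices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sighting:
    """One place a username was seen (all fields are illustrative)."""
    platform: str
    username: str
    email: Optional[str] = None


def normalize(username):
    # Treat "@MFreeman451" and "mfreeman451" as the same handle.
    return username.strip().lstrip("@").lower()


def link_identities(sightings):
    """Group sightings by normalized username, collecting platforms and e-mails."""
    linked = {}
    for s in sightings:
        entry = linked.setdefault(
            normalize(s.username), {"platforms": set(), "emails": set()}
        )
        entry["platforms"].add(s.platform)
        if s.email:
            entry["emails"].add(s.email)
    return linked


sightings = [
    Sighting("discord", "mfreeman451"),
    Sighting("hockeyforums.com", "MFreeman451", email="user@example.com"),
    Sighting("discord", "someone_else"),
]
linked = link_identities(sightings)
```

Here `linked["mfreeman451"]` ends up with both platforms and the placeholder e-mail, which is the kind of property we could then attach to the existing vertex.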

Who this is for

Often the absence of information about a thing or person can reveal something about its legitimacy or agency. For instance, according to researcher John Bambenek, you can accurately predict that a domain is malicious if it does not have MX records, use IPv6, DNSSEC, etc. In the same sense, you might be able to make the same kind of prediction about a persona or actor.
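
One simple way to turn "absence of information" into a number is to count how many expected signals are missing. This is only a sketch: `missing_signal_ratio` and the signal names are hypothetical, and no threshold here comes from Bambenek's research or from threadr.

```python
def missing_signal_ratio(observed, expected):
    """Return the fraction of expected signals that are absent or falsy."""
    if not expected:
        return 0.0
    missing = [name for name in expected if not observed.get(name)]
    return len(missing) / len(expected)


# Domain example, using the signals mentioned above (names are assumptions).
DOMAIN_SIGNALS = ("mx_records", "ipv6", "dnssec")
domain = {"mx_records": False, "ipv6": False, "dnssec": True}
domain_ratio = missing_signal_ratio(domain, DOMAIN_SIGNALS)

# Persona example: which signals matter is an open question, so these are guesses.
PERSONA_SIGNALS = ("email", "github", "other_platforms")
persona = {"email": "user@example.com", "github": "example", "other_platforms": 2}
persona_ratio = missing_signal_ratio(persona, PERSONA_SIGNALS)
```

A higher ratio means more of the expected footprint is missing. Whether that actually predicts anything for personas is something we would have to validate against real data.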