This feature is something that I outlined in the project README.md, and I think it's something we might want to develop sooner rather than later.
BBD, or 'big brother database', is really an idea completely borrowed (stolen) from Emacs (BBDB). BBD keeps track of the users it comes across and keeps information about them. In the Emacs world, that would be things you could associate with an email address, typically learned from examining email messages: who you converse with, how often, and so on. I'm not sure how they structure that information internally, but we're probably going to build our version of BBD right into our property graph instead of reinventing the wheel. We're already doing the relationship inference and network building, but we are not collecting any additional information about each person.
Things that could be useful to know about a person:
[ ] Social media links (Twitter handle, GitHub profile, etc.)
[ ] E-mail addresses
With social media links you could extend the graph with new connections, or enhance existing vertices and edges with relevant information/weights.
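As a rough illustration of what "extending the graph" could look like, here is a minimal in-memory sketch. Everything in it (`PersonVertex`, `PropertyGraph`, `add_social_link`, the `claims_account` edge type, and the weight of 1.0) is a hypothetical placeholder, not our actual property graph API.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

VertexKey = Tuple[str, str]  # (platform, username)


@dataclass
class PersonVertex:
    """Hypothetical person vertex; field names are illustrative only."""
    platform: str
    username: str
    emails: Set[str] = field(default_factory=set)
    social_links: Dict[str, str] = field(default_factory=dict)  # service -> handle


class PropertyGraph:
    """Toy in-memory graph keyed by (platform, username)."""

    def __init__(self) -> None:
        self.vertices: Dict[VertexKey, PersonVertex] = {}
        self.edges: Dict[Tuple[VertexKey, VertexKey], dict] = {}

    def upsert_person(self, platform: str, username: str) -> PersonVertex:
        """Return the vertex for this account, creating it if needed."""
        key = (platform, username)
        if key not in self.vertices:
            self.vertices[key] = PersonVertex(platform=platform, username=username)
        return self.vertices[key]

    def add_social_link(self, person: PersonVertex, service: str, handle: str) -> None:
        """Enrich the vertex and add an edge to the linked account's vertex."""
        person.social_links[service] = handle
        linked = self.upsert_person(service, handle)
        src = (person.platform, person.username)
        dst = (linked.platform, linked.username)
        # The weight is an arbitrary placeholder; a real system would derive it.
        self.edges[(src, dst)] = {"type": "claims_account", "weight": 1.0}
```

Calling `add_social_link` on a Discord user with a GitHub handle would both enrich the existing vertex and create a new vertex and edge for the GitHub account, which is the "extend or enhance" behavior described above.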
It's entirely possible that we would never learn a user's e-mail address from chat logs alone, but there are several ways around that:
[ ] Users provide their own data, which is then enriched
[ ] Expert Systems
Expert Systems
Agentic workflows could reasonably help identify users, especially users who have reused usernames across different platforms. For instance, if the user 'mfreeman451' on a Discord server talks about hockey, and a web search also reveals that this user has a login on hockeyforums.com, where we can view their profile and find an e-mail address, that would be a strong link between that username and that e-mail address. More concrete examples would be GitHub usernames, Twitter handles, and so on. Personas often reuse usernames across platforms; it is their brand, their identity.
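The username-reuse heuristic could be sketched roughly like this. The `Sighting` type, the two weights, and the 0.6 threshold are all assumptions made up for illustration; a real agentic workflow would gather the sightings from web searches and would likely use much richer evidence than topic overlap.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Sighting:
    """One observation of a username on some platform (hypothetical type)."""
    platform: str
    username: str
    topics: FrozenSet[str] = frozenset()
    email: Optional[str] = None


# Arbitrary illustrative weights, not tuned values.
USERNAME_WEIGHT = 0.4
TOPIC_WEIGHT = 0.4


def link_confidence(a: Sighting, b: Sighting) -> float:
    """Heuristic score in [0, 0.8] that two sightings are the same persona."""
    if a.username.lower() != b.username.lower():
        return 0.0
    score = USERNAME_WEIGHT  # same username on both platforms
    union = a.topics | b.topics
    if union:
        # Jaccard overlap of discussed topics, e.g. both talk about hockey.
        score += TOPIC_WEIGHT * len(a.topics & b.topics) / len(union)
    return score


def infer_email(
    known: Sighting, candidates: Iterable[Sighting], threshold: float = 0.6
) -> Optional[Tuple[str, float]]:
    """Return (email, confidence) for the best candidate at or above threshold."""
    best: Optional[Tuple[str, float]] = None
    for candidate in candidates:
        if candidate.email is None:
            continue
        score = link_confidence(known, candidate)
        if score >= threshold and (best is None or score > best[1]):
            best = (candidate.email, score)
    return best
```

With these made-up weights, a matching username alone stays below the threshold, so an e-mail address is only attached when there is corroborating evidence such as a shared topic. The returned confidence could be stored as the edge weight in the graph.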
Who this is for
Often the absence of information about a thing or person can reveal something about its legitimacy or agency. For instance, according to researcher John Bambenek, you can accurately predict that a domain is malicious if it lacks MX records, IPv6, DNSSEC, and so on. In the same sense, you might be able to make the same kind of prediction about a persona or actor.
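One very simple way to turn "absence of information" into a number is the fraction of expected signals that are missing. The sketch below is only that: the signal names in `EXPECTED_PERSONA_SIGNALS` are hypothetical, and this is not Bambenek's model or any validated classifier.

```python
from typing import Iterable, Mapping

# Hypothetical signals we might expect a genuine persona to have.
EXPECTED_PERSONA_SIGNALS = ("email", "social_links", "reused_username")


def absence_score(
    observed: Mapping[str, object],
    expected: Iterable[str] = EXPECTED_PERSONA_SIGNALS,
) -> float:
    """Fraction of expected signals that are missing or empty (0.0 to 1.0)."""
    expected = list(expected)
    if not expected:
        return 0.0
    # A signal counts as missing if the key is absent or its value is falsy.
    missing = [name for name in expected if not observed.get(name)]
    return len(missing) / len(expected)
```

A higher score means less is known about the persona. Whether that actually correlates with an actor being illegitimate is exactly the open question above, so the score would be an input for further analysis rather than a verdict.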
Emacs BBDB reference: https://www.emacswiki.org/emacs/BbdbMode