twitter / the-algorithm

Source code for Twitter's Recommendation Algorithm
https://blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm
GNU Affero General Public License v3.0

How to use Twitter follower graph to identify subject matter experts topically - From the perspective of a PhD/Data Scientist #1861

Open nathan-oplus opened 1 year ago

nathan-oplus commented 1 year ago

The Problem. As a Twitter user, I am very interested in what thought leaders have to say, especially about their given area of expertise. I also recognize that Twitter is unique among all social networks because it seems all the world's experts flock to the platform to argue about current events. I would like a version of the Twitter algorithm that prioritizes comments from these subject matter experts.

But how can you algorithmically define an expert?

Proposed formulation

I propose that what makes the accounts of thought leaders exceptional is that other experts engage with them more often. The big names in machine learning, for example (think Andrew Ng, Yann LeCun), follow only a small number of other accounts. Those accounts are most likely some of the best ML accounts to follow. It seems to me this will be true of many important subjects today.

If you take any news item, let's say the recent submersible implosion, it would be really interesting to see what the materials experts have to say about it, what the submersible experts have to say, and so on.

Suppose you took the follower graph and assigned each edge a weight equal to the ratio of followers to follows of the account doing the following. For example, @AndrewYNg follows this account: @JackK. @AndrewYNg has almost 900k followers and is following only 700 people, so this follow is very meaningful. Let's make an equation:

w_e = n_followers/n_follows

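Here w_e is the weight of the edge, n_followers is the number of followers of the account doing the following (@AndrewYNg in this example, ~900k), and n_follows is the number of accounts that @AndrewYNg follows (~700). That makes w_e roughly 1,300 for every account @AndrewYNg follows.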

If you then snip every edge with w_e below a threshold value (some number greater than 1, let's say 5 or 10), you'll get a new follower graph. This graph should represent a map of the meaningful Twitter voices.
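To make the weighting and snipping concrete, here is a minimal sketch in Python. The account names `JackK` and `casual_user`, their follower counts, and the function names are all hypothetical; only the ~900k / 700 figures come from the example above. The threshold of 10 is one of the values suggested, not a tuned number.

```python
from typing import Dict, List, Tuple

# account -> (n_followers, n_follows). Toy numbers, assumed for illustration;
# only the AndrewYNg figures come from the post.
accounts: Dict[str, Tuple[int, int]] = {
    "AndrewYNg": (900_000, 700),
    "JackK": (5_000, 400),
    "casual_user": (150, 2_000),
}

# Directed follow edges: (follower, followed).
follows: List[Tuple[str, str]] = [
    ("AndrewYNg", "JackK"),
    ("JackK", "AndrewYNg"),
    ("casual_user", "AndrewYNg"),
    ("casual_user", "JackK"),
]


def edge_weight(follower: str, accounts: Dict[str, Tuple[int, int]]) -> float:
    """w_e = n_followers / n_follows of the account doing the following."""
    n_followers, n_follows = accounts[follower]
    if n_follows == 0:
        return 0.0  # defensive: an account with an outgoing edge follows at least one
    return n_followers / n_follows


def prune(
    follows: List[Tuple[str, str]],
    accounts: Dict[str, Tuple[int, int]],
    threshold: float,
) -> Dict[Tuple[str, str], float]:
    """Keep only edges whose weight is at least the threshold."""
    kept = {}
    for follower, followed in follows:
        weight = edge_weight(follower, accounts)
        if weight >= threshold:
            kept[(follower, followed)] = weight
    return kept


kept = prune(follows, accounts, threshold=10)
```

With these toy numbers, both edges from `casual_user` (weight 0.075) are snipped, while the edges between `AndrewYNg` and `JackK` survive. Note that every edge leaving the same account gets the same weight, so the threshold effectively decides which accounts' follows count at all.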

You might then do some clustering in the graph, maybe make some vector embeddings of the posts by these accounts, and it's not a stretch of the imagination to think you'd be able to come up with topical clusters of the accounts that matter for any subject that people talk about.
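As a very rough sketch of the clustering step, the simplest thing one could try is grouping accounts that stay connected after pruning. This uses plain connected components, which is a stand-in chosen for brevity and not the embedding-based clustering mentioned above; the edge list and account names are hypothetical.

```python
from collections import defaultdict, deque
from typing import Iterable, List, Set, Tuple


def connected_clusters(edges: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group accounts into clusters, ignoring the direction of each follow."""
    neighbours = defaultdict(set)
    for follower, followed in edges:
        neighbours[follower].add(followed)
        neighbours[followed].add(follower)

    seen: Set[str] = set()
    clusters: List[Set[str]] = []
    for start in sorted(neighbours):  # sorted for a deterministic cluster order
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component: Set[str] = set()
        while queue:  # breadth-first walk over one component
            node = queue.popleft()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        clusters.append(component)
    return clusters


# Hypothetical edges that survived pruning.
pruned_edges = [("ml_a", "ml_b"), ("ml_b", "ml_c"), ("econ_a", "econ_b")]
clusters = connected_clusters(pruned_edges)
```

On a real follower graph a single bridge edge would merge two topics, so this would likely need a proper community detection method or the post embeddings to separate subjects.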

How powerful would it be to take any given topic and get a stream of the biggest voices in that area automatically?

Isn't this what everyone is trying to do with their follows anyway??

Why I say this is democratic

The beauty of this method is that it finds experts by the number of people that have followed that account. These are the votes of the people. People listen to Andrew Ng; this is why he has followers. This is not arbitrary, and there was no man behind the curtain who needed to deem Andrew the voice of truth. Therefore, assigning more weight to his interactions/follows makes sense because the people have given him that weight with their voice and follows. This could effectively turn Twitter into a self-regulating thought ecosystem able to find authoritative opinions from thought leaders in real time as events unfold.

ParkerRex commented 1 year ago

:/

nathan-oplus commented 1 year ago

Do you disagree with something in particular?

TolstoyDotCom commented 1 year ago

Your authoritarian mindset is very appealing to Twitter employees: they think it should be a site with sheeple passively consuming the great thoughts of "experts".

Don't just take it from me, here's a tweet that Twitter provided as a suggested response to things you don't understand: "This data doesn't seem to come from a reputable source so I'd suggest further research. If there are any experts please send recommendations on where to go to learn more. I'll keep this thread updated with any great articles I find" [1]. A lot of people can see what's wrong with that quote, but Twitter can't.

As for your specific proposal, if we wanted to get the scoop on politics, we could start with Philip Bump of the WaPo. His network includes some of the top reporters around from the NYT, WaPo, LAT, AP, etc. Twitter (and presumably you) think that's all you need. Except, Bump and the rest constantly deceive by omission about countless subjects.

Some might suggest also adding in Fox News reporters to get a more balanced view, but Fox has the same issue. In fact, on various topics, both they and the NYT deceive by omission about the same things.

In any contentious field, relying on a network of experts is very comforting for those like Twitter employees, but it leads to ignoring all the things the experts don't want people to know.

ADDED: I'll give an example of how those like @nathan-oplus literally killed people (not him personally, just those with his mindset).

California gov Gavin Newsom decided which groups got doses in which order based on politics; he was quite open about vaxing by "equity". Vaxing by politics is the opposite of "following the science". Newsom let millions at low ICU risk get doses months before millions at med/hi ICU risk. I'm in the last group and I drove to AZ for my doses; I got them weeks or months before I would have got them from Newsom.

Using CDC stats, it's easy to show that Newsom needlessly increased deaths (details on request).

When Newsom first released his vax tiers ("My Turn CA") I set up a website in opposition. I tried to buy ads promoting the site but was refused due to the site being about C19. I started a petition at change dot org; they deleted it and refused to respond to my emails and calls. Meanwhile, Dove and other profiteers were given free rein to make bank off the pandemic.

And, Newsom has never been held to account for killing people. GOP/Fox won't mention it because they have to pander to anti-mask/anti-vax wackos. The Dems/media/NGOs/etc won't mention it because Newsom is on their team.

[1] that was provided in an image on Twitter's site; I transcribed it and saved it but a search doesn't bring up any results so if anyone knows the URL please post it.

nathan-oplus commented 1 year ago

You assume a lot about my world view, but I was reading the science at the time of the Covid-19 vaccine and was deeply troubled by the way it was handled. This system would be very uncomfortable for the Twitter 1.0 team for one simple reason: it removes the power to arbitrate the truth from their hands. There is no man behind the curtain being the Orwellian thought police. It is the users themselves that assign weight to different voices.

For example Dr. Jay Bhattacharya was censored and demoted from expert to taboo due to his contrarian views on Covid. This would not have happened under the proposed algorithm. Only if this professor was unfollowed by the other experts who respect him and the masses would his voice be censored.

You seem to propose that there is no such thing as a thought leader or a subject matter expert: that every voice should bear equal weight on every matter, or else everyone is just being sheeple and not thinking for themselves. I strongly disagree. Thought leaders are not the problem. It's the act of finding these leaders that is the problem. Having a small number of people deciding who is and who is not a leader is the problem. This is why democracies are much better than authoritarian governmental systems: the masses vote in leaders. Forcing every Twitter user to sift through the thoughts of a bunch of random accounts and form their own deeply researched opinion on every. single. issue. is not a solution. That would be akin to removing all government leaders and letting the people figure out every issue on their own instead of voting in trusted voices to steer the country.

The Twitter algorithm should be exceedingly hard to game. It is hard to create a large account with millions of followers. It's harder still to have many large accounts following yours. The people who put in the work to become a notable and followed voice should be given more space on the platform to share their ideas than the random accounts that found their opinion on the subject 2 seconds ago. Nobody cares about their opinion. I want to be able to quickly ingest the most informed perspectives on a subject and then make my decision. I also want to know that the supposed experts are not being cherry picked by a bunch of activists from one political extreme. This is because I don't particularly like the feeling of being indoctrinated against my will.

TolstoyDotCom commented 1 year ago

Popular != best. In fact, I wrote a Drupal module about that.

Sports is a meritocracy, but most other fields are based on popular opinion and that opinion is frequently wrong. [long discussion of pop music sirens, pundits, etc self-expunged].

Lysenko would have had a wide social graph back in the day.

nathan-oplus commented 1 year ago

I agree! It's possible for accounts to be large and to be wrong. Your voting scheme seems interesting but it will not solve the problem in real time.

Twitter is fundamentally a real-time conversation about current events. We need a way to make that conversation more informed, more relevant, and maybe even more authoritative.

The problem we have is that large popular accounts can be wrong about a lot of things, misleading the public, but we have no authoritative way to tell the difference between wrong opinion and contrarian opinion. Or even worse between popular opinion and wrong opinion. All we have is debate as a tool. My solution improves this debate in two main ways:

  1. Follows from large accounts surface less popular but still important voices. Accounts don't have to be popular themselves: being followed by large accounts gives them more weight in the graph. This allows subject matter experts who are followed by the big names, but are not big names themselves, to weigh in.

  2. Finding subject matter experts on a given topic. Clustering thought leaders into subject matter expertise using NLP will allow us to find the most important accounts that are thought leaders on that subject (I can expound on how this could be done if anyone is interested). These voices should be heard about their subject of expertise even if they are wrong. Let's say some respected economist calls a recession and is wrong. We still want that opinion to be shared because it is important in and of itself even if it is wrong. But if that same expert predicts that 5 people are going to survive a catastrophic submersible failure, no one should care.
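Point 1 above can be sketched as a score per account: add up the weights of the surviving edges that point at it. Summing is an assumption on my part (the thread does not say how incoming weights should combine), and the account names and weights below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Hypothetical pruned edges: (follower, followed) -> edge weight w_e.
weighted_edges: Dict[Tuple[str, str], float] = {
    ("big_name_a", "quiet_researcher"): 1200.0,
    ("big_name_b", "quiet_researcher"): 800.0,
    ("big_name_a", "big_name_b"): 1200.0,
}


def incoming_weight(weighted_edges: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Sum the weights of all edges pointing at each followed account."""
    scores = defaultdict(float)
    for (_follower, followed), weight in weighted_edges.items():
        scores[followed] += weight
    return dict(scores)


scores = incoming_weight(weighted_edges)
# Accounts ordered from most to least incoming weight.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here `quiet_researcher` ranks first without needing a large following of its own, because two heavily weighted accounts follow it. Accounts that nobody in the pruned graph follows get no score at all.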

nathan-oplus commented 1 year ago

P.S. To your point about Lysenko, this is exactly the type of problem a platform like Twitter could mitigate. Imagine Lysenko were alive now, releasing this information, and highly popular. There's no way the expert community of scientists who study agriculture would not be trying to reproduce his work and finding red flags. Ideally the Twitter algorithm would automatically find the counterarguments from the most informed accounts and surface them. In this hypothetical situation there would also be millions upon millions of other accounts taking one side of the debate or the other based on how it made them feel or what made sense to them. These accounts would follow the popular accounts according to which side of the issue they fall on. The popular accounts would in turn follow their less popular colleagues who do the research. This group of popular accounts that tweet about Lysenko's research, together with their colleagues, should fairly dominate the discussion because the masses of Twitter have elected them with their votes.

The worst-case scenario is that all popular voices agree on a subject and the only people who see it differently are unpopular, i.e., some conspiracy theories are true. The thing is, though, with most validated conspiracy theories there WERE informed and potentially popular voices that got silenced by the consensus.

TolstoyDotCom commented 1 year ago

My OpenQuestions plan would in fact help with determining the "best" voices and "best" plans. The problem with the implementation is that Twitter is a completely depraved and immoral company from top to bottom as I've shown here repeatedly. They lie to millions of people each day in myriad ways. I can't think of a U.S. company that's less trustworthy.

If Twitter had even a small degree of integrity, then an expert who pushes a specific plan could be shamed/goaded into putting it to the test. Choose others to ask him questions. If Twitter refuses to pick someone as a questioner, that will be obvious and can be used to make Twitter look bad. Then, the questions and the answers can be commented on. All transparent. Such tests would become part of a tweeter's permanent record. If they do well in such tests, people will be more inclined to trust them and vice versa.

Note also that I'm a subject matter expert in a certain field. Yet, I get almost zero help with that because I don't tell people what they want to hear and I attack all sides when they're wrong. Some expert who became popular by telling people what they want to hear and by picking a side isn't going to recommend someone who's repeatedly shown them wrong.

For a real world example, several years ago a politician proposed a loony plan. I accurately predicted exactly how it would fail. The politician's fans all thought the plan would succeed. His opponents all took the plan seriously; CNN even made an animation showing how the plan would work. I and someone else (later) were the only ones who said the plan wasn't feasible. By pushing a loony plan that was destined to fail, the politician did incredible damage to the USA and conned millions of people (in many cases out of money). I got banned from sites that supported that politician for pointing out how his plan would fail. Because I'm heavily censored on Twitter I couldn't use that to get the word out. So, for this reason and thousands more, I'm not a fan of authoritarians who try to influence debate.

nathan-oplus commented 1 year ago

Do you have any retort for my explanation of the idea I have presented? This is not the right forum to debate this voting system you have proposed.

Lisag123 commented 1 year ago

  1. This would not work. There are many large accounts that are not experts in anything. Just because someone has a lot of followers does not mean they are a subject matter expert on anything besides gaining followers. YouTube allows doctors and those with professional degrees to be verified. That would increase expert voices. There are many normal doctors who talk about various diseases and have fewer than 1,000 followers; I want to hear their opinion more than that of someone who posts cat pictures. I would rather hear financial information from someone with a CFA or CPA than from a random person, for instance. The solution could be verification of professional degrees. There could even be a lower tier of people who are working towards specific professional degrees to speak about different subjects.

TolstoyDotCom commented 1 year ago

Managed funds - Cathie Wood, Paffrath, etc - are a good clue whether someone is good at picking stocks. Wood is limited in what she can buy so that has to be factored in. There have been inconclusive "studies" about Jim Cramer's picks. But, generally, finance is like sports in that it's a meritocracy. That's unlike easily gamed things like thinking who has the most followers means they're better.

Lisag123 commented 1 year ago

That would be a good way to see finance. For health stuff I would rather see doctors' opinions. When I search "heart disease" the majority of posts are from people with large followings and no credentials. I tried to change this by following more doctors and it didn't change anything. I tried following doctors with specific specialties, and even if they tweet a lot they are not on my feed. I don't want health advice from influencers personally. I may want to see alternative views but not as the dominant talking point. If I search "mental health" the results are all influencers again. My account is messed up: I have been search suggestion banned and search banned for a few months and support does not reply, so it could just be my account.

nathan-oplus commented 1 year ago

> This would not work. There are many large accounts who are not experts in anything. Just because someone has a lot of followers does not mean they are subject matter experts on anything besides gaining followers. YouTube allows doctors and those with professional degrees to be verified. That would increase expert voices. There are many normal doctors who talk about various diseases and have less than 1,000 followers- I want to hear their opinion more than someone who posts cat pictures. I want to hear financial information from someone with a CFA or CPA than a random person for instance. The solution could be verification of professional degrees. There could even be a lower tier of people who are working towards specific professional degrees to speak about different subjects.

The problem with manual verification processes is that they are corruptible. It's unacceptable to allow a few people within a few companies to be the arbiters of who is an "expert" and who is not. I see your point that (for example, in the case of healthcare) there are non-doctor voices that are popular and that these voices might be amplified even if they are mistaken. That is okay. Even if they are wrong it is okay. The community notes feature is a good check and balance for this scenario. But it is also true that for many fields the actual world experts are on Twitter talking about their research. We need some way to identify these voices in a way that doesn't create an opportunity for censorship if someone at the company or in government decides they want to silence a set of inconvenient opinions.

Lisag123 commented 1 year ago

Yeah, I would like to see a researcher's opinion about subjects even when they have low following numbers. There are a lot of experts with a low number of followers with really good information. I think there could be a topic for people who have PhDs, for instance, broken down into categories of expertise. There's such a low number of people with PhDs that the lower-following accounts would get more reach. Posts with community notes showing the information is false should be de-amplified. This doesn't seem to happen though. The note is there so people know it is false, but anyone with 500,000 followers has more reach no matter what they say. There are only 700 Twitter users with over 1 million followers. Their social proof is that they have high followings, not high information. Many accounts above 100,000 followers are also bot accounts, and a considerable number of people on the platform get paid to post their opinions. So reach based on followers is not the best way to gauge expertise imo.

Lisag123 commented 1 year ago

There is definitely a way to change this, but I haven't thought of what it is yet. A crypto account with 130,000 followers and following 200 people usually gets paid to promote someone else's brand and follow the people that pay them to boost those accounts. So it can be deceptive to think the people with big accounts are following others in a good faith way.