nanos / FediFetcher

FediFetcher is a tool for Mastodon that automatically fetches missing replies and posts from other fediverse instances, and adds them to your own Mastodon instance.
https://blog.thms.uk/fedifetcher?utm_source=github
MIT License

Add the ability to pull likes in #63

JoshuaHolme closed this issue 11 months ago

JoshuaHolme commented 11 months ago

It’d be great to be able to see how many total likes a post has, in addition to the comments, and not just the likes your own instance has seen.

nanos commented 11 months ago

I wish this were possible, but unfortunately it is not, because Mastodon has no API endpoint to update the counts of likes or boosts 😔

Teqed commented 11 months ago

I have a fork which does this, so I figured I'd share my thoughts:

When FediFetcher receives a Status dict, it's privy to the favourites_count and reblogs_count values, and usually comes across these values incidentally while fetching context URLs for a status.
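As a minimal sketch of that step, the two counters can be picked out of a Status dict as returned by the Mastodon API. The field names `uri`, `favourites_count` and `reblogs_count` come from the Status entity; the `StatusCounts` type and `extract_counts` helper are hypothetical names used only for illustration.

```python
from typing import Any, Mapping, NamedTuple


class StatusCounts(NamedTuple):
    """The parts of a Status dict that matter for count syncing."""
    uri: str
    favourites_count: int
    reblogs_count: int


def extract_counts(status: Mapping[str, Any]) -> StatusCounts:
    """Pull the two counters out of a Mastodon API Status dict.

    Missing or null counters are treated as 0, since not every
    Mastodon-compatible server is guaranteed to send them.
    """
    return StatusCounts(
        uri=status["uri"],
        favourites_count=int(status.get("favourites_count") or 0),
        reblogs_count=int(status.get("reblogs_count") or 0),
    )
```

The `uri` is kept alongside the counts because it is the stable identifier shared between instances; the numeric `id` differs on every server.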

Mastodon servers typically connect to a PostgreSQL database; mine is named mastodon_production and is accessible on the local network over port 5432. I have a database user that can log in with a password. The Python package psycopg can be used to open connections, run queries and commit changes.
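A connection along those lines might look like the sketch below, assuming psycopg 3 is installed. The database name and port are the ones mentioned above; the `FF_DB_*` environment variable names, the default user `mastodon`, and both helper functions are assumptions made up for this example.

```python
import os
from typing import Any, Dict


def connection_kwargs() -> Dict[str, Any]:
    """Collect connection settings from (hypothetical) environment variables."""
    return {
        "host": os.environ.get("FF_DB_HOST", "localhost"),
        "port": int(os.environ.get("FF_DB_PORT", "5432")),
        "dbname": os.environ.get("FF_DB_NAME", "mastodon_production"),
        "user": os.environ.get("FF_DB_USER", "mastodon"),
        "password": os.environ.get("FF_DB_PASSWORD", ""),
    }


def check_connection() -> bool:
    """Open a connection, run a trivial query, and close it again."""
    # Imported lazily so the settings helper works without the package.
    import psycopg  # third-party: pip install "psycopg[binary]"

    # In psycopg 3 the connection context manager commits on success,
    # rolls back on an exception, and closes the connection on exit.
    with psycopg.connect(**connection_kwargs()) as conn:
        row = conn.execute("SELECT 1").fetchone()
        return row == (1,)
```

Reading the password from the environment keeps it out of the script itself, which matters here because the account has write access to Mastodon's own tables.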

Mastodon's tables are stored in the public schema, and the ones we're interested in are statuses and status_stats.

statuses stores the primary key of status ids and is indexed by uri, but doesn't store reblog or favourite counts.

status_stats has an auto-generated ID field, a foreign key pointing back to statuses, and the count information we're looking for. Rows only seem to populate normally in this table when a status receives a count of some sort.
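To illustrate how the two tables relate, here is a sketch that reads the currently stored counts for a status by its uri. It uses only the column names mentioned in this thread; the `stored_counts` helper is hypothetical, and `cur` is assumed to be a DB-API style cursor with `%s` placeholders, such as a psycopg cursor.

```python
from typing import Any, Optional, Tuple

LOOKUP_SQL = """
SELECT s.id, ss.reblogs_count, ss.favourites_count
FROM public.statuses AS s
LEFT JOIN public.status_stats AS ss ON ss.status_id = s.id
WHERE s.uri = %s
"""


def stored_counts(cur: Any, uri: str) -> Optional[Tuple[int, int, int]]:
    """Return (local_id, reblogs, favourites) for a status uri.

    Returns None when the home server has no status with that uri.
    """
    cur.execute(LOOKUP_SQL, (uri,))
    row = cur.fetchone()
    if row is None:
        return None
    local_id, reblogs, favourites = row
    # A status without a status_stats row yields NULLs from the LEFT JOIN,
    # which is the "no counts received yet" case described above.
    return local_id, reblogs or 0, favourites or 0
```

The LEFT JOIN is what makes the missing-row case visible: an inner join would silently drop statuses that have never received a count.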

When FediFetcher asks the home instance to look up the context URL via a search_v2 resolve, it also learns your home server's own ID for that status, which is what you need to insert into this table.
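A resolve lookup of that kind can be sketched with the standard library alone. The `/api/v2/search` endpoint and its `q`, `resolve`, `type` and `limit` parameters are part of the Mastodon API; the function names are hypothetical, and this is not FediFetcher's actual code.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Mapping, Optional


def build_resolve_request(server: str, token: str, status_url: str) -> urllib.request.Request:
    """Build a search request that asks the home server to resolve a remote URL."""
    query = urllib.parse.urlencode(
        {"q": status_url, "resolve": "true", "type": "statuses", "limit": 1}
    )
    return urllib.request.Request(
        f"https://{server}/api/v2/search?{query}",
        # Resolving remote URLs requires an authenticated user token.
        headers={"Authorization": f"Bearer {token}"},
    )


def local_status_id(search_result: Mapping[str, Any]) -> Optional[str]:
    """Return the home server's ID of the first matching status, if any."""
    statuses = search_result.get("statuses") or []
    return statuses[0]["id"] if statuses else None


def resolve_local_id(server: str, token: str, status_url: str) -> Optional[str]:
    """Perform the lookup over the network and return the local status ID."""
    request = build_resolve_request(server, token, status_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return local_status_id(json.load(response))
```

The API returns IDs as strings, so a caller writing to the database would convert the result with `int()` first.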

INSERT INTO public.status_stats
(status_id, reblogs_count, favourites_count, created_at, updated_at)
VALUES (%s, %s, %s, %s, %s);
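Wrapped in Python, the write might look like the following sketch. Because a row may already exist for a status that has received counts locally, it tries an UPDATE first and only falls back to the INSERT above when nothing was updated. The `write_counts` helper is hypothetical, `cur` is assumed to be a psycopg-style cursor, and the timestamps are assumed to be stored as UTC without a time zone, which is the usual Rails convention.

```python
from datetime import datetime, timezone
from typing import Any, Optional

UPDATE_SQL = """
UPDATE public.status_stats
SET reblogs_count = %s, favourites_count = %s, updated_at = %s
WHERE status_id = %s
"""

INSERT_SQL = """
INSERT INTO public.status_stats
(status_id, reblogs_count, favourites_count, created_at, updated_at)
VALUES (%s, %s, %s, %s, %s)
"""


def write_counts(
    cur: Any,
    status_id: int,
    reblogs_count: int,
    favourites_count: int,
    now: Optional[datetime] = None,
) -> str:
    """Update the counts for a status, inserting the row if it is missing."""
    if now is None:
        # Naive UTC timestamp (assumption: columns hold UTC without tz).
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    cur.execute(UPDATE_SQL, (reblogs_count, favourites_count, now, status_id))
    if cur.rowcount > 0:
        return "updated"
    cur.execute(INSERT_SQL, (status_id, reblogs_count, favourites_count, now, now))
    return "inserted"
```

Running both statements inside one transaction keeps the pair consistent, although Mastodon itself could still create the row between the two statements, so a real implementation would need to handle that conflict.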

The class I've written here handles most of this by receiving a status ID and the updated counts -- it also makes use of a newly created table as a substitute for history files, but that's a different tangent.

Since FediFetcher is also technically compatible with any sufficiently complete Mastodon API compatible software, it shouldn't be surprising that each of these has its own database structure and requires a different implementation here. An example for the Misskey / Foundkey / Calckey / Firefish / Iceshrimp family can be found here. Note the coercion of favourites_count into a reactions dict.
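The linked implementation is not reproduced here. Purely as an illustration of what such a coercion could look like, the sketch below assumes reactions are kept as an emoji-to-count mapping and that one placeholder emoji stands in for all favourites. Both points are assumptions, and the `⭐` key and function name are hypothetical.

```python
from typing import Dict


def favourites_to_reactions(favourites_count: int, emoji: str = "⭐") -> Dict[str, int]:
    """Represent a plain favourite count as a single-entry reactions mapping.

    The emoji key is a placeholder chosen for this example, not a value
    taken from any of the servers named above.
    """
    if favourites_count <= 0:
        # No favourites: an empty mapping rather than a zero-count entry.
        return {}
    return {emoji: favourites_count}
```

The information lost in this direction is which reaction each remote user actually chose, since a Mastodon-style count carries only a total.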

My main problems with this approach:

It seems rather common to run FediFetcher as a GitHub Action, and I only really imagine this being practical for someone running it locally -- probably as the same user running Mastodon itself. That's a narrower deployment environment than "anywhere that can reach the Mastodon API" and might reach a different audience.

There are plenty of other reasons this implementation is imperfect, but the way I see it, any possible advantages gained here could be better achieved by Mastodon updating these counts itself when a resolution happens for search_v2.

Until that happens, I would recommend checking out the great Firefox / Chrome browser extension, Substitoot, which allows you to "see up-to-date boost/favorite counts on posts", among other things, by fetching the statuses as you browse to them and updating the information in your browser. There are some obvious downsides to this approach as well (in particular, if you're fond of using mobile apps like FediLab), but it goes a long way to bridge the gap.

nanos commented 11 months ago

Thank you for your detailed write up!

Obviously, if you have direct database access, this is possible. But as you write, that's quite definitely outside the scope of this project from my perspective 😊