Closed luceos closed 3 years ago
This PR: https://github.com/flarum/tags/pull/84 should already reduce strain substantially. From there:
/api/tags
the first time that the Tags page is visited. After migrating tags from the old forum to Flarum, I wondered why the forum was loading indefinitely. Well, we have the answer - I have 2000+ tags 😄
I like the first solution:
What needs to be done is:
- Load only those tags on page load that are actually shown on the sidebar
- Load the additional tags on request
- Include the tags in the discussion list request
@rafaucau we're now investigating another possible cause: by any chance, do you have the fof follow tags extension installed?
@askvortsov1 Yes
Then that's likely it: an issue with FoF follow tags means that multiple SQL queries are made per tag whenever loading a page. I'm gonna close this for now unless someone can reproduce this on an install with only core extensions.
@askvortsov1 After disabling FoF follow-tags I don't see any difference. It still loads very slowly
Nevermind then... Could you open a new issue and share the output of your php flarum info
and link to this one
The problem is generated in TagSerializer by:
'canStartDiscussion' => $this->actor->can('startDiscussion', $tag),
'canAddToDiscussion' => $this->actor->can('addToDiscussion', $tag),
and
protected function lastPostedDiscussion($tag)
{
return $this->hasOne($tag, DiscussionSerializer::class);
}
@simon1222 is this something you've confirmed via testing? We were able to determine that disabling FoF Follow Tags reduces the number of SQL queries sent to the database on initial load from 411 to 54 for a forum with hundreds of tags.
@askvortsov1 yes, I've confimed via testing. I have 320k discussions and 2k tags, so lastPostedDiscussion was one of the problem. I was waiting about 60s, so I've commented return $this->hasOne($tag, DiscussionSerializer::class);
and reduced to 30s.
I was testing line by line and this piece of code is working fine:
$event->data['tags'] = Tag::whereVisibleTo($event->actor)
->withStateFor($event->actor)
->with(['parent', 'lastPostedDiscussion'])
->get();
but methods, which are executed in Serializer, are working too slow. I think these methods generate too many queries to the database.
hmm, thing is, that should be processed by the includes system, which would load it all in one query. It would be super helpful to get a bit more information as to what is going on with your forum, so we can try to fix some of these bugs and increase performance.
Would you be able to use https://github.com/FriendsOfFlarum/clockwork (if it works) or some alternate tool to see how many and which database requests are being made on forum load? Also, could you share the URL of your forum and the output of php flarum info
? If this is something you'd prefer not to post publically, could you send your https://discuss.flarum.org username and I can reach out to you via private message?
I think big performance gains will be achieved by not loading the lastPostedDiscussion relationship on normal pages. I might be wrong, but I think it's only used on the tags overview page. Here's the relevant code:
/**
* @param WillGetData $event
*/
public function includeTagsRelationship(WillGetData $event)
{
if ($event->isController(ShowForumController::class)) {
$event->addInclude(['tags', 'tags.lastPostedDiscussion', 'tags.parent']);
}
}
That would remove two SQL queries per Tag:
SELECT * FROM `discussions` WHERE `discussions`.`id` = '1' LIMIT | runtime: 0.27 ms
SELECT `tags`.*, `discussion_tag`.`discussion_id` as `pivot_discussion_id`, `discussion_tag`.`tag_id` as `pivot_tag_id` FROM `tags` inner join `discussion_tag` on `tags`.`id` = `discussion_tag`.`tag_id` WHERE `discussion_tag`.`discussion_id` = '1' | runtime: 0.43 ms
And it would also remove the size of the payload JSON significantly. Right now every last posted discussion object is included in the resource section.
@w-4 That's a good point. The last posted discussion can be loaded through an HTTP request when one clicks the tags page and/or included only in the /tags route.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We do this to keep the amount of open issues to a manageable minimum. In any case, thanks for taking an interest in this software and contributing by opening the issue in the first place!
Sharing what I recently found,
So I decided to update fof/clockwork
to see flarum's performance and have a better idea of it. That's when I came across the fact that there were two types of models being loaded for every single primary tag (UserState, Tag) which are relations of the tag's lastPostedDiscussion
discussion model, they are called and used in the DiscussionSerializer
which results in querying them.
Simply eager loading these relationships solves the issue by only using two queries instead of two queries per primary tag. Do we want to make this change or instead just do as posted above and query the lastPostedDiscussions from the frontend while also limiting the number of tags exposed ?
We could include this fix for now since it doesn't hurt. (I posted the PR incase we'd like to do that).
$data['tags'] = Tag::whereVisibleTo($actor)
->withStateFor($actor)
->with([
'parent',
'lastPostedDiscussion',
'lastPostedDiscussion.tags',
'lastPostedDiscussion.state'
])
->get();
Also observation here, we don't actually need to eager load the tags here since we are loading all of them, we could just loop through all the lastPostedDiscussion
relations and set the tags relationship from the collection of tags we already have.
I'd be in favor of including this since those queries are needed anyway, but I wouldn't close this issue until we properly paginate it and only load the tags we need, when we need them.
Infinite nesting PR might have a start to this
Plan for handling this is to do it in 3 stages.
Stage 1: only load top-level primary tags, top 3 secondary tags, and current tag's children/siblings on an arbitrary forum request. This solves the issue of thousands of tags being loaded in at once, which understandably makes things very slow. This is ready for review at https://github.com/flarum/tags/pull/87
Stage 2: Optimize tag permission checking so we don't have to send huge arrays to/from the database. This significantly speeds up all pages (2x at least), and when coupled with Stage 1, makes the effect of tags negligible for every page except the "Tags" page. This is ready for review at https://github.com/flarum/tags/pull/126
Stage 3: Pagination. I propose the following changes:
IndexPage.currentTag
into a util). Add search (and show only 10-20 most recent results) instead of showing ALL of them.Stages 1 and 2 make tags usable on an enterprise-scale forum. Stage 3 enables hundreds of primary and thousands of secondary tags with elegant scaling. Stage 3 will likely be after stable.
Bug Report
Current Behavior
https://github.com/flarum/tags/blob/fa83f496b0004e31ef2f4fc854bfcdf6a4cfa3c3/src/Listener/AddForumTagsRelationship.php#L46-L58
This piece of code is responsible for a performance penalty with forums that have many tags. There are a few things at fault here:
app
on load, even though only 5 (?) are initially shown on the side bar and the index can load them through the api as an include.Suggested solution
What needs to be done is:
or:
Environment