ONEARMY / community-platform

A platform to build useful communities that aim to tackle global problems
https://platform.onearmy.earth
MIT License
1.1k stars, 370 forks

[Optimization] Should not load Discussions if page is loaded by a bot/crawler #3468

Closed mariojsnunes closed 4 months ago

mariojsnunes commented 4 months ago

Is your feature request related to a problem? Please describe.
Loading discussions consumes a lot of unnecessary database reads.

Describe the solution you'd like
We can detect whether a page is being accessed by a bot and, if so, not load certain features, like discussions.

Additional context
Maybe this could be applied to map pins too? Potential code solution: https://stackoverflow.com/a/33941034
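
For illustration, a minimal sketch of the kind of user-agent check proposed above. The names `BOT_PATTERN`, `isLikelyBot` and `shouldLoadDiscussions` are hypothetical and not part of the community-platform codebase, and the pattern list is an assumption rather than a complete registry of crawlers. User-agent strings can be spoofed, so this is only a heuristic.

```typescript
// Hypothetical helper, not taken from the community-platform codebase.
// The substrings below are an illustrative assumption, not an exhaustive list.
const BOT_PATTERN =
  /bot|crawler|spider|crawling|facebookexternalhit|whatsapp|slurp/i

// Returns true when the user-agent string looks like a bot or crawler.
// A missing user agent is treated as "not a bot" in this sketch.
function isLikelyBot(userAgent: string | null | undefined): boolean {
  if (!userAgent) return false
  return BOT_PATTERN.test(userAgent)
}

// Decides whether an expensive feature (such as discussions) should load.
function shouldLoadDiscussions(userAgent: string | null | undefined): boolean {
  return !isLikelyBot(userAgent)
}
```

In a browser, the check could be called as `shouldLoadDiscussions(navigator.userAgent)` before any comments are requested from the database.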

benfurber commented 4 months ago

Thank you for thinking about this @mariojsnunes. Especially as it's coming from our current challenges around high DB reads.

I don't think I agree with the intent of this issue though.

We want all of our content/knowledge to be open and accessible and I don't think we should assume a bot/crawler is bad, even if it's annoying for our database read counts. I'd say that's a necessary cost for the platform.

Two edge cases though.

  1. Of course, if we can identify a malicious actor targeting us, we should respond.
  2. The issue of link previews is interesting. I'd assume that if we gave them the right OG tags, they wouldn't go any further? It's expensive for them too to load up all pages. I added an issue for some known gaps here: #3424
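
As a rough sketch of the OG tags mentioned in point 2, the snippet below builds Open Graph `<meta>` tags for a page. The `PagePreview` shape and the `buildOpenGraphTags` and `escapeAttr` names are hypothetical, and whether a given preview crawler stops after reading these tags is an assumption, not something verified here.

```typescript
// Hypothetical shape for the data a link preview needs.
interface PagePreview {
  title: string
  description: string
  url: string
  imageUrl?: string
}

// Escapes characters that would break an HTML attribute value.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
}

// Builds Open Graph meta tags, skipping any property without a value.
function buildOpenGraphTags(page: PagePreview): string[] {
  const entries: [string, string | undefined][] = [
    ["og:title", page.title],
    ["og:description", page.description],
    ["og:url", page.url],
    ["og:image", page.imageUrl],
  ]
  const tags: string[] = []
  for (const [property, content] of entries) {
    if (content) {
      tags.push(`<meta property="${property}" content="${escapeAttr(content)}" />`)
    }
  }
  return tags
}
```

The tags only help preview crawlers if they are present in the HTML served for the page, before any client-side rendering.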

mariojsnunes commented 4 months ago

Note that our new discussion module implementation doesn't render the comments unless we expand the section, so any crawler wouldn't index them anyway... But we still load them from the database.
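
One way to close that gap would be to defer the database read until the section is actually expanded. The sketch below is an assumption about how that could look, not the platform's actual discussion module; `LazyDiscussion`, `Comment` and `FetchComments` are hypothetical names, and the fetch function stands in for whatever database call the platform uses.

```typescript
type Comment = { id: string; text: string }

// Stand-in for the real database call (hypothetical signature).
type FetchComments = (sourceId: string) => Promise<Comment[]>

class LazyDiscussion {
  private readonly sourceId: string
  private readonly fetchComments: FetchComments
  // Holds the in-flight or completed request so comments are read only once.
  private pending: Promise<Comment[]> | null = null

  constructor(sourceId: string, fetchComments: FetchComments) {
    this.sourceId = sourceId
    this.fetchComments = fetchComments
  }

  // Called when the user expands the section. No read happens before this.
  expand(): Promise<Comment[]> {
    if (this.pending === null) {
      this.pending = this.fetchComments(this.sourceId)
    }
    return this.pending
  }
}
```

With this shape, a crawler that never expands the section triggers no comment reads at all. Note that a failed fetch stays cached in this sketch, so a real implementation would probably want to reset `pending` on error.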

mariojsnunes commented 4 months ago

There is also #3467.

As for the platform being "open and accessible", the important info should exist in the howto steps and research updates. I understand some comments can also add value, but some will just add noise... @davehakkens do you have any sense of whether most comments are valuable or noisy?

I see two paths we could take:

  1. Comments should be search indexed
     a. Change the current discussion implementation so comments always exist on the DOM
     b. Maybe still do this but only for preview link crawlers (like whatsapp, discord, etc)
  2. Comments should not be search indexed
     a. Implement #3467

mariojsnunes commented 4 months ago

Dropped in favor of #3467 as discussed at the maintainer meeting