DSpace / dspace-angular

DSpace User Interface built on Angular.io
https://wiki.lyrasis.org/display/DSDOC8x/
BSD 3-Clause "New" or "Revised" License
125 stars 415 forks source link

Consider excluding Search component from Angular SSR (Server Side Rendering) #3231

Open tdonohue opened 1 month ago

tdonohue commented 1 month ago

Describe the bug

This is a subtopic to #3110.

As part of those performance discussions, 4Science mentioned they've found (in DSpace-CRIS) that SSR performance is improved when the Search Component (search.component.ts) is excluded from Server Side Rendering. This would mean that searching will only function for Client Side Rendering (CSR).

See the code change in https://github.com/4Science/dspace-angular/blob/main-cris/src/app/shared/search/search.component.ts#L415-L419

This should have no impact on SEO because well-behaving crawlers all use Sitemaps, and we already exclude /search and search filters in our robots.txt: https://github.com/DSpace/dspace-angular/blob/main/src/robots.txt.ejs#L11

Further testing may need to be done to ensure disabling searching in SSR doesn't have side effects for accessibility (e.g. screen readers). But, my understanding is that it should not, as these tools should all trigger CSR. This is something we can test further once a PR is created and/or offer a configuration to turn this on/off.

Assigning @atarix83 to create a PR.

Related work

Related to #3110 because it should provide benefits to SSR performance.

paulo-graca commented 1 month ago

As I commented here: https://github.com/DSpace/dspace-angular/issues/3110#issuecomment-2265672814

I discourage to do this, or at least, it should be configurable as well as any other part of DSpace Angular we want do disconnect from SSR. As in the example I referred, you can have thematic repositories. If the Search Engine don't assign those keywords to your website, you will lose relevancy. Also, you have the Internet Archives (that are prepared to not necessary follow the sitemap.xml), that if they don't fetch the results, it would not be possible to persist pages in the archives.

tdonohue commented 1 month ago

I agree this should ideally be configurable. I do also want to point out that the purpose for asking for this PR to be created by 4Science is so that others can easily test it out further & report back their findings.

If we find this idea to exclude the search component from SSR is problematic, then perhaps other developers may be inspired by the PR to find a different approach.

But, if we find the PR is generally useful at helping to decrease CPU usage of SSR, then we could consider enabling by default, with an option for sites to disable it if they desire to do so.

This option might also end up being a temporary fix. If we find ways to improve Search performance in general (via #3163 and similar such tickets), then it may make sense to turn SSR back on for search results (by default).

My point is, we should be looking for several types of solutions here. This particular ticket describes a "quick fix" that may provide help to sites that are experiencing CPU spikes for SSR related to bad-behaving or highly active crawlers hitting search results. But, in the future, that "quick fix" might be removed if future performance improvements to search are able to ensure search results don't result in CPU spikes during SSR.