intershop / intershop-pwa

The Intershop PWA is an Angular based progressive web app storefront for the Intershop Commerce Platform.
https://www.intershop.com/progressive-web-app
MIT License

Add configurable blacklist for bad bots to nginx #1527

Closed: ghost closed this issue 4 months ago

ghost commented 10 months ago

Is your feature request related to a problem? If yes, please describe it.

There is no predefined blacklist for bad bots in nginx.

Have found related code in ssr-off.conf, but it does not seem to work.

Describe the desired solution.

Add configurable blacklist for bad bots to nginx. Bad bots should receive a 403.

Or, even better: add a dynamic blocker that updates the bot list automatically and does not require a DPL, e.g. https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker

Describe alternatives you've considered.

I have implemented a static list for now, as described in https://reggiodigital.com/blog/nginx-rule-blocking-bad-bots/ and https://stackoverflow.com/a/24820722
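
For illustration, a static list of this kind could be sketched roughly as follows. This is a minimal sketch and not the configuration actually used in this issue or shipped with the PWA: the variable name `$bad_bot` and the listed user agents are placeholders chosen for the example, and the `map` block is assumed to sit in the `http` context with the `if` in the relevant `server` block.

```nginx
# http context: classify requests by their User-Agent header.
# $bad_bot is a hypothetical variable name; the bot names are example entries only.
map $http_user_agent $bad_bot {
    default 0;
    # "~*" makes the regular expression case-insensitive
    ~*(AhrefsBot|SemrushBot|MJ12bot) 1;
}

server {
    listen 80;

    # Reject matching user agents before any further processing.
    if ($bad_bot) {
        return 403;
    }

    # ... remaining server configuration ...
}
```

Keeping the patterns in a single `map` block means the list could be moved into a separate included file, which would make it configurable without touching the rest of the server configuration.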

AB#90865

shauke commented 4 months ago

The NGINX of the PWA project is supposed to handle functionality that is specific to the PWA, e.g. the caching of SSR-rendered pages, multi-channel deployments, sitemap mapping, ... We currently try to avoid implementing features in the PWA's NGINX that are not specific to the PWA but rather relevant to the general deployment. This applies to filtering bad bots and to blocking access to certain internal URLs. A dedicated NGINX or the deployment's Ingress should handle that, not the PWA.

For that reason, I will create an internal ticket to address your requested feature at another place in the deployment, but we will not implement it as a feature of the PWA project for now.

Note, regarding this statement from the request:

"Have found related code in ssr-off.conf, but it does not seem to work."

I checked the functionality by configuring the NGINX container with SSR="OFF" and could verify that it still works as intended. With SSR disabled, the returned response only includes a loading animation, and the client browser renders the complete HTML structure itself. Only the bots etc. listed in ssr-off.conf get an SSR-rendered result.