baptisteArno / typebot.io

💬 Typebot is a powerful chatbot builder that you can self-host.
https://typebot.io

Set robots meta tag to enable search and link crawlers #1123

Open scottruzal opened 7 months ago

scottruzal commented 7 months ago

Hi! I'm currently using Typebot in production on a custom domain, and I would like Google's web crawler and LinkedIn post scraping to work. However, the following tag in the head of the page disables indexing by default:

`

`

There doesn't seem to be an option to disable this. Such an option would let me configure this meta tag manually via custom header code. Is there any chance a toggle could be added to the metadata section so that admins can turn off the default noindex tag on a typebot?
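To make the request concrete, here is a minimal sketch of how such a toggle might behave. It is not Typebot's actual code: the `MetadataSettings` shape, the `allowIndexing` flag, and the `buildHeadTags` helper are all hypothetical names, and the emitted tag is the standard robots `noindex` form, which I am assuming matches what Typebot renders.

```typescript
// Hypothetical settings shape; `allowIndexing` is an assumed name,
// not an existing Typebot option.
interface MetadataSettings {
  title?: string;
  description?: string;
  allowIndexing?: boolean;
}

// Escape characters that are unsafe in HTML text and attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the head tags for a published bot. The noindex tag stays the
// default and is only omitted when the admin explicitly opts in.
function buildHeadTags(settings: MetadataSettings): string[] {
  const tags: string[] = [];
  if (settings.title) {
    tags.push(`<title>${escapeHtml(settings.title)}</title>`);
  }
  if (settings.description) {
    tags.push(
      `<meta name="description" content="${escapeHtml(settings.description)}">`
    );
  }
  if (!settings.allowIndexing) {
    tags.push('<meta name="robots" content="noindex">');
  }
  return tags;
}
```

With this shape, existing bots keep the current behaviour because an unset flag is treated as "do not index", and only bots where the toggle is switched on would drop the tag.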

baptisteArno commented 7 months ago

LinkedIn post scraping should work even with this meta tag. The tag is there to keep your bot from being searchable through search engines like Google. Do you really need this?

baptisteArno commented 7 months ago

By default, a bot is empty, so search engines have nothing to crawl! So I don't really see the point of enabling it.

scottruzal commented 7 months ago

Unfortunately, LinkedIn post scraping did not work on the domain because of the meta tag. I ended up embedding the bot on a static web server to get around this, so I don't have a screenshot to share, but LinkedIn's post inspector was basically giving me an error message saying that it would not scrape the page due to the noindex tag.

It is strange for them to do this, because Facebook's share debugger does not have this issue, and the noindex tag is typically only taken into consideration by search engines, so most of the blame here should be on LinkedIn.

Even though the page is empty content-wise, it might still be beneficial for search engines to be able to crawl a custom domain for Typebot, including the page meta title and description, so that users can host a bot on a custom domain and have it appear in search listings. That said, it is fairly easy to embed the bot on a web server like nginx and point the domain to that. It's just an extra step that could be avoided if metadata were fully customizable within Typebot settings.