Closed. @ericerway closed this issue 5 years ago.
The link with the information about robots.txt
@ericerway, I am a little bit confused. `User-agent`, `Allow`, `Disallow`, `Sitemap`, `Clean-param`, and `Crawl-delay` are the only valid fields for a robots.txt file. At the same time, you can specify `<meta name="robots" content="noindex">` inside `<head>` to tell a crawler not to index the page. You can also add `<meta name="robots" content="nofollow" />` to forbid following all links on the page, or `<a href="href" rel="nofollow"></a>` to forbid following one particular link. So, @ericerway, what should the result of the implementation be?
This story is not ready to start, but we need an answer for robots.txt with PWA, including Lighthouse. This needs to be groomed.
@zetlen, is this something we should consider as a "quick fix" for Venia to satisfy our Lighthouse score for SEO? I realize most of this lies with the server/hosting, so it may be now.sh or similar.
Magento generates a robots.txt, and we could tell UPWARD to proxy to it. However, that robots.txt file may not reflect the PWA domain or sitemap.
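One wrinkle with proxying: the Sitemap entries in the Magento-generated file would point at the backend domain rather than the PWA storefront. A small helper could rewrite them; the function below is an illustrative sketch, not part of UPWARD or PWA Studio.

```javascript
// Hypothetical helper (not a PWA Studio API): rewrite Sitemap entries in a
// Magento-generated robots.txt so they point at the PWA storefront origin
// instead of the Magento backend origin.
function rewriteSitemapHost(robotsTxt, pwaOrigin) {
    return robotsTxt
        .split('\n')
        .map(line => {
            const match = line.match(/^(\s*Sitemap:\s*)(\S+)/i);
            if (!match) return line; // leave non-Sitemap lines untouched
            const url = new URL(match[2]);
            // Keep the sitemap path, swap the origin for the PWA's.
            return `${match[1]}${pwaOrigin}${url.pathname}`;
        })
        .join('\n');
}
```

UPWARD could apply a transform like this to the proxied response body before serving it to crawlers.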
In the meantime, we can create a simple robots.txt for Venia. I'll use Google guidelines to make one.
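A minimal Venia robots.txt along those lines might look like the following (the sitemap URL is a placeholder, not a real Venia path):

```
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
```

An empty `Disallow:` permits crawling of the whole site, which is a reasonable default until Magento-driven rules are wired in.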
This issue is for the following packages:
venia-concept
pwa-buildpack
peregrine
pwa-devdocs
upward-js
upward-spec
This issue is a:
Environment
`node -v`:
`npm -v`:

Description
Magento 2 currently generates a robots.txt file for search engines that uses the default storefront and the relevant INDEX, FOLLOW, NOINDEX, and other parameters, and that is updated periodically.
PWA Studio, and possibly UPWARD, needs a similar function at build/runtime that provides the same, including a blank default when this is not available, with future ties to the Magento 2 admin.
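As a sketch of the "blank default" behavior described above (a hypothetical build-time helper, not an actual PWA Studio API):

```javascript
// Hypothetical build-time helper (illustrative, not PWA Studio API): return
// robots.txt content for the build output, falling back to a permissive
// blank default when no content is available (e.g. before the Magento 2
// admin integration exists).
function makeRobotsTxt(content) {
    // Blank default: allow all crawlers everywhere.
    const blankDefault = 'User-agent: *\nDisallow:\n';
    return content && content.trim().length > 0 ? content : blankDefault;
}
```

A build plugin could call this and write the result to `robots.txt` in the output directory, later sourcing `content` from the Magento 2 admin.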
Expected result:
A solution for creating and updating a robots.txt file for PWAs built with PWA Studio, including Venia, with future considerations.
Possible solutions: