The motivation for this issue is to implement code that encourages search engines to index the website (once it's live at `datascijedi.org`) in the way we want: for instance, without the `www` subdomain, and without the `datascijedi.netlify.app` links getting indexed. As a specific example, we want the about page to be indexed as `https://datascijedi.org/about.html`, not as `https://datascijedi.netlify.app/about.html`.

We already have the `sitemap.xml`, which helps with indexing. However, the Google developer docs recommend using `link` tags to specify the canonical link as the better approach. For instance, the about page should have the following in its header:

`<link rel="canonical" href="https://datascijedi.org/about.html" />`

The problem with the `link` tag approach, however, is that it requires editing each page manually. Ideally Quarto would handle this for us. In the meantime, below are two approaches that might be worth a shot for automated insertion.

1. Use a post-render script to insert the `link` tag into each HTML file. The `sitemap.xml` file could be used to find and loop through all the HTML files.
2. Use a pre-render script to insert the `link` tags in each `qmd` file via the Quarto `header-includes` option. The `header-includes` option (I think) takes whatever value is provided and includes it in the header of the rendered HTML file, and is documented here. This approach would require doing a file search to find all the `qmd` files, so the first approach is probably better.

There is also a related discussion on Quarto's GitHub discussions.
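For the post-render approach, here is a rough sketch of what the script might look like. It is not tested against the real site and makes several assumptions: the function names are made up for illustration, the sitemap uses the standard sitemaps.org namespace, each `<loc>` path matches a file under the output directory, and Quarto exposes that directory to the script through the `QUARTO_PROJECT_OUTPUT_DIR` environment variable (worth confirming in the Quarto docs). The tag is inserted with a simple regex rather than a full HTML parser.

```python
import os
import re
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import urlparse

CANONICAL_BASE = "https://datascijedi.org"  # canonical domain named in this issue
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"  # assumed namespace


def sitemap_paths(sitemap_xml: bytes) -> list[str]:
    """Return the site-relative path of every <loc> entry, ignoring the host."""
    root = ET.fromstring(sitemap_xml)
    return [
        urlparse(loc.text.strip()).path.lstrip("/")
        for loc in root.iter(f"{SITEMAP_NS}loc")
    ]


def add_canonical(html: str, rel_path: str) -> str:
    """Insert a canonical link tag before </head>, unless one is already there."""
    if re.search(r'<link[^>]*rel=["\']canonical["\']', html, re.IGNORECASE):
        return html
    tag = f'<link rel="canonical" href="{CANONICAL_BASE}/{rel_path}" />'
    # Only the first </head> is touched; a callable avoids backslash escaping.
    return re.sub(
        r"</head>",
        lambda m: tag + "\n" + m.group(0),
        html,
        count=1,
        flags=re.IGNORECASE,
    )


def main(site_dir: str) -> None:
    """Rewrite every HTML file listed in <site_dir>/sitemap.xml in place."""
    site = Path(site_dir)
    for rel_path in sitemap_paths((site / "sitemap.xml").read_bytes()):
        page = site / rel_path
        if page.suffix == ".html" and page.is_file():
            html = page.read_text(encoding="utf-8")
            page.write_text(add_canonical(html, rel_path), encoding="utf-8")


if __name__ == "__main__":
    # Assumption: Quarto sets this variable when it runs project scripts.
    output_dir = os.environ.get("QUARTO_PROJECT_OUTPUT_DIR")
    if output_dir:
        main(output_dir)
```

If this works out, the script would presumably be listed under `project: post-render` in `_quarto.yml`. Because `add_canonical` skips pages that already have a canonical tag, re-running it on an already-processed output directory should be harmless.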
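For comparison, the pre-render approach could be sketched roughly as below. It is only an illustration under assumptions: the helper names are hypothetical, each `qmd` file renders to an HTML file at the same relative path, the front matter (if any) starts on the first line with `---`, and Quarto picks up edits made by a pre-render script before reading the file's metadata (not verified). It also edits the source files in place, which is one more reason to prefer the post-render route.

```python
from pathlib import Path

CANONICAL_BASE = "https://datascijedi.org"  # canonical domain named in this issue


def canonical_tag(qmd_rel_path: str) -> str:
    """Build the link tag for the HTML page a qmd file is assumed to render to."""
    html_path = Path(qmd_rel_path).with_suffix(".html").as_posix()
    return f'<link rel="canonical" href="{CANONICAL_BASE}/{html_path}" />'


def with_header_includes(qmd_text: str, qmd_rel_path: str) -> str:
    """Add a header-includes entry carrying the canonical tag to the front matter."""
    # Leave the file alone if it already has a canonical tag or its own
    # header-includes key (merging into an existing key is out of scope here).
    if 'rel="canonical"' in qmd_text or "header-includes:" in qmd_text:
        return qmd_text
    entry = f"header-includes: |\n  {canonical_tag(qmd_rel_path)}\n"
    if qmd_text.startswith("---\n"):
        # Existing front matter: add the entry right after the opening fence.
        return "---\n" + entry + qmd_text[len("---\n"):]
    # No front matter: create one that holds only the new entry.
    return "---\n" + entry + "---\n\n" + qmd_text


def rewrite_project(project_dir: str) -> None:
    """File search over the project: update every qmd file in place."""
    root = Path(project_dir)
    for qmd in root.rglob("*.qmd"):
        rel = qmd.relative_to(root).as_posix()
        text = qmd.read_text(encoding="utf-8")
        qmd.write_text(with_header_includes(text, rel), encoding="utf-8")
```

Calling `rewrite_project(".")` from the project root would be the file search mentioned above. Files that already define `header-includes` are skipped, so those would still need a manual edit.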