bdkjones / CodeKit

CodeKit 3 Issue Tracker
https://codekitapp.com

Have CodeKit (re-)generate a sitemap.txt file for projects #536

Open greg-raven opened 5 years ago

greg-raven commented 5 years ago

Quick, short summary:

It might be nice to have CodeKit generate a sitemap.txt file for project build files, and re-generate it as files are updated.

Expected results:

Edit files as normal with CodeKit open and that project selected, and have a fresh sitemap.txt file ready to upload along with your new or newly updated files.

Actual results:

n/a

Exact steps to reproduce:

Currently I use either Scrutiny 8 by Peacock Software or a local run of sitemap_gen.py under Python to generate sitemap.xml files, so sitemap.xml support might be a stretch. However, I generate sitemap.txt files locally with:

grep --include="*.html" -Eir "<title>.+</title>" . \
| sed -E '
    s!^\.!https://www.gregraven.org!
    s!:[[:space:]]*<title>.+</title>[[:space:]]?!!
' \
| sort

Of course, you would already know the build folder so you wouldn't need to hard-code it as I have here.
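For comparison, here is a rough Python sketch of the same idea with the build folder and site URL passed in rather than hard-coded. The function names `sitemap_lines` and `write_sitemap` and their parameters are made up for illustration and are not part of CodeKit. Like the grep above, it only counts a page if a one-line <title>...</title> appears in the file; unlike the grep, it emits each file once even if several lines match.

```python
import re
from pathlib import Path

# Same test as the grep pattern: a <title>...</title> on a single line, any case.
TITLE_RE = re.compile(r"<title>.+</title>", re.IGNORECASE)


def sitemap_lines(build_dir, base_url):
    """Return sorted URLs for every .html file under build_dir that has a <title>."""
    root = Path(build_dir)
    urls = []
    for path in root.rglob("*.html"):
        text = path.read_text(encoding="utf-8", errors="replace")
        if TITLE_RE.search(text):
            # Turn the path relative to the build folder into a URL path.
            rel = path.relative_to(root).as_posix()
            urls.append(f"{base_url.rstrip('/')}/{rel}")
    return sorted(urls)


def write_sitemap(build_dir, base_url):
    """Write build_dir/sitemap.txt, one URL per line, and return the URLs."""
    lines = sitemap_lines(build_dir, base_url)
    out = Path(build_dir, "sitemap.txt")
    out.write_text("".join(f"{url}\n" for url in lines), encoding="utf-8")
    return lines
```

Calling `write_sitemap("build", "https://www.gregraven.org")` after each build would regenerate the file; the "build" folder name here is only a placeholder.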

You would probably want to add a note to the help page recommending the noindex meta tag on pages that end up in the sitemap.txt output but are not worthy of search engine indexing:

<meta name="robots" content="noindex">
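If it helps, a page can be checked for that tag with nothing but the Python standard library. This is only a sketch: `RobotsMetaParser` and `has_noindex` are hypothetical names, and it looks solely at <meta name="robots"> tags, not at HTTP headers or crawler-specific meta names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether any <meta name="robots"> tag lists the noindex directive."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html_text):
    """Return True if the HTML text carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.noindex
```

A check like this could be used to list which pages in the sitemap.txt output already carry the tag.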

A link to download a simplified project or file that shows the issue:

https://www.gregraven.org/sitemap.txt

Your configuration (any details about your system that you think might be relevant)

All caught up on macOS and BBEdit, and CodeKit for that matter.

greg-raven commented 5 years ago

I should have mentioned that generating the better-known sitemap.xml files requires Python and a copy of Google's (now abandoned?) scripts (or rolling your own, I guess). The Google files can be found at https://sourceforge.net/projects/goog-sitemapgen/. As distributed, the Google sitemap generator requires a config file, but maybe you could finesse this.
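On the "roll your own" option: a bare-bones sitemap.xml is just a urlset element in the sitemaps.org namespace with one url/loc pair per page, so a minimal version needs no config file. The sketch below is not how Google's script works; it uses only Python's standard library, `sitemap_xml` is a made-up name, and it leaves out optional fields such as lastmod. It assumes Python 3.8 or later for the `xml_declaration` argument.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_xml(urls):
    """Return a minimal sitemap.xml document (as UTF-8 bytes) for the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL when serializing.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

The returned bytes can be written straight to a sitemap.xml file in the build folder.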

Once you have a sitemap.txt (or sitemap.xml) file on your site, you can register it with Bing and Google, and/or prompt either one to revisit your sitemap via the appropriate API call:

https://www.google.com/webmasters/sitemap/ping?sitemap=https://www.gregraven.org/sitemap.txt

and

https://www.bing.com/ping?sitemap=https://www.gregraven.org/sitemap.txt

This is something that I would see going into the help documents, rather than being done automatically via CodeKit.
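For the help documents, the two ping URLs above could be built from a sitemap address along these lines. This is a sketch under assumptions: the endpoints are copied from this comment and not checked against the search engines' current documentation, `ping_urls` is a hypothetical name, and the code only constructs the URLs without sending any request. It percent-encodes the sitemap parameter, whereas the URLs above pass it unencoded.

```python
from urllib.parse import urlencode

# Endpoints as quoted in the comment above; assumed, not verified here.
PING_ENDPOINTS = {
    "google": "https://www.google.com/webmasters/sitemap/ping",
    "bing": "https://www.bing.com/ping",
}


def ping_urls(sitemap_url):
    """Return a dict mapping each search engine name to its ping URL."""
    query = urlencode({"sitemap": sitemap_url})
    return {name: f"{endpoint}?{query}" for name, endpoint in PING_ENDPOINTS.items()}
```

A user could then open the resulting URL in a browser or fetch it with any HTTP client after uploading a new sitemap.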