jekyll / jekyll-sitemap

Jekyll plugin to silently generate a sitemaps.org compliant sitemap for your Jekyll site
http://rubygems.org/gems/jekyll-sitemap
MIT License
949 stars 134 forks source link

Avoid overwriting an existing robots.txt #246

Closed jveillet closed 5 years ago

jveillet commented 5 years ago

Hi, and thank you for making this awesome Gem.

Here is a PR that attempts to do one thing: If I have already a robots.txt file with custom values at the root of the Jekyll site, then the Generator will take the content of this custom robots.txt and append it to the robots.txt template.

I am not sure it really fix the issue described in #189, so let me know if something doesn't feel right. Thanks for taking the time to look!

ashmaroli commented 5 years ago

@jveillet Thank you for submitting this PR, but I think the proposed changes does more than necessary to achieve the desired result. To me the suggestion in https://github.com/jekyll/jekyll-sitemap/issues/189#issuecomment-330242639 looks sufficient. Have you tried implementing just that..?

jveillet commented 5 years ago

@ashmaroli Thanks for reviewing. Maybe I've come too far with my implementation, I will modify the PR to just do what was suggested into the related issue.

For what I've tested so far the solution seems to work, I am trying to figure out how to modify the tests in accordance.

My only concern is that by doing this: OK we are not overriding anymore the user-defined robots.txt, but we are not either writing the sitemap URL into it, leaving the user to manually add it into her/his own robots.txt.

ashmaroli commented 5 years ago

we are not either writing the sitemap URL into it, leaving the user to manually add it into her/his own robots.txt

Yes, that's indeed the primary intention here. If a user is maintaining a source-file for their robots.txt we (this plugin) shouldn't bother adding a proper robots file for them. The responsibility is entirely on the user.

The situation here is, say I have a site where all of my "standalone pages" like index.html, about-us.html, etc are sourced from files within a directory named _pages. If the same directory also contains a source-file that generates a file at path _site/robots.txt, then jekyll-sitemap shouldn't bother about its contents. The maximum we should do is document that the following content in their robot's source file will help Jekyll include the proper URL to their sitemap:

---
---

Sitemap: {{ "sitemap.xml" | absolute_url }}
jveillet commented 5 years ago

Sorry for the delay for fixing the test, it took me a while to realize I had to generate the tests in a different directory in order to make it work.

jveillet commented 5 years ago

@ashmaroli Anything else I can do to help move this forward?

ashmaroli commented 5 years ago

@jveillet I've amended your commit messages to correctly reflect the intention of each commit. In addition to that, I've refactored the tests for this pull request to test valid scenarios.

You need not do anything more till further review.

mattr- commented 5 years ago

@jekyllbot: merge +minor