benbalter / sitemap-parser

Ruby Gem to parse sitemaps.org compliant sitemaps
MIT License
29 stars, 46 forks

Add User-Agent to fetch_remote_sitemap request #29

Closed: tamaloa closed this 11 months ago

tamaloa commented 11 months ago

This adds a default User-Agent to requests when fetching remote sitemaps. Also, if an error occurs, the response code returned for the sitemap URL is now shown.

In our experience, many web servers do not respond with their sitemap if no User-Agent is set. These are a few example sites for which the sitemap was previously NOT retrievable:

With the attached change we could successfully retrieve the sitemaps for the above examples.
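For readers without the diff at hand, the idea can be sketched roughly as below. This is only an illustration using Ruby's standard `Net::HTTP`, not the code from this PR or the gem's actual implementation: the constant `DEFAULT_USER_AGENT`, its value, the helper names, and the error wording are all hypothetical.

```ruby
require 'net/http'
require 'uri'

# Hypothetical default; the string actually used in the PR is not shown here.
DEFAULT_USER_AGENT = 'SitemapParserExample/1.0'

# Build a GET request that always carries an explicit User-Agent header.
# Returns the parsed URI together with the request object.
def build_sitemap_request(url, user_agent: DEFAULT_USER_AGENT)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = user_agent
  [uri, request]
end

# Error message that includes the HTTP response code (wording is an assumption).
def fetch_error_message(url, code)
  "Could not fetch sitemap from #{url} (HTTP response code: #{code})"
end

# Fetch the sitemap body, raising with the response code on failure.
def fetch_sitemap_with_user_agent(url, user_agent: DEFAULT_USER_AGENT)
  uri, request = build_sitemap_request(url, user_agent: user_agent)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  raise fetch_error_message(url, response.code) unless response.is_a?(Net::HTTPSuccess)

  response.body
end
```

Keeping the request construction separate from the network call makes the header easy to check in a test without contacting a real server.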

tamaloa commented 11 months ago

Sorry for the failing tests - I will look into those (I suppose it's the changed exception message).

tamaloa commented 11 months ago

Tests should now be green again

tamaloa commented 11 months ago

And now I also checked for style.

Note: I excluded test files from the rubocop length check in a separate commit, in case you want to cherry-pick (and fix the length by splitting up the tests).