Open yamamushi opened 7 years ago
As a workaround, you could use your own transport by implementing the http.RoundTripper interface to set the User-Agent header, like:
    package main

    import (
        "log"
        "net/http"

        "github.com/mmcdole/gofeed"
    )

    type UserAgentTransport struct {
        http.RoundTripper
    }

    func (c *UserAgentTransport) RoundTrip(r *http.Request) (*http.Response, error) {
        r = r.Clone(r.Context()) // a RoundTripper should not modify the caller's request
        r.Header.Set("User-Agent", "<platform>:<app ID>:<version string> (by /u/<reddit username>)")
        return c.RoundTripper.RoundTrip(r)
    }

    func main() {
        fp := gofeed.NewParser()
        fp.Client = &http.Client{
            Transport: &UserAgentTransport{http.DefaultTransport},
        }
        if _, err := fp.ParseURL("https://www.reddit.com/r/games/.rss"); err != nil {
            log.Fatal(err)
        }
    }
The User-Agent format <platform>:<app ID>:<version string> (by /u/<reddit username>) is the one suggested by the Reddit API documentation.
@bogatuadrian Thank you very much. This was really useful!
Expected behavior
Parsing https://www.reddit.com/r/games/.rss should work with an appropriate delay in making requests (Reddit asks for 2 seconds between bot requests).
To describe the issue further: this could be resolved if we had the option of defining our own User-Agent string (or any other headers, for that matter) when calling gofeed.ParseURL(url string) or when constructing our parser with gofeed.NewParser().
Actual behavior
Returns 429 Too Many Requests, as Reddit filters requests that do not have user-agent strings.
The first request will work, after which Reddit will block all new requests for a period of time.
Steps to reproduce the behavior
Note: Please include any links to problem feeds, or the feed content itself!