jagrosh / MusicBot

🎶 A Discord music bot that's easy to set up and run yourself!
https://jmusicbot.com
Apache License 2.0
5.4k stars 2.58k forks

Add User-Agent Header to Jsoup Connections in Transforms #1668

Open jerichosy opened 3 months ago

jerichosy commented 3 months ago

This pull request...

Description

Transforms currently fetch a source's URL without specifying a user-agent header. This small PR adds .userAgent("Mozilla") to the line that fetches the URL's Document through the Jsoup connection. I hardcoded the value because I saw the same practice elsewhere in the codebase. This could be improved by allowing the user-agent to be specified in the config as part of each transform.

Purpose

When a transform fetches a source, some servers reject the request (e.g. with 403 Forbidden) because it lacks a user-agent header. The fix sets the user-agent to "Mozilla" on the Jsoup connection before fetching the page, so that roundabout loading works for sources that block requests without a user-agent header.*

*Assuming the server accepts "Mozilla" as a valid user-agent value. The source I'm using does.

Relevant Issue(s)

N/A (not sure if I should have created an issue first)

jagrosh commented 3 months ago

Instead of a constant value here, maybe we should pick a good default and then allow setting the value within each transform.
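
One way the "default plus per-transform override" idea could look is sketched below. This is only an illustration, not the project's code: the helper class TransformUserAgent, the settings key "useragent", and the use of a plain Map as a stand-in for a transform's settings are all assumptions, and the default value simply reuses the "Mozilla" string from this PR.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: chooses the User-Agent for a transform's page fetch. */
final class TransformUserAgent {
    // Assumed default; this PR hardcodes "Mozilla", and a maintainer may prefer another value.
    static final String DEFAULT_USER_AGENT = "Mozilla";

    private TransformUserAgent() {}

    /**
     * Returns the transform's own "useragent" setting when it is present and
     * non-blank, otherwise falls back to the shared default.
     * The key name "useragent" is an assumption made for this sketch.
     */
    static String resolve(Map<String, String> transformSettings) {
        return Optional.ofNullable(transformSettings.get("useragent"))
                .map(String::trim)
                .filter(value -> !value.isEmpty())
                .orElse(DEFAULT_USER_AGENT);
    }

    // Assumed call site inside the transform, using Jsoup's Connection.userAgent(String):
    //   Document doc = Jsoup.connect(url)
    //           .userAgent(TransformUserAgent.resolve(settings))
    //           .get();
}
```

With this shape, a transform that omits the setting behaves the same as the hardcoded version in this PR, while a transform whose source rejects "Mozilla" can supply its own value.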