shadowmoose / RedditDownloader

Scrapes Reddit to download media of your choice.

Headless support #46

Open Budgiebrain994 opened 5 years ago

Budgiebrain994 commented 5 years ago

I set up RedditDownloader on a Raspberry Pi and hit a few roadblocks. I run it headless without an X server, so browser auth was a bit of an issue. I'll go over it here so others who hit the same issues can find a way to set the app up. First, I want to preface this by saying it's an absolutely stellar tool; I'm glad to have it working and love the work done on it. I can see that some of the issues I found probably won't be easily solvable.

The first issue was that I can't bind the webserver to localhost, as that prevents any remote connections from reaching it. That's fine, I just changed it to the Pi's local network IP in settings.json.

I found that port 7505 conflicts with OpenVPN's management interface, which is unfortunate, so I switched to port 7506. However, the current React OAuth launch logic doesn't support using any other port (just got a blank page) so I disabled OpenVPN just for setup.
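A quick way to spot this kind of conflict before launching is to try binding the port yourself. The following is a minimal sketch using only the Python standard library; the helper name `port_is_free` is made up for illustration and is not part of RMD.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
    return True


if __name__ == "__main__":
    # 7505 is the port mentioned above; 7506 is the alternative that was tried.
    for candidate in (7505, 7506):
        state = "free" if port_is_free("127.0.0.1", candidate) else "in use"
        print(f"port {candidate}: {state}")
```

Note that this only checks the address you pass in. A service bound to a different interface on the same port may not show up as a conflict.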

Having launched the webserver and navigated to the site, I attempted auth but got some issues from Reddit regarding an invalid URL. It turns out that changing localhost to your local IP causes the redirect_uri parameter to mismatch against RMD's reddit app redirect_uri setting (which is set up for localhost).
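To make the mismatch concrete, here is a rough sketch of how the first OAuth step's URL is put together against Reddit's authorize endpoint. The client ID, the `/authorize` callback path, the example LAN address, and the scopes are all placeholders chosen for illustration; they are not RMD's actual values.

```python
from urllib.parse import parse_qs, urlencode, urlparse

AUTHORIZE_ENDPOINT = "https://www.reddit.com/api/v1/authorize"


def build_authorize_url(client_id, redirect_uri, state, scopes):
    """Build the URL the browser is sent to for the first OAuth step."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "state": state,
        # Must be the same URI that is registered on the Reddit app.
        "redirect_uri": redirect_uri,
        "duration": "permanent",  # request a refresh token, not a one-hour token
        "scope": " ".join(scopes),
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def redirect_uri_of(url):
    """Pull the redirect_uri parameter back out of an authorize URL."""
    return parse_qs(urlparse(url).query)["redirect_uri"][0]


if __name__ == "__main__":
    registered = "http://localhost:7505/authorize"    # placeholder path
    requested = "http://192.168.1.50:7505/authorize"  # placeholder LAN address
    url = build_authorize_url("PLACEHOLDER_ID", requested, "random-state", ["identity", "read"])
    print(url)
    print("matches registered app:", redirect_uri_of(url) == registered)
```

Because the host is part of the URI, swapping localhost for a LAN IP yields a different string from the one registered on the app, which is consistent with the invalid URL error described above.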

This meant that I had to re-use an old developer app I created and change its redirect_uri to my local IP and change the client_key to my app's client key in settings.json. With this I was able to obtain a 302 redirect from Reddit containing the authorization code.

From here, RMD attempted the second step of OAuth, but because I had changed the client key to my own app, and RMD doesn't include a client secret, Reddit returned a 401 Unauthorized response. So instead I completed the second step manually via Postman with the client key, client secret, authorization code, state and redirect URI parameters to retrieve a refresh token.
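For anyone without Postman, the manual second step could look roughly like the sketch below, which only builds the request so it can be inspected before sending. It assumes Reddit's token endpoint with HTTP Basic auth (client key as username, client secret as password); the function name and all argument values are placeholders.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_ENDPOINT = "https://www.reddit.com/api/v1/access_token"


def build_token_request(client_id, client_secret, code, redirect_uri, user_agent):
    """Build (but do not send) the POST that trades an auth code for tokens."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        # Has to be the same redirect_uri that was used in the first step.
        "redirect_uri": redirect_uri,
    }).encode("ascii")
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    return Request(
        TOKEN_ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "User-Agent": user_agent,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Sending it is then a matter of passing the request to `urllib.request.urlopen` and parsing the JSON body; when the first step asked for a permanent duration, the response should include a `refresh_token` field.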

Finally, after plugging the refresh token into settings.json, further requests were still failing because the client secret was needed. So I modified classes/static/praw_wrapper.py and classes/static/settings.py to accept a new key in settings.json, auth.rmd_client_secret, which allows a client secret to be specified and sent with each request. This allowed full authentication to complete successfully.
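The change described above might be sketched as follows. This is not the actual patch: only `auth.rmd_client_secret` comes from the description, while the nested layout of settings.json and the `rmd_client_key` and `refresh_token` key names are assumptions made for illustration. The resulting dictionary uses keyword names that `praw.Reddit` accepts.

```python
def praw_kwargs_from_settings(settings: dict, user_agent: str) -> dict:
    """Map a parsed settings.json dict to keyword arguments for praw.Reddit."""
    auth = settings.get("auth", {})
    return {
        "client_id": auth["rmd_client_key"],     # hypothetical key name
        "refresh_token": auth["refresh_token"],  # hypothetical key name
        "user_agent": user_agent,
        # Optional secret; falls back to None when the key is absent or empty,
        # which is how an app registered without a secret is configured.
        "client_secret": auth.get("rmd_client_secret") or None,
    }


if __name__ == "__main__":
    example = {
        "auth": {
            "rmd_client_key": "PLACEHOLDER_ID",
            "rmd_client_secret": "PLACEHOLDER_SECRET",
            "refresh_token": "PLACEHOLDER_TOKEN",
        }
    }
    print(praw_kwargs_from_settings(example, "rmd-headless-example/0.1"))
```

With PRAW installed, the result would be used as `praw.Reddit(**kwargs)`. Keeping the secret optional means a settings file produced by the default flow keeps working unchanged.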

From this, I learnt that increasing headless support is going to be very challenging due to the redirect_uri parameter not being manipulatable from the app itself. Much of the work was required as I was using my own developer app to run RMD due to the control required over the redirect_uri parameter.

Thinking through the process now, it would be far more robust to simply set up RMD on another machine with a browser, then move settings.json over to the headless machine. I'm aware rclone supports this kind of behaviour and suggests it in its wizard. Perhaps RMD could suggest the same to the user during setup. Given that the rclone developers have thought this through and implemented it, it may be the best solution. I don't think reverting to the old RMD behaviour, where the user is required to create their own app, is sustainable, and I appreciate the move away from it.

shadowmoose commented 5 years ago

Hey, thanks for the extensive writeup!

Before I address the specific issues, I want to comment now that RMD's current UI support is certainly not as headless-friendly as I'd prefer. I intend to re-implement some of the shell-based options that were lost as a result of changing the command line options. This has actually been the main reason that the WebUI hasn't made an official release as of yet. I use RMD on an automated setup too, and I want it to work as smoothly as possible. If anybody has any suggested functionality, please let me know!

That said, I'll try to address the rest of your post now:

The localhost interface isn't a perfect default, and I should probably enable some form of first-time startup prompt for it. I want the UI to generally work out of the box for less experienced users, but it might be prudent to let shell-savvy users change this behavior without needing to edit the settings file after launch. My current "solution" of overriding an option in the command line is probably not ideal.

Port 7505 being a conflict with another popular project is really unlucky. I'll probably need to change that default. The port has to match the default OAuth app registered at Reddit, so it should be one that very few popular programs will use. I had thought that port was relatively unused, but it seems I was mistaken. I'm actually surprised with myself that I overlooked OpenVPN.

As for OAuth, it seems you ran into the same problem I did. Since the hostname/port needs to be specified in the registered Reddit Application, the only option to make it work (without registering an app for each user) is to use a predetermined default. I'll probably need to add some advanced config, as you did, to allow users to provide their own registered app credentials as a fallback if they don't have a graphical environment to initially handle the auth.

Rclone's solution was actually what I had in mind as I worked on the auth. The settings file is portable to any machine, and only needs to be authorized once. Requiring a second machine isn't great, but I think it's better than requiring much more complicated setup. That's something I definitely should've documented somewhere. I've made so many changes since the last release, the documentation has fallen behind quite a bit. I'll need to touch that up before release.

Thanks a lot for documenting your experience. It helps me pick up on issues that I've overlooked due to my knowledge of RMD's internal workings.

JBardey commented 5 years ago

For what it's worth, I set this up headlessly on my NAS by using an SSH tunnel to present the WebUI to a computer with a browser. Creating an account and downloading files from a subreddit worked perfectly. If you're unfamiliar, you can create an SSH tunnel with the command below and then open localhost:7505/index.html in the browser:

    ssh -L 7505:localhost:7505 jason@<nas-ip>

I have since set the host option in settings to the IP of my NAS, and it seems to work fine.
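If you would rather script the tunnel than type it each time, a small sketch along these lines builds the same command from Python. The helper name and the `nas.example` host are placeholders; the `-N` flag (forward ports without running a remote command) is an addition that is not in the command above.

```python
import shlex


def ssh_tunnel_command(user, host, local_port=7505, remote_port=7505):
    """Return the argv list for a local port forward to the headless machine."""
    return [
        "ssh",
        "-N",  # forward only; do not open a remote shell
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{host}",
    ]


if __name__ == "__main__":
    cmd = ssh_tunnel_command("jason", "nas.example")
    # Print the command so it can be reviewed before running it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` keeps the tunnel open until it is interrupted. While it is up, the browser on the local machine reaches the WebUI at localhost:7505, so the default localhost redirect URI keeps working.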

Budgiebrain994 commented 5 years ago

Wow. What an elegant solution - I love it. Thank you @JBardey

shadowmoose commented 5 years ago

Yeah, SSH Port Forwarding is a good solution.

The entire UI is built to be remotely accessible, and temporarily opening a remote port to use the "localhost" URI for the initial setup is a great idea. After authentication, RMD can continue to run without the tunneled port, and should never require further OAuth configuration.

I should definitely document that option as well, for headless servers. Thanks for testing that out. I'm keeping notes to add all of this to the new user guide for the upcoming release version. If anybody comes up with anything else that may be useful to document, please feel free to let me know.

In the meantime, I'll leave this issue open so I remember to circle back and document it all before release. Thanks again.