mhogomchungu / media-downloader

Media Downloader is a Qt/C++ front end to yt-dlp, youtube-dl, gallery-dl, lux, you-get, svtplay-dl, aria2c, wget and safari books.
GNU General Public License v2.0

[gallery-dl] UnicodeEncodeError #258

Closed CLInewb closed 1 year ago

CLInewb commented 1 year ago

Hello,

I'm glad to have found this neat project. I'm not very comfortable with command lines, and this works great with yt-dlp; the automatic updating is especially awesome! I was also really excited to see that it supports gallery-dl, because I've heard of it and wanted to use it many times, but always avoided it because it has no GUI.

Unfortunately, trying to use gallery-dl within media-downloader reports a 'UnicodeEncodeError', while running gallery-dl via Windows cmd seems to work normally. Both attempts and the error can be seen here:

[image: help]

I would love to use gallery-dl within media-downloader, I find non-GUI installations very difficult and also wouldn't know how to update gallery-dl, ffmpeg etc. in the future. 😢

Here is the verbose output:

[media-downloader] cmd: "C:/Users/null/Downloads/MediaDownloader-2.8.0/local/bin/gallery-dl.exe" "--verbose" "-o" "output.mode=terminal" "https://www.reddit.com/r/ThugSauces/"
[gallery-dl][debug] Starting DownloadJob for 'https://www.reddit.com/r/ThugSauces/'
[reddit][debug] Using RedditSubredditExtractor for 'https://www.reddit.com/r/ThugSauces/'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): oauth.reddit.com:443
[urllib3.connectionpool][debug] https://oauth.reddit.com:443 "GET /r/ThugSauces/.json?limit=100&raw_json=1 HTTP/1.1" 200 71280
[downloader.ytdl][debug] [Reddit] yc2aj12yo9na1: Downloading m3u8 information
[downloader.ytdl][debug] [Reddit] yc2aj12yo9na1: Downloading MPD manifest
# .\gallery-dl\reddit\ThugSauces\11p98...are getting to creative with these.mp4
[downloader.ytdl][debug] [Reddit] 1rtfztz9a9na1: Downloading m3u8 information
[downloader.ytdl][debug] [Reddit] 1rtfztz9a9na1: Downloading MPD manifest
[reddit][error] An unexpected error occurred: UnicodeEncodeError - 'cp932' codec can't encode character '\U0001f622' in position 74: illegal multibyte sequence. Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
[reddit][debug] 
Traceback (most recent call last):
  File "gallery_dl\job.pyc", line 97, in run
  File "gallery_dl\job.pyc", line 141, in dispatch
  File "gallery_dl\job.pyc", line 266, in handle_url
  File "gallery_dl\job.pyc", line 404, in download
  File "gallery_dl\downloader\ytdl.pyc", line 83, in download
  File "gallery_dl\downloader\ytdl.pyc", line 114, in _download_video
  File "gallery_dl\output.pyc", line 366, in start
  File "gallery_dl\output.pyc", line 248, in stdout_write_flush
UnicodeEncodeError: 'cp932' codec can't encode character '\U0001f622' in position 74: illegal multibyte sequence
[media-downloader] Download Failed

Hopefully this is possible to repair! 🥺💦 Thank you very much!

mhogomchungu commented 1 year ago

  1. Your version of Media Downloader is not the latest; update to the latest version first.
  2. Make sure you are also using the latest version of gallery-dl (1.25.0).
  3. What version of Windows are you using?

CLInewb commented 1 year ago

Hi, thank you for your reply! I've updated to the latest Media Downloader and gallery-dl versions, and sadly the issue still persists.

Steps from the start: I launched the newly downloaded Media Downloader version. Since gallery-dl is not pre-installed, I grabbed the .json from here: https://github.com/mhogomchungu/media-downloader/wiki/Extensions#4-gallery-dl I then added the .json plugin from the 'Configure' tab. Afterwards, I picked the Dark Theme.

Before rebooting, I copied the console just in case:

[media-downloader] *****************************************************
[media-downloader] To Disable These Checks, Do The Following:-
[media-downloader] 1. Go To "Configure" Tab.
[media-downloader] 2. Go To "General Options" Sub Tab.
[media-downloader] 3. Uncheck "Show Version Info When Starting".
[media-downloader] *****************************************************
[media-downloader] Running in portable mode
[media-downloader] Download path: C:/Users/null/Downloads/MediaDownloader-2.9.0/Downloads
[media-downloader] Checking installed version of OpenSSL
[media-downloader] Found version: OpenSSL 1.1.1j  16 Feb 2021
[media-downloader] Checking installed version of yt-dlp
[media-downloader] Found version: 2023.03.04
[media-downloader] Checking installed version of aria2c
[media-downloader] Found version: 1.36.0
[media-downloader] Checking installed version of wget
[media-downloader] Found version: 1.21.3
[media-downloader] Checking installed version of ffmpeg
[media-downloader] Found version: n5.0-5-g426b7f48d9-20220308
[media-downloader] Checking installed version of aria2c
[media-downloader] Found version: 1.36.0
[media-downloader] Start Downloading gallery-dl ... ... ...
[media-downloader] Downloading: https://github.com/mikf/gallery-dl/releases/download/v1.25.0/gallery-dl.exe
[media-downloader] Destination: C:/Users/null/Downloads/MediaDownloader-2.9.0/local/bin/gallery-dl.exe.tmp
[media-downloader] Downloading gallery-dl: 12,12 MiB / 12,12 MiB (100.00%)
[media-downloader] Download complete
[media-downloader] Renaming file to: C:/Users/null/Downloads/MediaDownloader-2.9.0/local/bin/gallery-dl.exe
[media-downloader] Checking installed version of gallery-dl
[media-downloader] Found version: 1.25.0

I tried to download with gallery-dl, but it's the same Unicode error as before.

[media-downloader] cmd: "C:/Users/null/Downloads/MediaDownloader-2.9.0/local/bin/gallery-dl.exe" "-o" "output.mode=terminal" "https://www.reddit.com/r/ThugSauces/"
[reddit][info] Requesting public access token
  .\gallery-dl\reddit\ThugSauces\11p98...are getting to creative with these.mp4

100%   7.16MB   1.68MB/s 

* .\gallery-dl\reddit\ThugSauces\11p98...are getting to creative with these.mp4
  .\gallery-dl\reddit\ThugSauces\11pcm9c funny south park clip.mp4

* .\gallery-dl\reddit\ThugSauces\11pcm9c funny south park clip.mp4
  .\gallery-dl\reddit\ThugSauces\11pc9...ing dreamybull in CapCut templates.mp4

* .\gallery-dl\reddit\ThugSauces\11pc9...ing dreamybull in CapCut templates.mp4
[reddit][error] An unexpected error occurred: UnicodeEncodeError - 'cp932' codec can't encode character '\U0001f622' in position 74: illegal multibyte sequence. Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
[media-downloader] Download Failed(ErrorCode=1)

Verbose version if needed:

[media-downloader] cmd: "C:/Users/null/Downloads/MediaDownloader-2.9.0/local/bin/gallery-dl.exe" "-o" "output.mode=terminal" "--verbose" "https://www.reddit.com/r/ThugSauces/"
[gallery-dl][debug] Starting DownloadJob for 'https://www.reddit.com/r/ThugSauces/'
[reddit][debug] Using RedditSubredditExtractor for 'https://www.reddit.com/r/ThugSauces/'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): oauth.reddit.com:443
[urllib3.connectionpool][debug] https://oauth.reddit.com:443 "GET /r/ThugSauces/.json?limit=100&raw_json=1 HTTP/1.1" 200 71421
[downloader.ytdl][debug] [Reddit] yc2aj12yo9na1: Downloading m3u8 information
[downloader.ytdl][debug] [Reddit] yc2aj12yo9na1: Downloading MPD manifest
# .\gallery-dl\reddit\ThugSauces\11p98...are getting to creative with these.mp4
[downloader.ytdl][debug] [Reddit] gfummxg1qana1: Downloading m3u8 information
[downloader.ytdl][debug] [Reddit] gfummxg1qana1: Downloading MPD manifest
# .\gallery-dl\reddit\ThugSauces\11pcm9c funny south park clip.mp4
[downloader.ytdl][debug] [Reddit] f4prr78m3cna1: Downloading m3u8 information
[downloader.ytdl][debug] [Reddit] f4prr78m3cna1: Downloading MPD manifest
# .\gallery-dl\reddit\ThugSauces\11pc9...ing dreamybull in CapCut templates.mp4
[downloader.ytdl][debug] [Reddit] 1rtfztz9a9na1: Downloading m3u8 information
[downloader.ytdl][debug] [Reddit] 1rtfztz9a9na1: Downloading MPD manifest
[reddit][error] An unexpected error occurred: UnicodeEncodeError - 'cp932' codec can't encode character '\U0001f622' in position 74: illegal multibyte sequence. Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
[reddit][debug] 
Traceback (most recent call last):
  File "gallery_dl\job.pyc", line 97, in run
  File "gallery_dl\job.pyc", line 141, in dispatch
  File "gallery_dl\job.pyc", line 266, in handle_url
  File "gallery_dl\job.pyc", line 404, in download
  File "gallery_dl\downloader\ytdl.pyc", line 83, in download
  File "gallery_dl\downloader\ytdl.pyc", line 114, in _download_video
  File "gallery_dl\output.pyc", line 366, in start
  File "gallery_dl\output.pyc", line 248, in stdout_write_flush
UnicodeEncodeError: 'cp932' codec can't encode character '\U0001f622' in position 74: illegal multibyte sequence
[media-downloader] Download Failed(ErrorCode=1)

It seems that Media-Downloader with the gallery-dl engine cannot write files that come from posts containing Unicode symbols. (Edit: Although I noticed that it can download videos that have Unicode/Emoji titles with yt-dlp just fine.) In contrast, gallery-dl run from the default Windows cmd worked fine and I was able to download my link.
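
The traceback above ends in `stdout_write_flush`, which suggests the failure happens while printing the file name rather than while saving the file. A minimal sketch of that failure in plain Python, assuming the output stream uses the `cp932` code page named in the error (a likely reason for the difference from cmd is that Python tends to fall back to the system code page when its output is captured by another program instead of going to a console window). The in-memory stream and the file name are stand-ins for illustration, not gallery-dl code:

```python
import io

EMOJI = "\U0001f622"  # the character named in the error message

# In-memory stand-in for a piped stdout that uses the cp932 code page.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="cp932")

try:
    # Hypothetical file name containing the emoji.
    stream.write("clip " + EMOJI + ".mp4")
    stream.flush()
    failed = False
    reason = None
except UnicodeEncodeError as exc:
    failed = True
    reason = exc.reason

print("write failed:", failed)
print("reason:", reason)
```

The emoji has no representation in cp932, so the write raises before anything reaches the stream.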

I tried to look in gallery-dl's instructions for an option to skip Unicode characters in filenames, or to change how filenames are created, but as usual I can't really follow command-line documentation. (So many settings!)

https://github.com/mikf/gallery-dl/blob/master/docs/formatting.md
https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst

I'm using Windows 10 Version 1809 (Build 17763.4010). It's a 'LTSC' Edition - despite having the latest Windows Updates installed, it differs a little from Home/Pro/Enterprise.

Sorry for the walls of text! But I figured it could help troubleshooting.

mhogomchungu commented 1 year ago

This issue has already been reported in the gallery-dl bug tracker:

https://github.com/mikf/gallery-dl/issues/3765

mhogomchungu commented 1 year ago

In the "download options" text field, add -o output.stdout=utf-8 and try again.
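
As a rough illustration of why this option helps: switching the output stream to UTF-8 means every Unicode character, emoji included, can be encoded. The sketch below is plain Python 3.7+ and uses an in-memory stream as a stand-in for a piped stdout; it assumes the option amounts to something like `reconfigure()` on the stream and is not taken from gallery-dl's source. The file name is hypothetical.

```python
import io

# In-memory stand-in for a piped stdout that was opened with cp932.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="cp932")

# Switch the stream to UTF-8 before anything is written to it.
# errors="replace" is a defensive choice for this sketch.
stream.reconfigure(encoding="utf-8", errors="replace")

# Hypothetical file name containing the emoji from the error message.
stream.write("clip \U0001f622.mp4")
stream.flush()

written = raw.getvalue()
print(written)
```

With the stream set to UTF-8 the write succeeds and the emoji is emitted as a four-byte sequence.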

mhogomchungu commented 1 year ago

Can you confirm that it works so that I can close this one?

CLInewb commented 1 year ago

> In the "download options" text field, add -o output.stdout=utf-8 and try again.

It works perfectly! Thank you so so much! I saw the issue in gallery-dl beforehand, but since I didn't have any errors with cmd, I didn't think of applying this.

I would love to donate to you! I will reach out to the Gmail address listed in the documentation as soon as I can.

With the best wishes!