akhilrex / podgrab

A self-hosted podcast manager/downloader/archiver tool to download podcast episodes as soon as they become live with an integrated player.
GNU General Public License v3.0
1.55k stars, 88 forks

UTF-8 normalization #207

Open jtagcat opened 2 years ago

jtagcat commented 2 years ago


To Reproduce

Steps to reproduce the behavior:

  1. Add feed https://feeds.soundcloud.com/users/soundcloud:users:317878409/sounds.rss
  2. Filename of downloaded: Miljardi-dollari-küsimus-Timo-Rein.mp3
  3. Wanted: Miljardi-dollari-küsimus-Timo-Rein.mp3

(They are different to computers!)
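For readers who see the two filenames above as identical: a likely explanation (an assumption, since the rendered text does not show it) is that one name uses the precomposed character ü (U+00FC, normalization form NFC) and the other uses a plain u followed by a combining diaeresis (U+0308, form NFD). Both are valid UTF-8, but they are different byte sequences. A minimal Python sketch, with both spellings written as explicit escapes so the difference survives copy and paste:

```python
import unicodedata

# Precomposed "ü" (U+00FC): the NFC spelling.
nfc_name = "Miljardi-dollari-k\u00fcsimus-Timo-Rein.mp3"
# "u" followed by a combining diaeresis (U+0308): the NFD spelling.
nfd_name = "Miljardi-dollari-ku\u0308simus-Timo-Rein.mp3"

# The two strings render the same but do not compare equal.
print(nfc_name == nfd_name)

# Their UTF-8 encodings differ, so a filesystem that compares raw bytes
# treats them as two separate files.
print(nfc_name.encode("utf-8"))
print(nfd_name.encode("utf-8"))

# Normalizing both to the same form makes them equal again.
print(unicodedata.normalize("NFC", nfd_name) == nfc_name)
```

Which of the two forms podgrab actually writes cannot be determined from the issue text alone; the sketch only shows why the names are "different to computers".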

Additional context

Syncthing only syncs UTF-8 items. It's annoying elsewhere as well.

akhilrex commented 2 years ago

Is there any specific reason you want to retain the exact file name? Podgrab applies rather strict name sanitization so that downloading and saving doesn't break. The original name is retained as-is in the database as well as in the feed that the system generates.

Can you share the use case with me?

jtagcat commented 2 years ago

I don't want to retain the exact filename. I want the result to be UTF-8.

Right now:

  1. podgrab downloads file
  2. Syncthing renames it to UTF-8 (2. -> 3.)
  3. podgrab re-downloads the file
  4. conflict (2 files: Miljardi-dollari-küsimus-Timo-Rein.mp3 and Miljardi-dollari-küsimus-Timo-Rein.mp3 exist (they are different!))
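One way to break the loop described above is to normalize names to a single form before writing, and to compare names under that form when checking whether an episode already exists. The following is a language-agnostic sketch in Python, not podgrab's actual code: the helper names `to_nfc` and `find_existing` are hypothetical, and NFC is assumed to be the form the sync tool expects.

```python
import os
import tempfile
import unicodedata


def to_nfc(name: str) -> str:
    """Return the name in NFC form (hypothetical helper)."""
    return unicodedata.normalize("NFC", name)


def find_existing(directory: str, wanted: str):
    """Return the on-disk entry equal to `wanted` up to normalization, or None."""
    target = to_nfc(wanted)
    for entry in os.listdir(directory):
        if to_nfc(entry) == target:
            return entry
    return None


# Demo: a file stored under the NFC name is found when asked for by the NFD name,
# so a downloader using this check would not fetch the episode a second time.
nfc_name = "Miljardi-dollari-k\u00fcsimus-Timo-Rein.mp3"
nfd_name = "Miljardi-dollari-ku\u0308simus-Timo-Rein.mp3"

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, nfc_name), "wb").close()
    match = find_existing(tmp, nfd_name)
    print(match is not None)
```

Applying `to_nfc` inside the existing sanitization step would also avoid the rename in step 2, since the file would already be stored under the name the sync tool wants.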