acgonzales / pydeezer

A package to search and download musics on Deezer.
The Unlicense
53 stars 14 forks source link

Unicode characters not written to filename #26

Open hejops opened 3 years ago

hejops commented 3 years ago

Any non-English characters in a track title are not written to the filename. If there are several such tracks in an album, it will be impossible to download all tracks. One way to work around this would be to add the track number before the track title (which is what smloadr does).

Example: https://www.deezer.com/us/album/245806342

To reproduce:

from pydeezer import Deezer
from pydeezer import Downloader

arl = SOME_ARL
deezer = Deezer(arl=arl)
album = deezer.get_album("245806342")
tracks = [track["id"] for track in album["tracks"]["data"]]
dir = "."

Downloader(
    deezer,
    tracks,
    dir,
    quality=track_formats.MP3_320,
    concurrent_downloads=4,
).start()
hejops commented 3 years ago

What is the use case for cleaning up filenames in this way?

https://github.com/acgonzales/pydeezer/blob/e894b2a14d13bc33d83bb71d75d0f6302357b5d8/pydeezer/util.py#L171-L190

As described above, such an approach probably causes more problems than it solves. At least on a typical NTFS file system, this should be the only filename cleanup necessary:

def clean_filename(filename):
    return re.sub(r'[<>:"/\|?*]', "", filename)