silelmot closed this issue 5 years ago
It's been a while but I will fire this up and take a look. I know gmusicapi likes to pass things around in Unicode so in theory it should work, especially if the artist/album path works.
As far as I can see from my testing, these characters should be handled properly. According to the API docs, a unicode-type string is passed back as the filename when download_song is called: http://unofficial-google-music-api.readthedocs.io/en/latest/reference/musicmanager.html#gmusicapi.clients.Musicmanager.download_song This unicode string is passed to os.path.join to build the full file path, and the joined path is unicode as well.
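The type behaviour described above can be checked without writing anything to disk. This is only a minimal sketch: the directory and file name are made up for illustration, and the `u''` literal stands in for the unicode filename that download_song is documented to return.

```python
# -*- coding: utf-8 -*-
import os

# `unicode` on Python 2, `str` on Python 3
text_type = type(u"")

directory = "/tmp/testalbum"   # hypothetical directory, illustration only
name = u"Ä Ö Ü and ß.mp3"      # stand-in for the name returned by download_song

joined = os.path.join(directory, name)

# Joining a text-type name onto the directory keeps the result text-type,
# and the special characters are still present at the end of the path.
print(type(joined))
print(isinstance(joined, text_type))
print(joined.endswith(name))
```

If both checks print True, the join itself is not what damages the characters.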
Could this be due to the tool you are using to view the directory or some other external tool?
You could always try the following in a terminal:
import os
unicode = u'Ä Ö Ü and ß'
path = "/home/
/testalbum
joined = os.path.join(path, unicode)
print(type(joined))
print(joined)=
with open(joined, 'wb') as f:
f.write(b'0x01')
See what file name it makes ...
when i try your testscript i get
SyntaxError: Non-ASCII character '\xc3' in file test.py on line 2,
i ran gmpydl on another pc, and there is the same problem.
Stick these two lines at the top of the test script
#!/usr/bin/env python
# -*- coding: utf-8 -*-
Or run them from an interactive python prompt
Also, you could try manually naming the file and check that the file manager can display the character it's struggling with.
No problem here with manual renaming. the script still doesn't work, and i have no idea about python. :(
File "test.py", line 9
print(joined)=
^
SyntaxError: invalid syntax
Take the = off the end
File "test.py", line 11
f.write(b'0x01')
^
IndentationError: expected an indented block
yep - if you're not running it interactively then you will need to indent the f.write line with a tab or 4 spaces
ok. i don't know what this is supposed to show me. the last command still doesn't work, even interactively, but Ä, Ö and Ü were printed correctly in python itself.
does it create a file in home/testalbum with the correct characters?
no, i think because f.write(b'0x01') is still causing an error.
what is the error? Indentation?
yes, still the same. i have now tried it in the python console, but it gave me the same error as before
if it's in the console, then you don't need to add indentation, it should do it for you
i just typed in
import os
unicode = u'Ä Ö Ü and ß'
path = "/home/pc1/testalbum"
joined = os.path.join(path, unicode)
print(type(joined))
print(joined)
with open(joined, 'wb') as f:
f.write(b'0x01')
and got the error
f.write(b'0x01')
^
IndentationError: expected an indented block
Ok, if you are putting this into a file (i.e. not interactively), then it should look like this: https://gist.github.com/stevenewbs/7532b02107d54e5613e02d1422590395
For interactive use, just open a terminal, type "python" and then enter each line one by one
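The gist itself is not reproduced here. As a rough sketch of what a file version with the coding header and a correctly indented `with` body could look like (assuming a throwaway temporary directory in place of the real album folder):

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import tempfile

name = u"Ä Ö Ü and ß"

# Assumption: a temporary directory stands in for the real album folder.
directory = tempfile.mkdtemp()
joined = os.path.join(directory, name)

print(type(joined))
print(joined)

with open(joined, "wb") as f:
    # The body of a `with` block must be indented, otherwise Python
    # raises "IndentationError: expected an indented block".
    f.write(b"0x01")  # writes the four literal bytes "0x01"

# List the directory to see which file name was actually created.
print(os.listdir(directory))
```

The coding line matters only when the script is saved to a file under Python 2; without it, the non-ASCII literal triggers the SyntaxError reported earlier in the thread.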
ok, thanks for this simple script. now it writes the file and this is ok. every letter is right.
OK so now we know in principle that we can write unicode characters to filenames. Do you know which character is missing in the image above? I can't quite see it in the image, but the code looks like 009F (or 9F00), which is either https://www.fileformat.info/info/unicode/char/009f/index.htm or https://www.compart.com/en/unicode/U+9F00
The next thing to try is to take that character and add it to the unicode string in the script to test writing it.
After a bit of googling it looks like the correct character would be "ß" which is in the example code.
You could try adding
#!/usr/bin/env python
# -*- coding: utf-8 -*-
to the top of gmpydl.py to see if that helps. If not, it might be that the name of the song is encoded slightly differently on GPM, so that Python can't decode it nicely.... I'm hugely clutching at straws here as you might guess
ok, no, that didn't help. and i need to enter "#!/usr/bin/env python2"; with just python it isn't working at all. but with python2 i also get wrong characters for Ä, Ü and ß :(
maybe google and its way of transcoding this is the problem? i will try the original google music manager to see how it is handled there
edit: ok. google music manager does fine and also saves the ä (just a quick test)...
ok, I have added a few more unicode steps to see if this might help: https://gist.github.com/stevenewbs/7532b02107d54e5613e02d1422590395
Give this a try and let's see what happens. Beyond this, moving to Python 3 might solve all these issues - but that's another story
the testfile works great, but after adding
# -*- coding: utf-8 -*-
enc = sys.getfilesystemencoding()
print(enc)
to gmpydl the output is UTF8, but the files are wrong again. do you have any titles with these characters yourself, or does this only fail here on my pcs?
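A reported encoding of UTF-8 suggests the file system side can represent these names, which points back at the string being handed to it. A small sketch of that check follows; `can_encode` is a hypothetical helper for illustration and not part of gmpydl or gmusicapi.

```python
# -*- coding: utf-8 -*-
import sys


def can_encode(name, encoding):
    """Return True if `name` survives a round trip through `encoding`."""
    try:
        return name.encode(encoding).decode(encoding) == name
    except UnicodeError:
        # Raised when the encoding has no representation for a character.
        return False


fs_encoding = sys.getfilesystemencoding()
print(fs_encoding)

# With a UTF-8 file system encoding these characters can be represented;
# with plain ASCII they cannot.
print(can_encode(u"Ä Ö Ü and ß", fs_encoding))
print(can_encode(u"Ä Ö Ü and ß", "ascii"))
```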
When you say the test file works great, you mean it names the file correctly? But adding those lines above breaks it again?
i just started test.py and it writes the file "Ä Ö Ü and ß" in the testalbum. this works. but then i added those lines to gmpydl.py and hoped it would now save the music with the right characters too, but it didn't
ah right so the test file now works correctly. That is great news! Unfortunately there might be a bit more work than just adding those lines to gmpydl itself, but in theory it should be fixable. Glad we managed to find a solution.
Once I have made some updates, I will let you know and you can be the canary for the next version
i don't know if there is a bit of a misunderstanding, but your first test.py also worked.
Right so both versions of the test script worked just fine and produced the correct characters?
Right!
Ok give this branch a go - I made a few changes to specify unicode a bit more for some strings https://github.com/stevenewbs/gmpydl/blob/Unicode-fix/gmpydl.py
Thanks for your time, but it still doesn't work :(
i've just seen that the filenames in the logfile are correct.
edit: it seems something goes wrong with the way the filename is taken in line 202
filename, audio = api.download_song(song['id'])
if i put a "print title" inside the download_song definition, it is shown correctly in the terminal; a "print title" or "print filename" after line 202 prints the wrong characters.
after adding
filename = "%s/%02d - %s.mp3" % (path, song['track_number'], song['title'])
after the mentioned part, i get my characters, even special ones from edward grieg's songs like Ânne.
That is an epic spot - I was using the filename provided by GPM and not specifying unicode. Try the Unicode-fix branch of the code now and hopefully it should be fixed
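The idea behind the fix, building the name from the song metadata as a unicode string instead of reusing the filename returned by the download call, can be sketched in isolation. `build_filename` is a hypothetical helper and the sample song is made up; only the 'track_number' and 'title' keys come from the thread, and the actual code in the branch may differ.

```python
# -*- coding: utf-8 -*-


def build_filename(song):
    """Build '<track> - <title>.mp3' from song metadata as a text string."""
    # The u"" format string keeps the whole result unicode on Python 2.
    return u"%02d - %s.mp3" % (song["track_number"], song["title"])


# Made-up sample metadata for illustration.
song = {"track_number": 3, "title": u"Ânne"}
print(build_filename(song))
```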
I haven't tested it yet, but I think there may be a problem now if songs have slashes or backslashes in their names.
Ok, tested it.
like i thought there are problems with / and \ and other characters (if you use windows)
a fix would be to use
import re
and
filename = re.sub(r'[\\/*"<>\|%\^&]', '_', filename)
after the changed code.
i also deleted the "path" in filename. it gave me a strange file with the whole path in the filename
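Both points from this message, replacing problematic characters and keeping the directory out of the name, could be combined roughly as follows. This is a sketch under assumptions: `safe_filename` and `target_path` are hypothetical helpers, the character class is the one suggested above, and the directory is joined on only after the name has been cleaned so that its own separators are not replaced.

```python
# -*- coding: utf-8 -*-
import os
import re

# Character class taken from the suggestion in the thread.
UNSAFE_CHARS = re.compile(r'[\\/*"<>\|%\^&]')


def safe_filename(name):
    """Replace characters that are problematic in file names with '_'."""
    return UNSAFE_CHARS.sub(u"_", name)


def target_path(directory, song):
    """Clean only the file name, then join it onto the directory."""
    name = u"%02d - %s.mp3" % (song["track_number"], song["title"])
    return os.path.join(directory, safe_filename(name))


# Made-up sample metadata for illustration.
print(safe_filename(u"AC/DC \\ Live"))
print(target_path(u"music", {"track_number": 1, "title": u"a/b"}))
```

Cleaning the name before joining avoids the "whole path in the filename" effect, because the separators in the directory part never pass through the substitution.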
hey there, in germany, as in many other european countries, we have special letters like Ä, Ö, Ü and ß. they are not saved correctly in the titles of the songs; instead i get weird characters. the names of the folders, on the other hand, are ok.