amietn / vcsi

Create video contact sheets, thumbnails
MIT License

The file name is Chinese. I encountered an error. #60

Closed Teng closed 5 years ago

Teng commented 5 years ago

It's great. It lets me sort through my videos faster, and I love it. But I have a problem: when the file name is in Chinese, it doesn't work. When I renamed the file to an English name, it worked again.

$ vcsi 你好_世界.mp4 \
-t \
-w 850 \
-g 3x5 \
--end-delay-percent 20 \
-o output.png
Processing 你好_世界.mp4...
Sampling... 15/15
Composing contact sheet...
Traceback (most recent call last):
  File "/usr/local/bin/vcsi", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/vcsi/__init__.py", line 3, in main
    vcsi.vcsi.main()
  File "/usr/local/lib/python3.7/site-packages/vcsi/vcsi.py", line 1451, in main
    process_file(filename, args)
  File "/usr/local/lib/python3.7/site-packages/vcsi/vcsi.py", line 1532, in process_file
    image = compose_contact_sheet(media_info, selected_frames, args)
  File "/usr/local/lib/python3.7/site-packages/vcsi/vcsi.py", line 862, in compose_contact_sheet
    template_path=args.metadata_template_path)
  File "/usr/local/lib/python3.7/site-packages/vcsi/vcsi.py", line 785, in prepare_metadata_text_lines
    text=remaining_chars)
  File "/usr/local/lib/python3.7/site-packages/vcsi/vcsi.py", line 749, in max_line_length
    text_width = 0 if len(text_chunk) == 0 else metadata_font.getsize(text_chunk)[0]
  File "/usr/local/lib/python3.7/site-packages/PIL/ImageFont.py", line 112, in getsize
    return self.font.getsize(text)
UnicodeEncodeError: 'latin-1' codec can't encode character '\u4f60' in position 0: ordinal not in range(256)

amietn commented 5 years ago

I can't reproduce it on my system but I implemented a potential fix in https://github.com/amietn/vcsi/commit/7dd1d01ed30e882db8add3e81cad66fee9222503

Can you try again with the latest git master and let me know if it fixes it?

Teng commented 5 years ago

Unfortunately, it still fails. But I'm pretty sure the error is caused by the Chinese characters in the file name, because when I renamed the same file to an English name, it worked properly.

I have a video called 你好.mp4, and when I renamed it to Hello.mp4, it worked.

The '\u4f60' in the picture is the escaped form of 你, that is, the first character of the Chinese file name.

image
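
As a small illustration of the point above, the following sketch shows that '\u4f60' is only an escaped spelling of 你, and that its code point lies far outside the 0-255 range mentioned in the error message. It uses only the Python standard library; the variable names are illustrative and do not come from vcsi.

```python
# The escape sequences and the literal characters are the same string.
name = "你好"
same = "\u4f60\u597d" == name

# ascii() shows the escaped spelling that appears in error messages.
escaped = ascii(name)
print(escaped)

# The code point of 你 is 0x4f60, well above the Latin-1 limit of 255.
code_point = ord("你")
print(hex(code_point), code_point > 255)
```

So the error message is pointing at the Chinese characters themselves, not at a corrupted file name.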

I'm not familiar with Python, but this might help you.

https://stackoverflow.com/questions/2688020/how-to-print-chinese-word-in-my-code-using-python

https://stackoverflow.com/questions/35763467/convert-chinese-ascii-string-to-chinese-language-string?rq=1

https://stackoverflow.com/questions/38083376/python-decoding-issue-with-chinese-characters

Thank you for your precious time. Best regards.

amietn commented 5 years ago

@Teng I'm trying to set up my system similarly to yours so that I can reproduce this bug. Can you provide me with the output of the following Python code?

#!/usr/bin/env python3

import locale
import sys

loc = locale.getlocale()
print(loc)

print("stdin:", sys.stdin.encoding)
print("stdout:", sys.stdout.encoding)
print("default:", sys.getdefaultencoding())

Also could you tell me what version of Python you are using?

Teng commented 5 years ago

Hey, it outputs:

('zh_CN', 'UTF-8')
stdin: UTF-8
stdout: UTF-8
default: utf-8

amietn commented 5 years ago

Okay, I believe this happens because the default system font does not support Unicode characters. Let's see if this theory is correct.
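
To make the theory concrete: the traceback ends in a 'latin-1' codec error raised from PIL's `getsize`, which suggests (this is an inference from the traceback, not something verified against Pillow's source) that the default font only accepts text that fits in Latin-1. The sketch below reproduces that encoding step with the standard library only. The helper names `fits_latin1` and `latin1_error` are hypothetical and are not part of vcsi or Pillow.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character of text can be encoded as Latin-1."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


def latin1_error(text: str) -> str:
    """Return the UnicodeEncodeError message for text, or '' if it encodes cleanly."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError as exc:
        return str(exc)
    return ""


for name in ("Hello.mp4", "你好_世界.mp4"):
    status = "Latin-1 safe" if fits_latin1(name) else "needs a Unicode font"
    print(name, "->", status)

# A single-character chunk such as "你" produces the same kind of message
# as the last line of the traceback.
print(latin1_error("你"))
```

This would also explain why renaming the file to Hello.mp4 avoids the crash: the metadata text then contains only Latin-1 characters.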

Can you try running vcsi with a TTF font that supports Unicode and let me know if that changes anything? For example:

$ vcsi 你好_世界.mp4 \
-t \
-w 850 \
-g 3x5 \
--end-delay-percent 20 \
-o output.png \
--timestamp-font /Library/Fonts/SOME_FONT_WITH_UNICODE_SUPPORT.ttf \
--metadata-font /Library/Fonts/SOME_FONT_WITH_UNICODE_SUPPORT.ttf

Teng commented 5 years ago

@amietn Hey, that's great, it's working. Thank you for your time.

 $ vcsi 你好_世界.mp4 \
-t \
-w 850 \
-g 3x5 \
--end-delay-percent 20 \
-o output.png \
--timestamp-font /Library/Fonts/Arial_Unicode.ttf \
--metadata-font /Library/Fonts/Arial_Unicode.ttf
Processing 你好_世界.mp4...
Sampling... 15/15
Composing contact sheet...
Cleaning up temporary files...

amietn commented 5 years ago

You're welcome :)

I will change vcsi's behavior so that it tries to load Arial_Unicode.ttf first before falling back to the default font.
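
A minimal sketch of what such a fallback could look like, not the code that ended up in vcsi: it picks the first font file that exists from a list of candidates and reports when none is found. The function name `pick_font_path` and the candidate list are assumptions made for illustration. In real use, the chosen path would be passed to Pillow's `ImageFont.truetype(path, size)`, and `ImageFont.load_default()` would be used when nothing is found.

```python
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical candidate list, tried in order. The paths are examples only.
CANDIDATE_FONTS = [
    "/Library/Fonts/Arial Unicode.ttf",
    "/Library/Fonts/Arial_Unicode.ttf",
]


def pick_font_path(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate that exists as a regular file, or None."""
    for candidate in candidates:
        if Path(candidate).is_file():
            return candidate
    return None


font_path = pick_font_path(CANDIDATE_FONTS)
if font_path is None:
    print("No Unicode font found, falling back to the default font")
else:
    print("Using", font_path)
```

Checking for the file up front keeps the fallback explicit, so a missing font degrades to the old behavior instead of raising an error.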

Also, I don't have a Mac but it would be interesting to know which system fonts are installed by default. Could you provide me with the output of the following command?

$ ls -l /Library/Fonts

Thank you again!

Teng commented 5 years ago

$ ls -l /Library/Fonts
total 252328
-rw-r--r--  1 root  wheel    131264  8 18  2018 Al Nile.ttc
-rw-r--r--  1 root  wheel     79748  8 18  2018 Al Tarikh.ttc
-rw-r--r--  1 root  wheel    188132  8 18  2018 AlBayan.ttc
-rw-r--r--  1 root  wheel   2440176 10 25 06:44 AmericanTypewriter.ttc
-rw-r--r--  1 root  wheel    109700 10 25 06:44 Andale Mono.ttf
-rw-r--r--  1 root  wheel    336452 10 25 06:44 Apple Chancery.ttf
-rw-r--r--  1 root  wheel  15255648  7 13  2017 AppleGothic.ttf
-rw-r--r--  1 root  wheel  18760352  6 18  2017 AppleMyungjo.ttf
-rw-r--r--  1 root  wheel    122556 10 25 06:44 Arial Black.ttf
-rw-r--r--  1 root  wheel    558672 10 25 06:44 Arial Bold Italic.ttf
-rw-r--r--  1 root  wheel    750984 10 25 06:44 Arial Bold.ttf
-rw-r--r--  1 root  wheel    553284 10 25 06:44 Arial Italic.ttf
-rw-r--r--  1 root  wheel    183932 10 25 06:44 Arial Narrow Bold Italic.ttf
-rw-r--r--  1 root  wheel    184420 10 25 06:44 Arial Narrow Bold.ttf
-rw-r--r--  1 root  wheel    184944 10 25 06:44 Arial Narrow Italic.ttf
-rw-r--r--  1 root  wheel    179492 10 25 06:44 Arial Narrow.ttf
-rw-r--r--  1 root  wheel     49296 10 25 06:44 Arial Rounded Bold.ttf
-rw-r--r--  1 root  wheel  23278008 10 25 06:44 Arial Unicode.ttf
-rw-r--r--  1 root  wheel    773236 10 25 06:44 Arial.ttf
-rw-r--r--  1 root  wheel   2325812 10 25 06:44 Athelas.ttc
-rw-r--r--  1 root  wheel    307268  8 18  2018 Ayuthaya.ttf
-rw-r--r--  1 root  wheel    109492  8 18  2018 Baghdad.ttc
-rw-r--r--  1 root  wheel    316524  8 18  2018 Bangla MN.ttc
-rw-r--r--  1 root  wheel    439724  8 18  2018 Bangla Sangam MN.ttc
-rw-r--r--  1 root  wheel   1371312 10 25 06:44 Baskerville.ttc
-rw-r--r--  1 root  wheel     81672  8 18  2018 Beirut.ttc
-rw-r--r--  1 root  wheel    223272 10 25 06:44 BigCaslon.ttf
-rw-r--r--  1 root  wheel    238724 10 25 06:44 Bodoni 72 OS.ttc
-rw-r--r--  1 root  wheel     82788 10 25 06:44 Bodoni 72 Smallcaps Book.ttf
-rw-r--r--  1 root  wheel    427652 10 25 06:44 Bodoni 72.ttc
-rw-r--r--  1 root  wheel     73972 10 25 06:44 Bodoni Ornaments.ttf
-rw-r--r--  1 root  wheel    596236 10 25 06:44 Bradley Hand Bold.ttf
-rw-r--r--  1 root  wheel    222640 10 25 06:44 Brush Script.ttf
-rw-r--r--  1 root  wheel    176872 10 25 06:44 Chalkboard.ttc
-rw-r--r--  1 root  wheel    536756 10 25 06:44 ChalkboardSE.ttc
-rw-r--r--  1 root  wheel    486860 10 25 06:44 Chalkduster.ttf
-rw-r--r--  1 root  wheel   2118332 10 25 06:44 Charter.ttc
-rw-r--r--  1 root  wheel   1240320 10 25 06:44 Cochin.ttc
-rw-r--r--  1 root  wheel    120120 10 25 06:44 Comic Sans MS Bold.ttf
-rw-r--r--  1 root  wheel    135484 10 25 06:44 Comic Sans MS.ttf
-rw-r--r--  1 root  wheel   1266824 10 25 06:44 Copperplate.ttc
-rw-r--r--  1 root  wheel     76120  8 18  2018 Corsiva.ttc
-rw-r--r--  1 root  wheel    506640 10 25 06:44 Courier New Bold Italic.ttf
-rw-r--r--  1 root  wheel    691796 10 25 06:44 Courier New Bold.ttf
-rw-r--r--  1 root  wheel    589900 10 25 06:44 Courier New Italic.ttf
-rw-r--r--  1 root  wheel    684624 10 25 06:44 Courier New.ttf
-rw-r--r--  1 root  wheel     76716 10 25 06:44 DIN Alternate Bold.ttf
-rw-r--r--  1 root  wheel    211528 10 25 06:44 DIN Condensed Bold.ttf
-rw-r--r--  1 root  wheel   3027948  8 18  2018 Damascus.ttc
-rw-r--r--  1 root  wheel     97544  8 18  2018 DecoTypeNaskh.ttc
-rw-r--r--  1 root  wheel   1027920  8 18  2018 Devanagari Sangam MN.ttc
-rw-r--r--  1 root  wheel    409696  8 18  2018 DevanagariMT.ttc
-rw-r--r--  1 root  wheel    722884 10 25 06:44 Didot.ttc
-rw-r--r--  1 root  wheel     93724  8 18  2018 Diwan Kufi.ttc
-rw-r--r--  1 root  wheel   1319880  8 18  2018 Diwan Thuluth.ttf
-rw-r--r--  1 root  wheel    534712  8 18  2018 EuphemiaCAS.ttc
-rw-r--r--  1 root  wheel     91112  8 18  2018 Farah.ttc
-rw-r--r--  1 root  wheel    281512  8 18  2018 Farisi.ttf
-rw-r--r--  1 root  wheel    486592 10 25 06:44 Futura.ttc
-rw-r--r--  1 root  wheel    385768 10 25 06:44 Georgia Bold Italic.ttf
-rw-r--r--  1 root  wheel    363800 10 25 06:44 Georgia Bold.ttf
-rw-r--r--  1 root  wheel    400624 10 25 06:44 Georgia Italic.ttf
-rw-r--r--  1 root  wheel    379588 10 25 06:44 Georgia.ttf
-rw-r--r--  1 root  wheel   1251924 10 25 06:44 GillSans.ttc
-rw-r--r--  1 root  wheel    543228  8 18  2018 Gujarati Sangam MN.ttc
-rw-r--r--  1 root  wheel    384332  8 18  2018 GujaratiMT.ttc
-rw-r--r--  1 root  wheel    201932  8 18  2018 Gurmukhi MN.ttc
-rw-r--r--  1 root  wheel    143344  8 18  2018 Gurmukhi Sangam MN.ttc
-rw-r--r--  1 root  wheel     65540  8 18  2018 Gurmukhi.ttf
-rw-r--r--  1 root  wheel    111124 10 25 06:44 Herculanum.ttf
-rw-r--r--  1 root  wheel     43312 10 25 06:44 Hoefler Text Ornaments.ttf
-rw-r--r--  1 root  wheel   1645292 10 25 06:44 Hoefler Text.ttc
-rw-r--r--  1 root  wheel    817376  8 18  2018 ITFDevanagari.ttc
-rw-r--r--  1 root  wheel    138488 10 25 06:44 Impact.ttf
-rw-r--r--  1 root  wheel    457432  8 18  2018 InaiMathi-MN.ttc
-rw-r--r--  1 root  wheel   3414732 10 25 06:44 Iowan Old Style.ttc
-rw-r--r--  1 root  wheel   1178516  8 18  2018 Kailasa.ttc
-rw-r--r--  1 root  wheel    342980  8 18  2018 Kannada MN.ttc
-rw-r--r--  1 root  wheel    296796  8 18  2018 Kannada Sangam MN.ttc
-rw-r--r--  1 root  wheel    394040  8 18  2018 Kefa.ttc
-rw-r--r--  1 root  wheel    495056  8 18  2018 Khmer MN.ttc
-rw-r--r--  1 root  wheel    188048  8 18  2018 Khmer Sangam MN.ttf
-rw-r--r--  1 root  wheel    676076  8 18  2018 Kokonor.ttf
-rw-r--r--  1 root  wheel    190152  8 18  2018 Krungthep.ttf
-rw-r--r--  1 root  wheel     57728  8 18  2018 KufiStandardGK.ttc
-rw-r--r--  1 root  wheel     96348  8 18  2018 Lao MN.ttc
-rw-r--r--  1 root  wheel     58092  8 18  2018 Lao Sangam MN.ttf
-rw-r--r--  1 root  wheel    557856 10 25 06:44 Luminari.ttf
-rw-r--r--  1 root  wheel    175056  8 18  2018 Malayalam MN.ttc
-rw-r--r--  1 root  wheel    185216  8 18  2018 Malayalam Sangam MN.ttc
-rw-r--r--  1 root  wheel    927292 10 25 06:44 Marion.ttc
-rw-r--r--  1 root  wheel    652248 10 25 06:44 Microsoft Sans Serif.ttf
-rw-r--r--  1 root  wheel    649384  8 18  2018 Mishafi Gold.ttf
-rw-r--r--  1 root  wheel   1426864  8 18  2018 Mishafi.ttf
-rw-r--r--  1 root  wheel    176224  8 18  2018 Mshtakan.ttc
-rw-r--r--  1 root  wheel    207352  8 18  2018 Muna.ttc
-rw-r--r--  1 root  wheel    242060  8 18  2018 Myanmar MN.ttc
-rw-r--r--  1 root  wheel    193720  8 18  2018 Myanmar Sangam MN.ttc
-rw-r--r--  1 root  wheel   7110652  8 18  2018 NISC18030.ttf
-rw-r--r--  1 root  wheel     62716  8 18  2018 Nadeem.ttc
-rw-r--r--  1 root  wheel    157392  8 18  2018 NewPeninimMT.ttc
-rw-r--r--  1 root  wheel    214204  8 18  2018 Oriya MN.ttc
-rw-r--r--  1 root  wheel    180460  8 18  2018 Oriya Sangam MN.ttc
-rw-r--r--  1 root  wheel    315876 10 25 06:44 PTMono.ttc
-rw-r--r--  1 root  wheel   2794628  8 18  2018 PTSans.ttc
-rw-r--r--  1 root  wheel   1164836 10 25 06:44 PTSerif.ttc
-rw-r--r--  1 root  wheel    706700 10 25 06:44 PTSerifCaption.ttc
-rw-r--r--  1 root  wheel   1110432 10 25 06:44 Papyrus.ttc
-rw-r--r--  1 root  wheel    230964 10 25 06:44 Phosphate.ttc
-rw-r--r--  1 root  wheel     75012  8 18  2018 PlantagenetCherokee.ttf
-rw-r--r--  1 root  wheel     66456  8 18  2018 Raanana.ttc
-rw-r--r--  1 root  wheel    487456 10 25 06:44 Rockwell.ttc
-rw-r--r--  1 root  wheel    369156  4 13  2018 STIXGeneral.otf
-rw-r--r--  1 root  wheel    186612  4 13  2018 STIXGeneralBol.otf
-rw-r--r--  1 root  wheel    143764  4 13  2018 STIXGeneralBolIta.otf
-rw-r--r--  1 root  wheel    141844  4 13  2018 STIXGeneralItalic.otf
-rw-r--r--  1 root  wheel     16992  4 13  2018 STIXIntDBol.otf
-rw-r--r--  1 root  wheel     17500  4 13  2018 STIXIntDReg.otf
-rw-r--r--  1 root  wheel     16608  4 13  2018 STIXIntSmBol.otf
-rw-r--r--  1 root  wheel     17416  4 13  2018 STIXIntSmReg.otf
-rw-r--r--  1 root  wheel     16576  4 13  2018 STIXIntUpBol.otf
-rw-r--r--  1 root  wheel     16932  4 13  2018 STIXIntUpDBol.otf
-rw-r--r--  1 root  wheel     16964  4 13  2018 STIXIntUpDReg.otf
-rw-r--r--  1 root  wheel     16856  4 13  2018 STIXIntUpReg.otf
-rw-r--r--  1 root  wheel     16916  4 13  2018 STIXIntUpSmBol.otf
-rw-r--r--  1 root  wheel     17184  4 13  2018 STIXIntUpSmReg.otf
-rw-r--r--  1 root  wheel     54556  4 13  2018 STIXNonUni.otf
-rw-r--r--  1 root  wheel     25508  4 13  2018 STIXNonUniBol.otf
-rw-r--r--  1 root  wheel     41188  4 13  2018 STIXNonUniBolIta.otf
-rw-r--r--  1 root  wheel     46268  4 13  2018 STIXNonUniIta.otf
-rw-r--r--  1 root  wheel     13464  4 13  2018 STIXSizFiveSymReg.otf
-rw-r--r--  1 root  wheel     12628  4 13  2018 STIXSizFourSymBol.otf
-rw-r--r--  1 root  wheel     15656  4 13  2018 STIXSizFourSymReg.otf
-rw-r--r--  1 root  wheel     12924  4 13  2018 STIXSizOneSymBol.otf
-rw-r--r--  1 root  wheel     19004  4 13  2018 STIXSizOneSymReg.otf
-rw-r--r--  1 root  wheel     12636  4 13  2018 STIXSizThreeSymBol.otf
-rw-r--r--  1 root  wheel     15648  4 13  2018 STIXSizThreeSymReg.otf
-rw-r--r--  1 root  wheel     12576  4 13  2018 STIXSizTwoSymBol.otf
-rw-r--r--  1 root  wheel     15620  4 13  2018 STIXSizTwoSymReg.otf
-rw-r--r--  1 root  wheel     21776  4 13  2018 STIXVar.otf
-rw-r--r--  1 root  wheel     14808  4 13  2018 STIXVarBol.otf
-rw-r--r--  1 root  wheel     77604  8 18  2018 Sana.ttc
-rw-r--r--  1 root  wheel    417676  8 18  2018 Sathu.ttf
-rw-r--r--  1 root  wheel    109256 10 25 06:44 Savoye LET.ttc
-rw-r--r--  1 root  wheel   6253688 10 25 06:44 Seravek.ttc
-rw-r--r--  1 root  wheel    531372  8 18  2018 Shree714.ttc
-rw-r--r--  1 root  wheel    391376 10 25 06:44 SignPainter.ttc
-rw-r--r--  1 root  wheel    291508  8 18  2018 Silom.ttf
-rw-r--r--  1 root  wheel    780240  8 18  2018 Sinhala MN.ttc
-rw-r--r--  1 root  wheel    757768  8 18  2018 Sinhala Sangam MN.ttc
-rw-r--r--  1 root  wheel    489984 10 25 06:44 Skia.ttf
-rw-r--r--  1 root  wheel   3599752 10 25 06:44 SnellRoundhand.ttc
-rw-r--r--  1 root  wheel  66933252  8 18  2018 Songti.ttc
-rw-r--r--  1 root  wheel    642396  8 18  2018 SukhumvitSet.ttc
-rw-r--r--  1 root  wheel   2712224 10 25 06:44 SuperClarendon.ttc
-rw-r--r--  1 root  wheel    626928 10 25 06:44 Tahoma Bold.ttf
-rw-r--r--  1 root  wheel    681120 10 25 06:44 Tahoma.ttf
-rw-r--r--  1 root  wheel    188700  8 18  2018 Tamil MN.ttc
-rw-r--r--  1 root  wheel    168628  8 18  2018 Tamil Sangam MN.ttc
-rw-r--r--  1 root  wheel    333016  8 18  2018 Telugu MN.ttc
-rw-r--r--  1 root  wheel    305548  8 18  2018 Telugu Sangam MN.ttc
-rw-r--r--  1 root  wheel    620008 10 25 06:44 Times New Roman Bold Italic.ttf
-rw-r--r--  1 root  wheel    842168 10 25 06:44 Times New Roman Bold.ttf
-rw-r--r--  1 root  wheel    660268 10 25 06:44 Times New Roman Italic.ttf
-rw-r--r--  1 root  wheel    834452 10 25 06:44 Times New Roman.ttf
-rw-r--r--  1 root  wheel    932128 10 25 06:44 Trattatello.ttf
-rw-r--r--  1 root  wheel    137008 10 25 06:44 Trebuchet MS Bold Italic.ttf
-rw-r--r--  1 root  wheel    129360 10 25 06:44 Trebuchet MS Bold.ttf
-rw-r--r--  1 root  wheel    144820 10 25 06:44 Trebuchet MS Italic.ttf
-rw-r--r--  1 root  wheel    138848 10 25 06:44 Trebuchet MS.ttf
-rw-r--r--  1 root  wheel    173132 10 25 06:44 Verdana Bold Italic.ttf
-rw-r--r--  1 root  wheel    153260 10 25 06:44 Verdana Bold.ttf
-rw-r--r--  1 root  wheel    174612 10 25 06:44 Verdana Italic.ttf
-rw-r--r--  1 root  wheel    186188 10 25 06:44 Verdana.ttf
-rw-r--r--  1 root  wheel    370732  8 18  2018 Waseem.ttc
-rw-r--r--  1 root  wheel    124308 10 25 06:44 Webdings.ttf
-rw-r--r--  1 root  wheel     68768 10 25 06:44 Wingdings 2.ttf
-rw-r--r--  1 root  wheel     38308 10 25 06:44 Wingdings 3.ttf
-rw-r--r--  1 root  wheel     86384 10 25 06:44 Wingdings.ttf
-rw-r--r--  1 root  wheel    403264 10 25 06:44 Zapfino.ttf

amietn commented 5 years ago

Much appreciated. Thanks!

Teng commented 5 years ago

I used /Library/Fonts/Arial_Unicode.ttf, which is called Arial Unicode.ttf in /Library/Fonts/.

For the test I just ran, I copied it and renamed the copy to 'Arial_Unicode.ttf'. On a default system, the file is called 'Arial Unicode.ttf'.

I hope this is clear. I am spelling it out to avoid any misunderstanding, because English is not my native language.

amietn commented 5 years ago

No problem. Thank you for clarifying this!

Teng commented 5 years ago

I made a copy of /Library/Fonts in the hope that it helps you.

https://drive.google.com/open?id=13T6kFnUuQG9ko0xtMZ0Tg88b0G5JjiuY

amietn commented 5 years ago

Should be fixed in https://github.com/amietn/vcsi/commit/1a3457b947073285a0b07d0a1d2a356303da943c