Japanese text in PrettyTable

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Enter Asian characters (Japanese) in UTF-8

What is the expected output? What do you see instead?
Expected output = Asian Text
But I see question marks like the following
+----+----+
| ?? | ?? |
+----+----+
| ?? | ?? |
+----+----+

What version of the product are you using? On what operating system?
Latest

Please provide any additional information below.

Original issue reported on code.google.com by kevincob...@gmail.com on 23 May 2012 at 6:50

Attachments:

temp.py

GoogleCodeExporter commented 9 years ago

Thanks for the report.  Can I ask what environment you are trying to do this
in?  E.g. which operating system, and in what program (IDLE, an X11 terminal
emulator, DOS prompt, etc.)?  I ask because I suspect this problem has
something to do with the unicode capabilities of wherever you are running the
code, rather than PrettyTable itself.  I have tried putting Japanese text into
PrettyTable myself to make sure it worked (I even used the kanji for Tokyo like
your example, actually!), and found that the results varied depending on where I
tried it.  I know it worked perfectly for me in Python 3 IDLE but not in Python
2 IDLE (I forget which OS this was).  I've tried it from the Xfce terminal on
Linux and I got question marks too, even though Xfce's terminal is supposed to
support unicode.  I suppose it might also be a font issue.

Original comment by luke@maurits.id.au on 27 May 2012 at 12:11
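
A minimal Python 3 sketch of one way question marks like these can arise, offered as an illustration rather than a confirmed diagnosis of this report. When text is encoded for an output that cannot represent it and the `replace` error handler is used, each unencodable character becomes `?`. The ASCII target encoding and the variable names are assumptions made for the example.

```python
import sys

# Two kanji (Tokyo), as mentioned in the comment above.
text = "東京"

# Encoding to a codec that cannot represent these characters, with
# errors="replace", substitutes one "?" per unencodable character.
replaced = text.encode("ascii", errors="replace")
print(replaced)  # b'??'

# The encoding Python uses when printing to this output stream.
# It can be None when stdout is not a regular text stream.
print(getattr(sys.stdout, "encoding", None))
```

If the second line reports something other than a UTF-8 encoding, the terminal or its locale settings are a likely place to look first.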

GoogleCodeExporter commented 9 years ago

I tried several systems and OSes with an X11 terminal. The 1st and 2nd systems
were accessed over ssh from the 3rd system using iterm2.app.
1st OS is
-------- 
shell   /bin/tcsh
system  SunOS
tcsh    6.12.00
term    xterm
2nd OS
------
SHELL=/bin/bash
TERM=xterm
OSTYPE=linux-gnu
uname = Linux

3rd OS
------
On Mac 10.5 & 10.6, with preinstalled python
term = xterm
uname = Darwin

4th
---
I also tried IDLE. In IDLE, instead of printing ``?`` it prints nothing, just
the border.

Thanks for the reply. Any pointers on the font issue or anything else would be
great.
It is such a great utility, not only for tables but also for easy borderless
aligned printing.
Original comment by kevincob...@gmail.com on 27 May 2012 at 2:06

GoogleCodeExporter commented 9 years ago

Hmm, interesting.  Thanks very much for trying those out for me.  I think (but 
I'm not exactly sure) that testing 1 & 2 via ssh from 3 isn't a good way to 
test them, because the output is still ultimately passed through iterm2.app.  
However, from some Googling it looks like iterm2.app should *definitely* be 
able to handle Japanese character output.  Maybe it really is a problem with 
PrettyTable and not your terminal.  I'll look into it further and see if I can 
figure out what's gone wrong, but I won't be able to look at it until this 
coming weekend as I'm very busy with work stuff until then.  Hopefully we can 
get this to work. :)

Original comment by luke@maurits.id.au on 31 May 2012 at 2:31

GoogleCodeExporter commented 9 years ago

Thank you mate.

Original comment by kevincob...@gmail.com on 31 May 2012 at 2:33

GoogleCodeExporter commented 9 years ago

Hi Kevin,

Really quick question: if you do "print x.get_string()" instead of "print x", 
does it print the Japanese correctly?

Original comment by luke@maurits.id.au on 3 Jun 2012 at 5:47

GoogleCodeExporter commented 9 years ago

Yes, it does print the Japanese correctly. Moreover, just using print x on a
Python version above 3.2 also prints the Japanese correctly. HOWEVER,

In both of the above cases, the (2-byte) characters don't align properly.
The following is the output.

+-------+-----------+
|    これ |       解きょ |
+-------+-----------+
| これはなに |        です |
+-------+-----------+

PS: Actually, the alignment of Japanese characters (2 byte) has always been 
troublesome in other python libraries as well (eg, textwrap, str.format(), 
rjust, ljust etc..)

Original comment by kevincob...@gmail.com on 3 Jun 2012 at 9:02
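
The misaligned output above is consistent with padding computed from the number of characters rather than from the number of terminal columns they occupy. Below is a minimal Python 3 sketch of column-aware padding, assuming that characters whose East Asian Width is "W" (wide) or "F" (fullwidth) take two columns and everything else takes one. `display_width` and `pad_left` are hypothetical helpers for illustration, not PrettyTable's API and not necessarily how PrettyTable addressed the problem. Combining characters and ambiguous-width characters are ignored here.

```python
import unicodedata


def display_width(text):
    """Estimate how many terminal columns `text` occupies."""
    width = 0
    for ch in text:
        # "W" (wide) and "F" (fullwidth) characters usually take two columns.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


def pad_left(text, columns):
    """Right-align `text` in a field that is `columns` terminal columns wide."""
    return " " * max(columns - display_width(text), 0) + text


# len() counts characters, which undercounts the columns used by kana and kanji.
print(len("これはなに"), display_width("これはなに"))
print("|" + pad_left("これ", 10) + "|")
print("|" + pad_left("これはなに", 10) + "|")
```

In a terminal with a monospaced CJK-capable font, the two printed rows should end at the same column, which padding based on `len()` does not achieve.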

GoogleCodeExporter commented 9 years ago

Moreover, regarding this issue, I asked the same question a long time ago in a
Python forum.
Here is the link to that thread:
http://www.python-forum.org/pythonforum/viewtopic.php?f=3&t=29241

Thanks

Original comment by kevincob...@gmail.com on 3 Jun 2012 at 9:04

GoogleCodeExporter commented 9 years ago

Ah, good!  This is actually a very silly bug, and it's easy to fix.  Tomorrow I 
will release a 0.6.1 version that has this fixed, so that just "print x" will 
work even in Python 2.x.

I noticed the alignment thing myself when I figured this out.  I'm not quite 
sure how to tackle it, but I'll try to figure something out.  Maybe as a 
temporary fix, you could set the horizontal character to something as wide as 
the Japanese characters, like the kanji for "one" or a katakana hyphen?  That 
might look a little better aligned.

Original comment by luke@maurits.id.au on 3 Jun 2012 at 9:23

GoogleCodeExporter commented 9 years ago

Okay, I've released 0.6.1.  You should find that "print x" works just fine in 
this new release, in Python 2.x or 3.x.

Original comment by luke@maurits.id.au on 3 Jun 2012 at 11:10

Changed state: Fixed

GoogleCodeExporter commented 9 years ago

Hi there,

I thought you might like to know that I have made changes in the SVN trunk 
version of PrettyTable that I think should fix your problem with the alignment 
of Japanese characters.  If you could give it a test and let me know if 
everything looks good, I'd really appreciate it.

Original comment by luke@maurits.id.au on 4 Jul 2012 at 1:41

GoogleCodeExporter commented 9 years ago

Thanks for the update both to me and to the source,

I checked it and the alignment is no longer a problem. There is one small thing,
which DOESN'T affect much, but it is the only remaining alignment problem, so I
should tell you about it: it lies in the header. If there are Japanese
characters in the header, the header is not aligned; however, if the header is
in English or even blank, there is no trouble. See the following two examples:

#! /usr/bin/env python
# -*- coding: utf-8 -*-

from prettytable1 import *
# Only this header row doesn't get aligned; not a big issue, as it can be fixed manually.
x = PrettyTable(["学会", "何", "これ"])
x.add_row(["学会学会", "何", "これ"])
x.add_row(["学会", "何何何", "これこれ"])
x.padding_width = 2
x.align = "c"
print x

#! /usr/bin/env python
# -*- coding: utf-8 -*-

from prettytable1 import *
x = PrettyTable(["Wng", "Word", "Kotoba-of"])
x.add_row(["学会学会", "何", "これ"])
x.add_row(["学会", "何何何", "これこれ"])
x.padding_width = 2
x.align = "c"
print x

Original comment by kevincob...@gmail.com on 4 Jul 2012 at 2:14

GoogleCodeExporter commented 9 years ago

Thanks for replying so quickly!  I'm glad it seems to work.  You're right, the 
header is still broken, but that's a very easy fix and I'll make sure it is 
done before the next release (which should be out soon, maybe one week from 
now?).

Original comment by luke@maurits.id.au on 4 Jul 2012 at 2:32

GoogleCodeExporter commented 9 years ago

Hi luke,

I have made a patch for the header bug, following example seems ok:

# -*- coding: utf-8 -*-
import prettytable

x = prettytable.PrettyTable([u'语法', u'操作', u'说明'])
x.add_row([u'set(list1) | set(list2)', u'union', u'包含 list1 和 list2 所有数据的新集合'])
x.add_row([u'set(list1) & set(list2)', u'intersection', u'包含 list1 2 和 list2 中共同元素的新集合'])
x.add_row([u'set(list1) - set(list2)', u'difference', u'在 list1 中出现但不在 list2 中出现的元素的集合'])
x.align = 'r'

open('test.txt', 'w').write(str(x))

will produce:
+-------------------------+--------------+--------------------------------------------------+
|                    语法 |         操作 |                                             说明 |
+-------------------------+--------------+--------------------------------------------------+
| set(list1) | set(list2) |        union |             包含 list1 和 list2 所有数据的新集合 |
| set(list1) & set(list2) | intersection |         包含 list1 2 和 list2 中共同元素的新集合 |
| set(list1) - set(list2) |   difference | 在 list1 中出现但不在 list2 中出现的元素的集合 |
+-------------------------+--------------+--------------------------------------------------+

Original comment by pt0...@gmail.com on 25 Jul 2012 at 10:10

Attachments:

prettytable.py.patch

GoogleCodeExporter commented 9 years ago

There is one more bit I thought I should bring to your notice, along with a
temporary solution.
All the characters in the examples above are half width, hence the output is
beautiful and symmetric.
However, many characters are full width, usually numbers １２３０ and symbols
％＊（）「」. This might not be the case with other languages' character sets.
I used jcconv (pypi) to convert such symbols, like the following:

----------------------------------------------

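# -*- coding: utf-8 -*-
# Assumed imports; the original snippet omitted them.
import jcconv
from prettytable import PrettyTable

lines = u"""今回の参加国・地域のうち、中断経験を   *持つ*   のは１２カ国・地域にのぼる。
戦後日本の長期にわたる繁栄は、国民の九〇％に中産意識を   *持た*   せることができた諸施策に負うところが大で、貧富の差を拡大する
（３）自分の住んでいる地域とのつながりを   *持つ*   。"""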

def w2h(jtext):
    return u''.join([jcconv.wide2half(c).strip() for c in jtext])

x = PrettyTable([u'before', u'tw', u'after'])
for line in lines.splitlines():
    b,t,a = line.split('*')
    b = w2h(b)
    t = w2h(t)
    a = w2h(a)
    x.add_row([b,t,a])

print x

Original comment by kevincob...@gmail.com on 25 Jul 2012 at 12:22
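
For readers without jcconv, here is a minimal Python 3 sketch of a similar conversion using only the standard library. It relies on Unicode NFKC normalization, which maps fullwidth digits and ASCII-like symbols to their ordinary forms. This is a swapped-in technique, not what jcconv does internally, and it is broader than `wide2half`: it also rewrites other compatibility characters, for example turning half-width katakana into full-width katakana. The helper name `to_narrow` is hypothetical.

```python
import unicodedata


def to_narrow(text):
    """Map fullwidth digits and ASCII-like symbols to their ordinary forms via NFKC."""
    return unicodedata.normalize("NFKC", text)


# Fullwidth digits and symbols become plain ASCII.
print(to_narrow("１２３０％＊（）"))

# CJK corner brackets have no narrower compatibility form in NFKC and stay as they are.
print(to_narrow("「」"))

# Caveat: half-width katakana is widened, the opposite direction from wide2half.
print(to_narrow("ｶ"))
```

Whether this side effect is acceptable depends on the data, so it is worth applying to the input before it goes into the table rather than inside the table code.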

GoogleCodeExporter commented 9 years ago

PS the Patch is great thanks

Original comment by kevincob...@gmail.com on 25 Jul 2012 at 12:23

GoogleCodeExporter commented 9 years ago

@pt0079: Thank you very much for the patch, I appreciate your efforts.  
However, I'd actually already fixed the header bug in my working copy of the 
repository but apparently never got around to committing it to trunk.  Very 
sorry about that!  I'll do a commit right now.

@kevincobain2000: I do not think it's right for PrettyTable to be doing things 
like converting １ to 1 etc. for users, if that's what you were suggesting.  
PrettyTable's job is to format *what you give it* in a nice table.  What style 
you want your numbers in etc. is up to each individual user and they should 
apply those changes to their data before entering it into PrettyTable.

Original comment by luke@maurits.id.au on 26 Jul 2012 at 7:57

seckcoder / prettytable

Japanese text in PrettyTable #13