opless / phpliteadmin

Automatically exported from code.google.com/p/phpliteadmin

Option to use character encodings other than UTF-8 #8

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Editing a field that contains Spanish accented characters, such as 
¿Le/Te gustaría ir a tomar una copa/café?
2. Saving the change
3.

What is the expected output? What do you see instead?
The data should be displayed as typed above, with the necessary Spanish 
accents. 

Instead it is shown as something like �Le/Te gustar�a ir a un restaurante?

The accented characters are replaced with a � symbol. When I edit and save 
the record the accents get corrupted.

What version of the product are you using? On what operating system?
phpliteadmin_v1-8.1 hosted on Ubuntu server, displaying in FF4 for Windows XP.

In IE8 the accented characters are replaced with a square symbol (like a 
newline character)

Please provide any additional information below.
I'm writing a tool to help people study Spanish. I have been using 
SQLiteStudio to create the sqlite2 database, and PHP has been reading the 
accents out into my HTML just fine. I tried using phpLiteAdmin to make 
modifying the database easier, but it's not handling the accented characters 
well. I have attached the database if you want to have a look at it. The 
table with a lot of accents in it is called Level3. 

As a workaround I changed the HTML at the top as shown below, and it seems to 
work OK now. However, I'm not sure whether doing this will cause any other 
issues with the script:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv='Content-Type' content='text/html; charset=iso-8859-1' />
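Why the workaround helps can be illustrated with a minimal sketch. It assumes the text was stored as ISO-8859-1 bytes and that the mbstring extension is loaded; the variable names are illustrative only and are not taken from phpLiteAdmin.

```php
<?php
// "café" encoded as ISO-8859-1: the é is the single byte 0xE9.
$latin1 = "caf\xE9";

// That byte sequence is not valid UTF-8, so a page declared as UTF-8
// shows a replacement character where the é should be.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)

// Converting to UTF-8 turns 0xE9 into the two bytes 0xC3 0xA9.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(bin2hex($utf8));                      // string(10) "636166c3a9"
```

Declaring the page as ISO-8859-1 makes the browser read the stored bytes as they are, which is why the accents reappear. Converting the data to UTF-8 would be the other way to bring page and data into agreement.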

Original issue reported on code.google.com by steeky...@gmail.com on 12 Apr 2011 at 10:23

Attachments:

GoogleCodeExporter commented 9 years ago
Ps, thanks for a really great script. I have a feeling this will be really 
useful :-)

Original comment by steeky...@gmail.com on 12 Apr 2011 at 10:26

GoogleCodeExporter commented 9 years ago
Definitely interesting. Not quite sure how to fix this... I would think setting 
it to ISO-8859-1 could cause issues for other special characters (though I am 
not knowledgeable in character sets).

I suppose we could add a drop down box to change the character set (both a 
default, and then a per session).

Original comment by ian.aldr...@gmail.com on 13 Apr 2011 at 3:49

GoogleCodeExporter commented 9 years ago
I have the same problem with Romanian characters, like şţăî. To solve this 
I chose ISO-8859-2 (editing the source like steeky did).

You could add a drop-down list where we could choose from different page 
encodings, like UTF-8, ISO-8859-2, ISO-8859-1, Latin 1, etc. phpMyAdmin manages 
to get the encoding for the page from the database encoding, but I don't think 
this is possible with SQLite. You could, for example, test the charset that is 
sent by the Apache server and select it. 

Original comment by cara...@gmail.com on 13 Apr 2011 at 11:04

GoogleCodeExporter commented 9 years ago
Hi, I just had a look to see how phpMyAdmin does it. I previously had no 
problems with characters in it. phpMyAdmin on both the servers I use has 
the following code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">

Tomorrow in work I will play around with this on phpliteadmin to see what 
effect that would have. 

Original comment by steeky...@gmail.com on 13 Apr 2011 at 11:44

GoogleCodeExporter commented 9 years ago
Yeah, it wouldn't really work on SQLite, because there is no specification of 
charset for databases/tables.

I have a few things planned:
- Global charset option.
- Per-database charset option.
- Per-session charset option.

Original comment by ian.aldr...@gmail.com on 14 Apr 2011 at 2:00

GoogleCodeExporter commented 9 years ago
I work a lot with multiple languages (mostly Arabic and German), and in 
the past I had problems viewing non-Latin characters when using the PHP 
preg_match function (also preg_split, preg_replace, preg_match_all). I always 
got results like this (�����).

Later I found out that the preg_* functions need the u modifier (PCRE_UTF8, 
http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php ) when 
working with UTF-8 documents.

Example: preg_match('/müller/ui', $str); 

Since then I can sleep better :D
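
The effect of the u modifier can be seen in a small sketch. The string is written with explicit byte escapes so the result does not depend on how the source file is saved; the pattern is illustrative only.

```php
<?php
$str = "M\xC3\xBCller"; // "Müller" as UTF-8 bytes

// Without /u the pattern works on bytes: "." matches only the first byte of "ü".
preg_match('/^M(.)/', $str, $bytewise);
var_dump(strlen($bytewise[1])); // int(1), half of a character

// With /u the pattern works on code points: "." matches the whole "ü".
preg_match('/^M(.)/u', $str, $unicode);
var_dump(strlen($unicode[1]));  // int(2), the complete two-byte "ü"
```

Half a character like the one captured in the first call is exactly the kind of fragment that a browser renders as a replacement character.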

Original comment by teryaki1...@googlemail.com on 3 Nov 2012 at 12:49

GoogleCodeExporter commented 9 years ago
I think the current version does not have as many problems with special 
characters as the version the original poster used back in 2011.
phpLiteAdmin now uses UTF-8 for the frontend. In the earlier posts, meta 
http-equiv examples were posted. In 1.9.3, I added a real HTTP header sent by 
phpLiteAdmin to set the charset. This works a lot better with any browser.
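
Sending the charset in a real HTTP header might look roughly like the sketch below. This is a hypothetical stand-alone page, not phpLiteAdmin's actual code, and it assumes the source file itself is saved as UTF-8.

```php
<?php
// Hypothetical stand-alone page, not phpLiteAdmin's actual code.
// Assumes this source file itself is saved as UTF-8.
$body = '<p>¿Le/Te gustaría ir a tomar una copa/café?</p>';

// The header must be sent before any output; browsers give it precedence
// over a <meta http-equiv> tag inside the document.
header('Content-Type: text/html; charset=utf-8');

echo '<!DOCTYPE html><html><head><title>charset demo</title></head><body>';
echo $body;
echo '</body></html>';
```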

I just tried to create a table with the Spanish text posted (¿Le/Te gustaría 
ir a tomar una copa/café?). The table, column, default value and actual value 
all have this text. I could create this easily without any errors and the result 
is as expected. See screenshot.

It would be good if somebody could post an example of something that does not 
work at the moment.

@Teryaki: Very good point. We use preg_* in some places (mostly in alterTable) 
and this might be a problem. But I could not produce any problem here either. 
I can easily rename the column with a name containing special characters.

Of course we have a problem if the text inside the DB is not UTF-8, e.g. 
because saved in there by another application. Maybe we should try this and see 
if we can do something about this, e.g. allow the user to select the charset 
per-database (as ian had planned).

Original comment by crazy4ch...@gmail.com on 3 Nov 2012 at 3:24

Attachments:

GoogleCodeExporter commented 9 years ago
By the way: for full UTF-8 support, we need to check not only preg_* but also 
every other string function. PHP nowadays has a lot of multibyte functions for 
UTF-8: http://php.net/manual/en/book.mbstring.php
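
The difference between the byte-oriented string functions and their mb_* counterparts can be sketched as follows, assuming the mbstring extension is loaded.

```php
<?php
$text = "caf\xC3\xA9"; // "café" as UTF-8 bytes

// strlen() counts bytes, mb_strlen() counts characters.
var_dump(strlen($text));             // int(5)
var_dump(mb_strlen($text, 'UTF-8')); // int(4)

// substr() can cut a multibyte character in half; mb_substr() cannot.
$cut = substr($text, 0, 4);
var_dump(bin2hex($cut));             // string(8) "636166c3"
$safe = mb_substr($text, 0, 4, 'UTF-8');
var_dump($safe === $text);           // bool(true)
```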

Original comment by crazy4ch...@gmail.com on 3 Nov 2012 at 3:44

GoogleCodeExporter commented 9 years ago
-- I think the current version does not have that many problems with special 
characters --

Yep, I didn't notice any errors with non-Latin characters in this version.

-- e.g. allow the user to select the charset per-database (as ian had planned)--

I'm not sure if we really need this for SQLite (compared to MySQL).

--There are a lot of multibyte-Functions in php for UTF8 nowadays--

Yeah, I tried it a few months ago, but I noticed that many servers 
unfortunately do not support this extension (mbstring is a non-default 
extension: http://www.php.net/manual/en/mbstring.installation.php ).
Anyway, it's one of the best solutions, and one can check for it with extension_loaded().
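
A guarded call could look like the sketch below. The helper name utf8_length is hypothetical, and the regex fallback is just one possible substitute for mb_strlen.

```php
<?php
// Hypothetical helper: count the characters of a UTF-8 string
// whether or not the mbstring extension is available.
function utf8_length(string $s): int
{
    if (extension_loaded('mbstring')) {
        return mb_strlen($s, 'UTF-8');
    }
    // Fallback: with /u, "." matches one code point, so count the matches.
    // The s modifier lets "." match newlines as well.
    return (int) preg_match_all('/./us', $s);
}

var_dump(utf8_length("caf\xC3\xA9")); // int(4)
```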

Greeting

Original comment by teryaki1...@googlemail.com on 3 Nov 2012 at 9:41

GoogleCodeExporter commented 9 years ago
I thought mbstring came with recent PHP by default, but it seems I was wrong. 
That's really a shame. Looks like I need to change things in another open-source 
script I wrote (CrazyStat, http://en.christosoft.de/CrazyStat ), because I used 
lots of mb_* there in the last version without checking whether it's available 
by default. Thanks for making me aware of this. (But nobody has complained yet, so 
I guess lots of people have it enabled.)

I found out that in SQLite there is a charset defined per database that cannot 
be changed after creation of the db. You can find it out using
PRAGMA encoding;
(Note: In phpLiteAdmin 1.9.3, there is a bug that will not show the result of 
this if you run it on the db-level. Either run it on the table-level or use the 
1.9.4 development version which I just created and where this is fixed)
But there seems to be only UTF-8 and UTF-16 (variants) possible:
http://www.sqlite.org/pragma.html#pragma_encoding
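
From PHP, the pragma can be read with a few lines. This is a minimal sketch that assumes the pdo_sqlite extension and uses a throwaway in-memory SQLite3 database rather than a real file.

```php
<?php
// Assumes the pdo_sqlite extension is available.
// Uses a throwaway in-memory SQLite3 database for illustration.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// PRAGMA encoding returns a single row with a single column.
$encoding = $db->query('PRAGMA encoding')->fetchColumn();
var_dump($encoding); // string(5) "UTF-8" for a database created with default settings
```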

This does not seem to work with SQLite2. I guess SQLite2 did not support UTF-8 
(see http://www.sqlite.org/version3.html ).

But of course, setting a charset for a DB is one thing. Inserting data in 
another charset is another thing which I guess is still possible:

"SQLite is not particular about the text it receives and is more than happy to 
process text strings that are not normalized or even well-formed UTF-8 or 
UTF-16. Thus, programmers who want to store IS08859 data can do so using the 
UTF-8 interfaces. As long as no attempts are made to use a UTF-16 collating 
sequence or SQL function, the byte sequence of the text will not be modified in 
any way. "

So apart from what "PRAGMA encoding" returns, we might still have ISO-8859-1 
data in a SQLite3 database. This might cause us some problems.
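
The situation can be reproduced with a small sketch, assuming the pdo_sqlite extension. The table name phrases is made up for the example.

```php
<?php
// Assumes the pdo_sqlite extension; in-memory database for illustration only.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE phrases (txt TEXT)');

// Simulate another application storing "café" as ISO-8859-1 (é = 0xE9).
$stmt = $db->prepare('INSERT INTO phrases (txt) VALUES (?)');
$stmt->execute(["caf\xE9"]);

// The database reports UTF-8, yet the stored bytes come back untouched.
$raw = $db->query('SELECT txt FROM phrases')->fetchColumn();
var_dump(bin2hex($raw)); // string(8) "636166e9"
```

So the result of PRAGMA encoding alone does not tell us how the text in a given database is actually encoded.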

I guess we really need to allow the user to set a charset manually.

Original comment by crazy4ch...@gmail.com on 4 Nov 2012 at 9:03

GoogleCodeExporter commented 9 years ago
OK, I was only afraid of getting lost in the encoding world of the old MySQL 
days, which is one of the reasons why SQLite became my database of choice 
(simplicity!).

I'm just a bit careful with PRAGMA statements within phpLiteAdmin's code. 
As the SQLite documentation says:
    “Specific pragma statements may be removed and others added in future releases of SQLite. There is no guarantee of backwards compatibility.”
And some of the statements have already been deprecated (see the list of pragmas). 

As I said, I'm not sure about it, but before I add anything that might 
change in the future, I would prefer to make a list of all special statements 
and offer them as a plugin.

I also won't bother too much with SQLite2 (maybe a little selfish of me) 
because of the big differences from v3, not only in encoding but also in BLOBs 
etc. That's why I don't use it in havalite, and I believe that in 1 or 2 years 
it will be done for (just silly thoughts :D ).

Anyway, if you think it's important to add such functionality, there is 
nothing to lose and maybe we will benefit from it.

cheers

Original comment by teryaki1...@googlemail.com on 5 Nov 2012 at 1:52

GoogleCodeExporter commented 9 years ago
From the attached file:
** This file contains an SQLite 2.1 database **

Probably you just need to upgrade to SQLite3 and your issue will go away?

Original comment by ykoro...@gmail.com on 7 Nov 2013 at 4:12

GoogleCodeExporter commented 9 years ago
We don't really have problems with special characters anymore, we consistently 
use UTF-8 and this works nicely.
The remaining issue is that users might still have data with other character 
sets in their db (because some other program inserted it in there).

So we should add an option to change the character encoding in phpLiteAdmin. I 
think per-database makes most sense.
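
One way such a per-database option could work is sketched below. It assumes the mbstring extension; the file names, the $dbCharsets array and both helper functions are hypothetical and do not exist in phpLiteAdmin.

```php
<?php
// Hypothetical per-database setting; the file names are illustrative only.
$dbCharsets = ['spanish.db' => 'ISO-8859-1', 'notes.db' => 'UTF-8'];

// Convert a value read from the database to UTF-8 for the frontend.
function toDisplay(string $value, string $dbCharset): string
{
    return $dbCharset === 'UTF-8'
        ? $value
        : mb_convert_encoding($value, 'UTF-8', $dbCharset);
}

// Convert a value submitted by the UTF-8 frontend back to the database charset.
function toStorage(string $value, string $dbCharset): string
{
    return $dbCharset === 'UTF-8'
        ? $value
        : mb_convert_encoding($value, $dbCharset, 'UTF-8');
}

$shown = toDisplay("caf\xE9", $dbCharsets['spanish.db']);
var_dump(bin2hex($shown)); // string(10) "636166c3a9"
var_dump(toStorage($shown, $dbCharsets['spanish.db']) === "caf\xE9"); // bool(true)
```

Converting in both directions keeps the frontend consistently UTF-8 while leaving the stored bytes in the charset the other application expects.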

Original comment by crazy4ch...@gmail.com on 15 Jan 2014 at 2:25