Closed AndreiTS closed 4 years ago
Rust String type is UTF-8 encoded, so the connection needs to be UTF-8 encoded to allow the data from the database to be stored correctly in a Rust String. As such, this library only connects using the UTF-8 charset (it can connect to databases with data stored in other charsets, but they are converted to be sent / received in UTF-8).
I don't think it's working:
Table:
Output:
But, if I change "PAO" to "PÃO" it stops when it reaches this record:
Output:
Code: https://gist.github.com/AndreiTS/cadb9475a8ac2065ca31def23d7fde05
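For context, here is a std-only sketch of why a WIN1252-encoded "Ã" cannot be stored in a Rust String without conversion. The byte values are the standard WIN1252 and UTF-8 encodings of "PÃO"; the helper name `is_valid_utf8` is made up for illustration, and whether this is what actually stops the reading in this report is only an assumption at this point in the thread.

```rust
/// Hypothetical helper: returns true if the bytes form valid UTF-8.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

fn main() {
    // "PÃO" in WIN1252: 'Ã' is the single byte 0xC3.
    let win1252 = [0x50u8, 0xC3, 0x4F];
    // "PÃO" in UTF-8: 'Ã' (U+00C3) is the two bytes 0xC3 0x83.
    let utf8 = [0x50u8, 0xC3, 0x83, 0x4F];

    // In UTF-8, 0xC3 must be followed by a continuation byte,
    // and 0x4F ('O') is not one, so the WIN1252 bytes are rejected.
    println!("win1252 bytes valid UTF-8? {}", is_valid_utf8(&win1252));
    println!("utf8 bytes valid UTF-8?    {}", is_valid_utf8(&utf8));
}
```

"PAO" has only ASCII bytes, which are identical in both encodings, so it would pass either way.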
Can you try to print the error message? I don't see any in the output.
Edit: Or provide the database file for me to test
What Firebird version are you using? Did you set the encoding in the DBeaver configuration?
I've tried writing Pão teste PÃO áã ç using DBeaver with win1252 encoding and was able to read it in Rust.
Firebird 2.5.9
I don't know, maybe something is wrong with the database. I will try with a new database tomorrow
win1252 is different from WIN1252?
Should be the same, not really sure what is wrong
I tested with another database (driver: win1252) and the result is the same:
Table:
It only works if I set the driver to UTF8 and change the text again:
Driver set to UTF8: Fixed text:
The charset of "EMPCADASTRO" is None. If the charset is set to win1252, like this: It works. (encoding is specified to win1252)
I tried to change the charset of other columns but it's not working:
I tried this: https://firebirdsql.org/refdocs/langrefupd25-ddl-charset.html and this command:
ALTER TABLE TB_CLIENTE ALTER COLUMN EMPCADASTRO TYPE VARCHAR(20) CHARACTER SET WIN_1252;
But the charset still is None.
Do you know if there is another way to get this working?
I made some changes to the crate rsfbclient-native (rows.rs) in order to work with columns that have win1252 text but are set to the charset none:
I learned Rust last week and this is a "gambiarra" (a quick hack), maybe you can do something better with it
Result:
rsfbclient 0.9.0:
rsfbclient 0.8.0 with changes above:
You cannot change the charset of numeric columns, because they don't have one. So the charset will always be 'None'
I was trying to change the charset of a varchar column:
@AndreiTS, maybe the problem is your terminal encoding. Can you test by writing this data directly to a text file?
rsfbclient 0.8.0 with my changes: https://gist.github.com/AndreiTS/afc2938868a4e657da2cc92f1c450605 rsfbclient 0.9.0:
The code is the same, I just changed the dependencies
@AndreiTS, I made this test to check some charsets. Can you run it in your environment?
Please post the backtrace here (RUST_BACKTRACE=1 cargo run). I think the problem is in the column name and not in the content.
I modified the test to connect to the same database:
Test with rsfbclient 0.9.0 (no changes): https://gist.github.com/AndreiTS/5bc866b90d9bdff178ff33ce08fe7f7e Test with rsfbclient 0.9.0 (with the changes that I did, they break utf8 text :D): https://gist.github.com/AndreiTS/f9628c0f50092f555825b6e270647ae6
https://gist.github.com/AndreiTS/6f2b9aadf1824b212021798cdf19aa31
I think the problem is in the column name and not in the content.
Forget that. The problem is with the content too.
Test with rsfbclient 0.9.0 (no changes): https://gist.github.com/AndreiTS/5bc866b90d9bdff178ff33ce08fe7f7e
So, with a cast it works fine.
Can you share your database (not all tables) + table schema/DDL?
https://gist.github.com/AndreiTS/51cc172e5b4484b3e7886829b057b271
I ran these tests and they work fine. Do you have access to a Firebird 3 server?
Did you run the test or the code in the gist? Yes, I have access to a fb3 server
Yes. I'm testing your gist, but I'm running it in a Linux environment
I'm using linux too. Distro: Pop OS 20.04
Me too, Pop OS 20.04
I thought it was a Windows environment, because of "CREATE DATABASE 'localhost:D:\usuarios\Administrador5\Desktop\BASE.FDB'".
Can you test your gist against a recreated database (created with your SQL script)?
I copied the database to a Windows Server to export it using IBExpert, but all the code I ran in linux
If you are running Firebird on a Windows server and your client in a Linux environment, maybe there are some conflicts. Some people recommend using ISO88591 instead: https://www.projetoacbr.com.br/forum/topic/29433-lazarus-firebird-caracteres-estranhos/
Can you run a test in a fully Linux environment?
I'm running firebird inside a docker container:
I uploaded the database and the code to google drive to make things easier: https://drive.google.com/drive/folders/1rt1OQcg28cYY6tr5VDtybqQm4eYP6b4H
Now i got it, hehe
For now, you can cast the columns to a UTF8 and this will work. I will investigate what we can do
I think this is a charset/collation problem in the database. I made a new column with UTF8 and when I ran the update, Firebird returned an error:
ALTER TABLE C000017 ADD GRUPOALT2 Varchar(30) CHARACTER SET UTF8;
commit;
UPDATE C000017 a SET a.GRUPOALT2 = CAST(a.GRUPO as varchar(30) CHARACTER SET utf8);
The error:
SQL Message : -104 Invalid token
Engine Code : 335544849
Engine Message : Malformed string
Has anyone ever experienced this? @jairinhohw @juarezr
So, I think I understand the problem.
The problem is that the column you are trying to access has its charset set to 'NONE', so to the Firebird database the column is basically the same as a binary column: it does not know that the column is win1252 encoded, so it cannot convert it to utf8 to send it correctly to the client.
This can be seen when you set the utf8 encoding in DBeaver and got nonsensical data in the print. This would not happen if Firebird knew the charset of the column, as it would convert it to utf8 and it would display fine in DBeaver and Rust.
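To make the point concrete, here is a minimal std-only sketch of what a client would have to do itself when the server hands over raw win1252 bytes from a 'NONE' column. The function name `decode_win1252_lossy` is hypothetical and this is not how rsfbclient handles it. It is also simplified: bytes 0xA0..=0xFF in WIN1252 match the Unicode code points of the same value, but 0x80..=0x9F need a lookup table (the euro sign, curly quotes, and so on), which this sketch replaces with U+FFFD instead.

```rust
/// Hypothetical, simplified decoder for win1252 bytes.
/// ASCII and 0xA0..=0xFF map to the code point with the same value;
/// 0x80..=0x9F would need a real table, so they become U+FFFD here.
fn decode_win1252_lossy(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|&b| match b {
            0x80..=0x9F => '\u{FFFD}',
            _ => char::from(b),
        })
        .collect()
}

fn main() {
    // Raw bytes as they might sit in a 'NONE' column written by a win1252 client.
    let pao_upper = [0x50u8, 0xC3, 0x4F]; // "PÃO"
    let pao_lower = [0x50u8, 0xE3, 0x6F]; // "Pão"

    println!("{}", decode_win1252_lossy(&pao_upper));
    println!("{}", decode_win1252_lossy(&pao_lower));
}
```

The client can only do this if it already knows the bytes are win1252, which is exactly the information a 'NONE' column does not carry.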
Yes exactly. In python with the "fdb" package you can force a charset by passing .charset() in the connection property. I think this crate could have something similar
If the GRUPO column was not UTF8, you need to convert to the original charset (probably win1252 in this case) in the cast, not to UTF8.
~But the easier solution would be to use dbeaver to set the column charset to win1252 for all text columns that need it.~
Edit: It seems you cannot change a column with 'NONE' charset, as the database does not know how to convert to the new charset.
You can select like this too:
SELECT cast(grupo AS varchar(30) CHARACTER SET WIN_1252) FROM C000017;
If you want to convert the column to a new format:
ALTER TABLE C000017 ADD GRUPOALT2 Varchar(30) CHARACTER SET UTF8;
UPDATE C000017 a SET a.GRUPOALT2 = CAST(a.GRUPO as varchar(30) CHARACTER SET WIN_1252);
After this, you can delete the old column and rename the new one with the old name
Edit: The UTF8 in the ALTER TABLE can be replaced with WIN_1252 so as not to break other clients that expect the data to be win1252 encoded and do not set the connection encoding.
The problem is that the database is very big, and the program that is going to use it doesn't work with UTF8 (it's a Delphi 7 application). I will try your suggestions
You can cast it on select only for the rust client, so nothing will be changed.
SELECT cast(grupo AS varchar(30) CHARACTER SET WIN_1252) FROM C000017;
The conversion will work as long as you create a new column with the WIN_1252 charset, which will not change anything in practice, except that Firebird now knows the column is WIN_1252 encoded.
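If several 'NONE' columns need the same select-time cast from the Rust side, the expression can be generated instead of typed by hand. This is only a sketch: `cast_column` is a hypothetical helper, not part of rsfbclient, and it interpolates the names without any escaping, so it should only be fed identifiers from your own code, never user input. The resulting string would then be passed to whatever query call your client provides.

```rust
/// Hypothetical helper: builds a CAST expression that tells Firebird
/// which charset the bytes of a 'NONE' column are really in.
/// Identifiers are interpolated as-is (no escaping).
fn cast_column(column: &str, len: u32, charset: &str) -> String {
    format!(
        "CAST({} AS VARCHAR({}) CHARACTER SET {})",
        column, len, charset
    )
}

fn main() {
    // Same statement as the hand-written one above, built from parts.
    let sql = format!(
        "SELECT {} FROM C000017",
        cast_column("grupo", 30, "WIN_1252")
    );
    println!("{}", sql);
}
```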
Yesterday I made some tests changing the charset used, but it also didn't work
In fact, we should always use a default charset instead of 'NONE'. But I think this is a common problem (databases with the 'NONE' charset).
Using the pure_rust feature didn't work either
@AndreiTS, please test this commit
I'm working on the charset support, but your case (the C000017 table) is working now
How can I specify an encoding to connect to a database?
Example in python: