Closed nicklayb closed 6 years ago
Could you please post here or send me email with table definition, ecto model schema and migration file. I need to know what actual db and elixir types are involved in the issue
One more thing, could you please tell me what is db collation?
Of course, here they are:
Table:
CREATE TABLE Beta8.dbo.Customers (
ID int NOT NULL IDENTITY(1,1),
CustomerNumber varchar(50),
Name varchar(75),
Address varchar(250),
CountryID int,
StateID int,
CityID int,
ZipCodeID int,
Office varchar(10),
IsIndividual bit,
IsMainCustomer bit,
Blacklist bit,
Prospect bit,
MainCustomerID int,
PhoneNumber varchar(25),
PhoneNumberExtention varchar(8),
Fax varchar(25),
Cell varchar(25),
PhoneOther varchar(25),
Email varchar(100),
WebSite varchar(100),
CustomerStatusID int,
RecoveryRateSingle numeric(22,2),
CharterRecoveryRateID int,
InterestRate numeric(22,2),
GracePeriod int,
OpeningDate datetime,
SubscriptionEndDate datetime,
LanguageID int,
CurrencyID int,
RoundTripKm numeric(22,2),
ContactMethod int,
AccountDirectorID int,
SellerID int,
CustomerServiceManagerID int,
AdministrativeContact varchar(100),
Revenue money,
NbEmpl int,
PaymentsTermsID int,
LocationBillingID int,
NameOnCheck varchar(100),
ReferenceNumber varchar(100),
Payback int,
BillingFrequency int,
ActionID int,
AssignedNewFileID int,
DisplayDebtorsInOrder int,
ChargeCommissionToDebtor bit,
TransferTheCredit bit,
DateCreditFile datetime,
DirectDeposit bit,
BankNumber varchar(50),
TransitNumber varchar(50),
FolioNumber varchar(50),
WebUsername varchar(100),
WebPassword text,
PreauthorizedPaymentPercentage int,
Statistic int,
ReferenceID int,
CentralizedFiles bit,
FileStateID int,
FileCategoryID int,
ProfileID int,
FileStatusID int,
IsAgency bit,
SendInvoiceByEmail bit,
DebtorDefaultWrittenLanguage int,
Note text,
Inactive bit,
DepartmentCustomerID int,
RecallDate datetime,
RecallComments text,
OpenFeeNote varchar(20),
SpecificPhoneNumber varchar(20),
RecallResourceID int,
DontTaxeEquipment bit,
SIFNotAllowed bit,
SendAutomatedEmails bit,
AddEmptyPageToExportLetter bit,
SendAutomatedEmailsNewFile bit,
SendAutomatedEmailsCloseFile bit,
SendAutomatedEmailsChangeStatusFile bit,
SendAutomatedEmailsNewPayment bit,
SendAutomatedEmailsPaymentAgreementCreated bit,
WebInactive bit,
[_IT3ApiToken] varchar(512),
IsCommeleo bit,
MinCapitalToTransfert money,
Exclusive bit,
SuccessRate numeric(22,2),
LastFileOpeningDate datetime,
WrittenLanguage int,
CONSTRAINT PK_Customers PRIMARY KEY (ID),
CONSTRAINT FK_Actions_Customers_ActionID FOREIGN KEY (ActionID) REFERENCES Beta8.dbo.Actions(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_CharterRecoveryRates_Customers_CharterRecoveryRateID FOREIGN KEY (CharterRecoveryRateID) REFERENCES Beta8.dbo.CharterRecoveryRates(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_CountryStateCity_Customers_CityID FOREIGN KEY (CityID) REFERENCES Beta8.dbo.CountryStateCity(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_CountryState_Customers_StateID FOREIGN KEY (StateID) REFERENCES Beta8.dbo.CountryState(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Country_Customers_CountryID FOREIGN KEY (CountryID) REFERENCES Beta8.dbo.Country(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Currency_Customers_CurrencyID FOREIGN KEY (CurrencyID) REFERENCES Beta8.dbo.Currency(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_CustomerStatus_Customers_CustomerStatusID FOREIGN KEY (CustomerStatusID) REFERENCES Beta8.dbo.CustomerStatus(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Customers_Customers_MainCustomerID FOREIGN KEY (MainCustomerID) REFERENCES Beta8.dbo.Customers(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_DebtorsProfiles_Customers_ProfileID FOREIGN KEY (ProfileID) REFERENCES Beta8.dbo.DebtorsProfiles(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_DepartmentsCustomers_Customers_DepartmentCustomerID FOREIGN KEY (DepartmentCustomerID) REFERENCES Beta8.dbo.DepartmentsCustomers(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_FilesCategories_Customers_FileCategoryID FOREIGN KEY (FileCategoryID) REFERENCES Beta8.dbo.FilesCategories(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_FilesStates_Customers_FileStateID FOREIGN KEY (FileStateID) REFERENCES Beta8.dbo.FilesStates(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_FilesStatus_Customers_FileStatusID FOREIGN KEY (FileStatusID) REFERENCES Beta8.dbo.FilesStatus(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Languages_Customers_LanguageID FOREIGN KEY (LanguageID) REFERENCES Beta8.dbo.Languages(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_LocationsBilling_Customers_LocationBillingID FOREIGN KEY (LocationBillingID) REFERENCES Beta8.dbo.LocationsBilling(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_PaymentsTerms_Customers_PaymentsTermsID FOREIGN KEY (PaymentsTermsID) REFERENCES Beta8.dbo.PaymentsTerms(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Reference_Customers_ReferenceID FOREIGN KEY (ReferenceID) REFERENCES Beta8.dbo.Reference(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Resources_Customers_AccountDirectorID FOREIGN KEY (AccountDirectorID) REFERENCES Beta8.dbo.Resources(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Resources_Customers_AssignedNewFileID FOREIGN KEY (AssignedNewFileID) REFERENCES Beta8.dbo.Resources(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Resources_Customers_CustomerServiceManagerID FOREIGN KEY (CustomerServiceManagerID) REFERENCES Beta8.dbo.Resources(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Resources_Customers_RecallResourceID FOREIGN KEY (RecallResourceID) REFERENCES Beta8.dbo.Resources(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_Resources_Customers_SellerID FOREIGN KEY (SellerID) REFERENCES Beta8.dbo.Resources(ID) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT FK_ZipCodes_Customers_ZipCodeID FOREIGN KEY (ZipCodeID) REFERENCES Beta8.dbo.ZipCodes(ID) ON DELETE RESTRICT ON UPDATE RESTRICT
) go
CREATE INDEX IX_LocationBilling ON Beta8.dbo.Customers (LocationBillingID) go
CREATE INDEX IX_Name ON Beta8.dbo.Customers (Name) go;
Schema:
defmodule MZ.Customer do
use Ecto.Schema
schema "Customers" do
field :customer_number, :string, source: :customerNumber
field :name, Tds.VarChar
belongs_to :language, MZ.Language, foreign_key: :languageId
has_many :users, MZ.User, foreign_key: :exId
has_many :files, MZ.File, foreign_key: :customerId
end
end
The collation is : SQL_Latin1_General_CP1_CI_AS
I don't have migrations since I connect to an existing database from our main software which is not our creation.
the issue is that collation and column type for :name
column are not correct. TDS will not threat VarChar
columns as ucs2 strings regardless bytes are "there" since that is same thing MSSQL server does. We had a lot of issues earlier when that was supported ad we tried to threat VarChar
as UCS2 string. Your column name, should be of NVarChar
type since NOT ALL French letters are 1 byte long. NVarChar
is ucs2 string, meaning it will use 2 bytes to store single character. I know this sounds wired, but MS has a lot wired stuff under the hood, it will literally take all bytes you send them and store it to VarChar
column without complaining, it will not try to check if codepoints are valid. Of course, this would have huge performance implications to server itself if they have done it. What is even more odd here is that MS will give you back bytes as is :).
There is nice comparison between VarChar
and NVarChar
types here that illustrates this problem.
Finally, I can't give you nice solution for this, maybe in future we will find way to handle somehow decoding, but there will be always "how other platform" will store VarChar
column in database where elixir (second platform) will read such column. And even if we do it, it will become endless bug fixing. What I could suggest is that you try decode value with Tds.Utils.ucs2_to_utf/1
or use fragment in query from m in Model, .... select: %{name: fragment("cast(? as nvarchar) as name", m.name)}
..... OR create your custom ecto type CustomVarchar and use Tds.Utils.ucs2_to_utf/1
in load function to convert it to utf8 string when String.valid?(value) == false. Then use your custom type in schema instead of Tds.Varchar
. You can copy all code from Tds.Varchar module and just amend load function
You the real MVP!!
Creating my own types as worked perfectly! Here is what I wrote if anyone requires it:
defmodule MZ.EctoTypes.VarChar do
@moduledoc """
Converts TDS Nvarchar to an understandable field for Ecto
"""
@behaviour Ecto.Type
@doc "Returns :varchar atom, so we can use :varchar type in migration files"
def type(), do: :varchar
def cast(value) do
{:ok, convert_ucs2 value}
end
def load(value) do
{:ok, convert_ucs2 value}
end
def dump(value) when is_binary(value) do
{:ok, {convert_ucs2(value), :varchar}}
end
defp convert_ucs2(value) when is_binary(value) do
Tds.Utils.ucs2_to_utf(value)
end
defp convert_ucs2(value) do value end
end
@mjaric Sorry for pinging on an old issue! I had this same challenge in my app with varchar fields. We're also using the same collation SQL_Latin1_General_CP1_CI_AS
I'm curious if this is "fixed" or if there's a different solution for the draft PR https://github.com/elixir-ecto/ecto_sql/pull/184?
I updated my decencies to point to that repo and Tds.Types.VarChar
still resulted in the previous
** (Jason.EncodeError) invalid byte 0xC3 in <<80, 195, 63, 194, 182, 100, 195, 63, 194, 182, 114, 32, 65, 112, 114, 105, 99, 111, 116, 32, 115, 101, 101, 100, 32, 111, 105, 108, 32, 79, 114, 103, 97, 110, 105, 99>>
Also, is the "right answer" that this column should have been an nvarchar type to support accents etc?
The best thing/solution you can do is to change that column into nvarchar
.
Any workaround you try always has an issue. for instance if column is varchar with Latin1 collation and you try to put non latin1 codepoint e,g,
LIKE
will not yield result you want.Again, your best friend is nvarchar
:)
BTW, comparing time required to alter table column from varchar to nvarchar is most likely much shorter than the time you will spend trying to solve the issue leaving column to varchar, and you will have allways some edge case that doesn't work. NVarchar is meant to be used for multilingual support.
Totally, makes sense and all that reflects what I've run into saving those characters back into the database. Appreciate the confirmation!
Also, amazing work on the new version, I put it into a test server last night and so far has been a drop-in replacement!
thanks for the feedback!!! I really appreciate.
My SQL Server database uses a lot of accent (é, è, ï, ê) since major part of our customer base are french. When I'm querying the database for a customer with an accent in his name, the name field is now converted as a binary, for instance:
But with accent
Is there any setting I can add to make sure it handle correctly all the accents encoding? I have tried the following hook as you suggested but did nothing :
Thank you!
Edit I also tried changing my
:string
field to aTsd.VarChar
but does the same