findmypast-oss / mssql_ecto

Ecto adapter for Mssqlex
Apache License 2.0
49 stars 20 forks source link

Encoding issues with nvarchar(max) columns? #24

Open sardaukar opened 7 years ago

sardaukar commented 7 years ago

Expected Behavior

I'm using SQL Server 2016 Express, set to SQL_Latin1_General_CP1_CI_AS collation. When I save data on columns set to nvarchar(max) on SQL Server's side, they're properly saved, but when I read them back, I get this:

screen shot 2017-08-17 at 16 29 35

Columns in question here are Analysis and Contents - they should contain "analysis" and "content" respectively.

Current Behavior

Using the scaffolding default views, I managed to save the data (and confirm it's correct in SQL Server Management Studio). Schema definition:

schema "politicaldata" do
    field :analysis, :binary
    field :available, :naive_datetime
    field :contents, :binary
    field :created, :naive_datetime
    field :heading, :string
    field :onhold, :integer
    field :typecode, :integer
    field :url, :string
    field :userid, :integer

    timestamps(inserted_at: :date, updated_at: false)
end

The form to save the data uses textarea as the HTML control.

Possible Solution

Is there a way to set the collation on the Repo settings?

Steps to Reproduce (for bugs)

  1. Save data
  2. Go to index view, see that the nvarchar(max) columns on the database are not set properly.

Context

Trying to get out of the database what I put in 😄

Your Environment

jbachhardie commented 7 years ago

Yeah a bug in the Erlang ODBC driver mangles variable length column output, see https://github.com/findmypast-oss/mssql_ecto/issues/2

Keeping this open since it describes a different and perhpas more common way to encounter this bug but please discuss in the other issue. Basically until someone patches OTP you need to cast the column as nvarchar(4000) on SELECT and are limited to 4000 bytes.