Open toddharding opened 7 years ago
This is definitely a limitation of / bug in the erlang ODBC driver. We tried to support unsized character fields but the erlang ODBC driver mangles the return from them, so we were forced to restrict size to the maximum sized type, wvarchar(4000).
This should be the relevant issue for erlang-odbc. https://bugs.erlang.org/browse/ERL-132
Is there a workaround for retrieving an nvarchar(max) RTF string or other technical solution? This is currently a showstopper for me. Thanks!
If the data is reliably under 4000 bytes a CAST(column AS nvarchar(4000))
on the select should work. If it's not I'm afraid you're out of luck until that issue in Erlang is resolved. I don't think the driver has any way of extracting large data that's working since the bug affects all variable length fields ☹️
We don't have any immediate plans to patch Erlang but if you'd like to give it a try I believe @toddharding from our team had given it a shot and identified at least the broad strokes of what the problem is.
Thank you for the quick reply. Unfortunately that doesn't work for me. Those RTF strings are huge!
The odbc application bug was fixed in a fork https://github.com/arcusfelis/eodbc odbc module became eodbc, and application name was also changed.
So, you can try to run with it. From minuses: eodbc requires unixodbc to compile C code part. While odbc is shipped with Erlang/OTP already compiled.
Nice work! We'll upgrade and hopefully the unixodbc dependency isn't too much trouble for people.
Has this been considered as a possible solution?
http://erlang-odbc-tips.colefichter.ca/
See "Support for NVARCHAR(MAX)" section.
it uses standard odbc application, which has that length bug.
On Tue, 18 Sep 2018 at 22:14, Cháse Pursłey notifications@github.com wrote:
Has this been considered as a possible solution?
http://erlang-odbc-tips.colefichter.ca/
See "Support for NVARCHAR(MAX)" section.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/findmypast-oss/mssql_ecto/issues/2#issuecomment-422557552, or mute the thread https://github.com/notifications/unsubscribe-auth/AAnDNOI7AuRvmN4gI5O2Q6vH1t6OpKMTks5ucWIegaJpZM4Mmltp .
-- С уважением, Уваров Михаил. Best regards, Uvarov Michael
Thanks @arcusfelis
How would I go about installing and using your eodbc fork (debian system)?
And I assume I'd have to fork mssql_ecto to use eodbc instead of erlang-odbc?
You would need to install unixodbc build deps (basically, what you need to install to build Erlang with odbc).
Add eodbc as a dep into rebar.config.
Use eodbc module instead of odbc.
Thanks @arcusfelis
I'm still not clear how I could get eodbc up and running with mssql_ecto / mssqlex.
I've forked mssqlex and replaced :odbc
with :eodbc
where (I beleive) appropriate.
Additionally, I'm installing eodbc in my Pheonix app as follows:
defp deps do
[
{:phoenix, "~> 1.3.4"},
{:phoenix_pubsub, "~> 1.1.0"},
{:phoenix_ecto, "~> 3.4.0"},
{:mssql_ecto, git: "git://github.com/cpursley/mssql_ecto.git", branch: "c/eodbc", override: true},
...
Any other suggestions?
Ok, I was able to get eodbc
loaded by installing it via mix and swapping "odbc" for "eodbc" where appropriate in the mssqlex library. That approach works and it's pretty handy that you can install erlang rebar libraries easily via hex.
However, there are some esoteric errors coming up when running queries where the column type is either varchar(max) or nvarchar(max).
In iex:
{:ok, conn} = Mssqlex.ODBC.init(conn_str)
This works for nvarchar(max). However, the columns are returned as binary.
> :eodbc.sql_query(conn, 'SELECT * FROM dbo.tblSomeTable AS st WHERE st.ID = 1')
=> {:selected,
['ID', 'SomeColumn'],
[[1, <<72, 0, 52, 0, 115, 0, 73, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 50, 0, 49, 0, 89, 0, 87, ...>>]
]}
This does not work when varchar(max):
> :eodbc.sql_query(conn, 'SELECT * FROM dbo.tblSomeTable AS st WHERE st.ID = 1')
=> [error] GenServer #PID<0.472.0> terminating
** (stop) {:port_exit, :killed}
Last message: {#Port<0.102>, {:exit_status, 139}}
State: {:state, #Port<0.102>, {#PID<0.389.0>, #Reference<0.425679474.514588673.206201>}, #PID<0.389.0>, :undefined, :off, true, true, :on, :connected, :undefined, 0, [#Port<0.100>, #Port<0.101>], #Port<0.103>, #Port<0.104>}
And via Ecto/Mssqlex, neither varchar(max) nor nvarchar(max) work.
Error for nvarchar(max):
> SomeTable |> Repo.get(1)
=> [error] [79, 68, 66, 67, 58, 32, 114, 101, 99, 101, 105, 118, 101, 100, 32, 117, 110, 101, 120, 112, 101, 99, 116, 101, 100, 32, 105, 110, 102, 111, 58, 32, [123, ['tcp_closed', 44, '#Port<0.62>'], 125], 10]
[error] GenServer #PID<0.378.0> terminating
** (stop) {:port_exit, :could_not_bind_data_buffers}
Last message: {#Port<0.58>, {:exit_status, 22}}
State: {:state, #Port<0.58>, {#PID<0.359.0>, #Reference<0.1246316755.1851523073.245020>}, #PID<0.359.0>, :undefined, :off, true, true, :on, :connected, :undefined, 0, [#Port<0.54>, #Port<0.55>], #Port<0.61>, #Port<0.62>}
:could_not_bind_data_buffers
error appears to be similar to this issue: https://github.com/yandex/clickhouse-odbc/issues/27Error for varchar(max):
SomeTable |> Repo.get(1)
[error] [79, 68, 66, 67, 58, 32, 114, 101, 99, 101, 105, 118, 101, 100, 32, 117, 110, 101, 120, 112, 101, 99, 116, 101, 100, 32, 105, 110, 102, 111, 58, 32, [123, ['tcp_closed', 44, '#Port<0.57>'], 125], 10]
[error] GenServer #PID<0.374.0> terminating
** (stop) {:port_exit, :killed}
Last message: {#Port<0.53>, {:exit_status, 139}}
State: {:state, #Port<0.53>, {#PID<0.358.0>, #Reference<0.2075734545.4001366017.31751>}, #PID<0.358.0>, :undefined, :off, true, true, :on, :connected, :undefined, 0, [#Port<0.49>, #Port<0.50>], #Port<0.56>, #Port<0.57>}
It looks like there's two things that need to happen to get at least nvarchar(max) working:
Does this sound right @arcusfelis & @jbachhardie?
Hmm, I haven't tested the library with binary_strings option off. Generally, you really want to use binaries everywhere for Garbage Collector efficiency.
In MongooseIM we use these connection options:
eodbc:connect(Settings, [{scrollable_cursors, off},{binary_strings, on},{return_types, on}]) https://github.com/esl/MongooseIM/blob/295181f7e9fc5c42b58fdb212ee12be8237cf447/src/rdbms/mongoose_rdbms_odbc.erl
binary_strings=on - return char, nvarchar, binary, nbinary as binaries. Actually, the otp version of odbc library has the option too. But than we can't seamlessly do some processing of unicode fields because both binaries and unicode are now encoded using the same erlang type, binary. We use utf8 binaries as strings in MongooseIM, but MSSQL uses utf16, so some reencoding is required.
So, basically we have added the return_types option and now each query returns not only field names, but also field types. And our code can guess which transformations we should do based on ODBC field types. https://github.com/esl/MongooseIM/blob/295181f7e9fc5c42b58fdb212ee12be8237cf447/src/rdbms/mongoose_rdbms_odbc.erl#L105
On Thu, 27 Sep 2018 at 20:15, Cháse Pursłey notifications@github.com wrote:
Ok, I was able to get eodbc loaded by installing it via mix and swapping "odbc" for "eodbc" where appropriate in the mssqlex library. That approach works and it's pretty handy that you can install erlang rebar libraries easily via hex.
However, there are some esoteric errors coming up when running queries where the column type is either varchar(max) or nvarchar(max).
In iex:
{:ok, conn} = Mssqlex.ODBC.init(conn_str)
This works for nvarchar(max). However, the columns are returned as binary.
:eodbc.sql_query(conn, 'SELECT * FROM dbo.tblSomeTable AS st WHERE st.ID = 1') => {:selected, ['ID', 'SomeColumn'], [[1, <<72, 0, 52, 0, 115, 0, 73, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 65, 0, 50, 0, 49, 0, 89, 0, 87, ...>>] ]}
This does not work when varchar(max):
:eodbc.sql_query(conn, 'SELECT * FROM dbo.tblSomeTable AS st WHERE st.ID = 1') => [error] GenServer #PID<0.472.0> terminating ** (stop) {:port_exit, :killed} Last message: {#Port<0.102>, {:exit_status, 139}} State: {:state, #Port<0.102>, {#PID<0.389.0>, #Reference<0.425679474.514588673.206201>}, #PID<0.389.0>, :undefined, :off, true, true, :on, :connected, :undefined, 0, [#Port<0.100>, #Port<0.101>], #Port<0.103>, #Port<0.104>}
And via Ecto/Mssqlex, neither varchar(max) nor nvarchar(max) work.
Error for nvarchar(max):
SomeTable |> Repo.get(1) => [error] [79, 68, 66, 67, 58, 32, 114, 101, 99, 101, 105, 118, 101, 100, 32, 117, 110, 101, 120, 112, 101, 99, 116, 101, 100, 32, 105, 110, 102, 111, 58, 32, [123, ['tcp_closed', 44, '#Port<0.62>'], 125], 10] [error] GenServer #PID<0.378.0> terminating ** (stop) {:port_exit, :could_not_bind_data_buffers} Last message: {#Port<0.58>, {:exit_status, 22}} State: {:state, #Port<0.58>, {#PID<0.359.0>, #Reference<0.1246316755.1851523073.245020>}, #PID<0.359.0>, :undefined, :off, true, true, :on, :connected, :undefined, 0, [#Port<0.54>, #Port<0.55>], #Port<0.61>, #Port<0.62>}
- The :could_not_bind_data_buffers error appears to be similar to this issue: yandex/clickhouse-odbc#27 https://github.com/yandex/clickhouse-odbc/issues/27
- The corresponding exit code in eobcd: https://github.com/arcusfelis/eodbc/blob/748d613846731c3d776c8f22600d8b6d89c26435/src/eodbc_internal.hrl#L161
However, using Repo.query directly works for _nvarchar(max)) (but not varchar):
{:ok, result} = Repo.query("SELECT * FROM dbo.tblSomeTable AS st WHERE st.ID = 1")
Error for varchar(max):
SomeTable |> Repo.get(1)
[error] [79, 68, 66, 67, 58, 32, 114, 101, 99, 101, 105, 118, 101, 100, 32, 117, 110, 101, 120, 112, 101, 99, 116, 101, 100, 32, 105, 110, 102, 111, 58, 32, [123, ['tcp_closed', 44, '#Port<0.57>'], 125], 10] [error] GenServer #PID<0.374.0> terminating ** (stop) {:port_exit, :killed} Last message: {#Port<0.53>, {:exit_status, 139}} State: {:state, #Port<0.53>, {#PID<0.358.0>, #Reference<0.2075734545.4001366017.31751>}, #PID<0.358.0>, :undefined, :off, true, true, :on, :connected, :undefined, 0, [#Port<0.49>, #Port<0.50>], #Port<0.56>, #Port<0.57>}
It looks like there's two things that need to happen to get at least nvarchar(max) working:
- Get odbc:param_query to submitt params correclty
- Decoding binary type
Does this sound right @arcusfelis https://github.com/arcusfelis & @jbachhardie https://github.com/jbachhardie?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/findmypast-oss/mssql_ecto/issues/2#issuecomment-425192423, or mute the thread https://github.com/notifications/unsubscribe-auth/AAnDNOiSoOUDnUeH455RgFTq7a3WksOvks5ufRW_gaJpZM4Mmltp .
-- С уважением, Уваров Михаил. Best regards, Uvarov Michael
Expected Behavior
Text types defined in a migration should be able to store strings of a length greater than 4000.
Current Behavior
When data of greater than length 4000 is inserted the statement is terminated.
Possible Solution
Perhaps a look at the erlang ODBC driver?
Steps to Reproduce (for bugs)
Your Environment