johanclasson / vso-agent-tasks

Build and Release Tasks for Visual Studio Online and Team Foundation Server
MIT License

DbUp - Script file that is UTF-8, inserting unicode (Spanish language), inserts garbage #32

Closed sumo300 closed 6 years ago

sumo300 commented 6 years ago

We are currently having an issue where a UTF-8 SQL file still inserts garbage instead of the proper Unicode characters, even though its INSERT statements target an NVARCHAR column and use the N'' Unicode string prefix.

I've read a bit on this, and the usual cause seems to be forgetting the N'' syntax, but we certainly have that. Is this a problem with the extension or with DbUp?

Note that the script was generated by SSMS from existing data, not written by hand. When the query is run in SSMS, it executes fine and inserts the correct Unicode characters.

sumo300 commented 6 years ago

Some info I found on this topic:

https://improveandrepeat.com/2016/03/fixing-unicode-characters-when-using-dbup/

johanclasson commented 6 years ago

I tried the example in the link you provided with some code pages and got:

| Code Page | Result |
|---|---|
| UTF-8 | Gibberish |
| ISO 8859-1 | Ok |
| UTF-16 LE | Ok |
| UTF-16 BE | Ok |

I ran it against a SQL database in Azure with collation Finnish_Swedish_CI_AS (not sure if the collation really matters in this case) and through a hosted build server.
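
For readers unfamiliar with this kind of failure, here is a minimal Python sketch of one plausible explanation that is consistent with the results above: the script's UTF-8 bytes get decoded with a single-byte code page somewhere before they reach the database. This is an assumption, not a confirmed diagnosis of DbUp or the extension. The sample text and the `round_trip` helper are made up for illustration.

```python
# Hypothetical sample text; not taken from the script in this issue.
text = "Añadir sesión"


def round_trip(text: str, written_as: str, read_as: str) -> str:
    """Encode text the way a file would be saved, then decode it the way a reader would."""
    return text.encode(written_as).decode(read_as)


# UTF-8 bytes decoded with a single-byte code page: each accented
# character turns into two unrelated characters.
garbled = round_trip(text, "utf-8", "latin-1")
assert garbled == "AÃ±adir sesiÃ³n"

# When the writer and the reader agree on the encoding, the text survives.
# Python's "utf-16" codec writes a BOM and uses it to pick the byte order on read.
for encoding in ("utf-8", "latin-1", "utf-16"):
    assert round_trip(text, encoding, encoding) == text
```

The N'' prefix cannot help in this scenario, because the characters are already wrong before the statement is sent to SQL Server.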

johanclasson commented 6 years ago

I just released 1.1.2, which makes it possible to pick the encoding that is used when reading the script files. Hopefully this fixes your issue!

You'll find the Script Encoding picker under Script File Filter in the Script Selection group.
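
As a rough illustration of what picking the encoding at read time means, the Python sketch below reads a script file with an explicit encoding and lets a byte order mark, if present, take precedence. It is not the extension's implementation (which is not shown in this thread); `read_script` and `fallback_encoding` are hypothetical names.

```python
import codecs
from pathlib import Path

# BOM signatures mapped to Python codecs that consume the BOM while decoding.
_BOM_CODECS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def read_script(path, fallback_encoding="utf-8"):
    """Return the script text, trusting a BOM if present, else the chosen encoding."""
    raw = Path(path).read_bytes()
    for bom, codec in _BOM_CODECS:
        if raw.startswith(bom):
            return raw.decode(codec)
    # No BOM: the bytes alone do not say which encoding was used,
    # so the caller has to pick one.
    return raw.decode(fallback_encoding)
```

A file saved without a BOM carries no marker of its encoding, which is why an explicit setting is needed for that case, for example `read_script("script.sql", fallback_encoding="latin-1")`.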

sumo300 commented 6 years ago

@johanclasson Wow! Thanks! We'll try this out. We're also trying saving the file as UTF-16, since forcing UTF-8 with a BOM did not work.

sumo300 commented 6 years ago

@johanclasson This worked swimmingly. I really want to do The Who's YEAAAAAAAAAAAAAAAAH!