Also, I've used GitHub Markdown checkboxes next to items that are actual bugs rather than my talking-points.
I am using SQL Server Import 1.3.0, installed globally, with Azure Data Studio 1.30.0 (59c4b8e90cf2d3a24bed72623197f10f8d090fdc).
My scenario is that I'm transitioning data from an Azure-hosted MySQL database to (eventually) an Azure SQL database.
However there are schema changes involved and I want to manipulate the data as part of the process as well.
I exported the initial data from MySQL using MySQL Workbench to one CSV file per table: `users.csv`, `messages.csv`, `etc.csv`.
I used MySQL Workbench to generate the `CREATE TABLE` statements for MySQL, then edited the generated SQL to work with MS SQL Server and ran it successfully. So at this point my MSSQL database has a near-identical schema design to the MySQL database; the differences are all trivial things like `AUTO_INCREMENT` vs `IDENTITY`. In this case, all of the columns' data-types are equivalent (if not entirely identical) between MySQL and MSSQL. No incompatible, MySQL-exclusive or MSSQL-exclusive functionality or SQL language features are being used at this point.
I started off trying to import this CSV file with SSIS's Import Data wizard:
The Introduction page says that it will import the data to an "automatically generated table schema and optionally modify columns", whereas I was expecting to be able to import directly into an existing table, or to manually define the destination table, instead of having to rely on the wizard's automatic generator's best-guesses for column definitions...
I then tried again using the "Import Flat File" menu option, to see how it compares...
The "Modify Columns" page, with "Rich Data Type Detection" enabled, gave me the following:
It does seem to get the types for `int` and `bigint` columns correct, though it detected all of my `tinyint` columns as `smallint` instead, while it incorrectly interpreted a `smallint` column as `bit` (because only the value `0` appears in the first 50 rows).
It defaulted to `nvarchar(50)` for all my shorter textual columns, even though many of them had very short fixed lengths, e.g. `char(2)`.
Automatic data-type detection is a fine concept, but it only works well when it reads enough data, and 50 rows is not enough. There really should be a way to force it to read every row in the CSV and fit column types based on all rows' data.
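To illustrate how cheap a full scan is: a minimal Python sketch that picks the narrowest T-SQL integer type fitting every value in a column (the type ranges are standard T-SQL; the function name and shape are my own, not anything ADS actually does):

```python
# Narrowest-to-widest T-SQL integer types and their value ranges.
INT_TYPES = [
    ("bit", 0, 1),
    ("tinyint", 0, 255),          # T-SQL tinyint is unsigned
    ("smallint", -32768, 32767),
    ("int", -2147483648, 2147483647),
    ("bigint", -(2**63), 2**63 - 1),
]

def infer_int_type(values):
    """Return the narrowest T-SQL integer type that fits every value,
    or None if any value is not an integer. Scans ALL values, not 50."""
    lo, hi = None, None
    for v in values:
        try:
            n = int(v)
        except ValueError:
            return None  # not an integer column after all
        lo = n if lo is None else min(lo, n)
        hi = n if hi is None else max(hi, n)
    if lo is None:
        return None  # empty column: nothing to infer
    for name, tmin, tmax in INT_TYPES:
        if tmin <= lo and hi <= tmax:
            return name
    return None
```

With a full scan, a column of `0..255` values correctly comes back as `tinyint`, not `smallint`.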
The Modify Columns editor needs to allow me to specify `COLLATION`.
I added my on-prem SQL Server instance and database to Azure Data Studio. So far, so good.
Now, here are the problems I ran into:
[ ] I couldn't figure out how to even open the Import Wizard in the first place!
I right-clicked on all of the nodes in the Connections + Servers sidebar (e.g. servers, databases, tables) and none of them had an "Import Data" or "Tasks > Import Data" menu item. This is the first place to look - and there's nothing.
I thoroughly scanned the ADS "Home page" for my on-prem server and there's also no clue of how to import data.
I tried the "Restore" button at the top of the Home page, but that was for traditional SQL Server backups.
I went over to the Import Data extension page and saw, half-way down the page, the heading "How do I start the Flat File Import wizard?" - and learned that I had to press Ctrl + I. That is completely non-obvious. I shouldn't have to do any of this!
[ ] After pressing Ctrl + I I got the wizard open (hurrah), but I saw that it only has the option to import the data into a new table. It can't import the data into an existing table.
This makes things especially painful and inconvenient considering I just went through the effort to hand-craft CREATE TABLE statements myself.
I grimaced, appended a `2` to the table name, and figured I'd do an `INSERT INTO dbo.Users ( ... ) SELECT ... FROM dbo.Users2` afterwards.
The Preview page seemed to show that the Import/PROSE system understood the nature of my data okay so far - or so it seemed, but a few things were amiss:
[ ] My CSV files had far more than 50 rows (tens of thousands, actually) and the first few hundred rows were not particularly interesting; I really would have appreciated the ability to preview the first ~500 rows - or at least being given the option - because those rows were the ones likely to trip up the whole system.
[ ] Why on earth is there a "Refresh" button here?
Unlike the Preview page, the "Step 3: Modify Columns" screen showed that PROSE is nowhere near as intelligent as I had hoped and many of my columns' types were just flat-out wrong...
It detected E.164-formatted phone numbers as `float`.
I'd have understood if the values had a leading `-` instead of `+` and had radix points, but these are very clean E.164 phone numbers (e.g. `+4250001234`); ADS should be smarter than that, and it should infer type information from the column name.
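A strict pattern check would rule out `float` here before any numeric inference runs; a sketch using the standard E.164 shape (`+`, a non-zero first digit, at most 15 digits total - the helper name is hypothetical):

```python
import re

# E.164: '+' then a non-zero digit, then up to 14 more digits (max 15 total).
E164 = re.compile(r"\+[1-9]\d{1,14}")

def looks_like_phone_column(values):
    """True if every value matches the E.164 pattern, in which case the
    column should map to a character type, never float."""
    return bool(values) and all(E164.fullmatch(v) for v in values)
```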
[ ] It doesn't recognize `NULL` literals at all. This means I can't use the Import Flat File wizard at all. So many CSV files use unquoted `NULL` literals instead of empty fields (e.g. `,'a',NULL,'c',` instead of `,'a',,'c',`) and they're such a pain to convert.
This is a major problem. I don't want ADS to import `NULL` as the string "NULL" into an `nvarchar` column - but at least that won't fail. However, if I mark a column as `int` with an explicit checkmark in the "Allow Nulls" checkbox, it will still try to convert the `NULL` literal into an `int`, which fails and aborts the entire import.
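The preprocessing workaround boils down to one pass over the file; a hedged sketch of what I ended up doing by hand in Notepad (note this simple version cannot distinguish an unquoted `NULL` literal from a quoted `"NULL"` string, since the CSV reader unquotes both):

```python
import csv
import io

def blank_null_literals(csv_text):
    """Rewrite NULL fields as empty fields, so e.g. a,NULL,c becomes a,,c,
    which the import wizard can then treat as a missing value."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        writer.writerow("" if field == "NULL" else field for field in row)
    return out.getvalue()
```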
[ ] The "Primary key" column checkboxes don't remember their checked-state.
I told it to treat the `UserId` column as a PK by checking the "Primary Key" checkbox for the column and clicked "Import data". After it failed due to the `NULL`-handling issue I clicked "Previous" and saw the PK columns were unchecked. I tried a few more times and it never remembered the Primary Key checkbox.
[ ] If a column in the CSV file has a `NULL` literal then it ignores the actual column type, e.g. I have a `UserStatus tinyint NULL` column and ADS sees the `NULL` literal and thinks it's an `nvarchar(50) NOT NULL` column instead of `int NULL`.
[ ] ADS doesn't seem to recognize binary data, either as Base16, Base64 or Base64Url, even when Base16 data has a `0x` prefix or `h` suffix.
[ ] ADS seems to think all textual data is `nvarchar`. It doesn't consider that if all rows in the CSV file have values of the same length then it might be `nchar(n)` or `char(n)`, or that if all values are within ASCII range then to try `varchar(n)` instead of `nvarchar(n)`.
[ ] Not to mention there's no way to tell it that it is actually safe to use `varchar(n)` because the columns are using UTF-8.
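Both checks above are a handful of lines over the actual data; a sketch (my own naming, not an ADS API; collation and SQL Server 2019's UTF-8 `varchar` support are deliberately ignored):

```python
def suggest_string_type(values):
    """Suggest a T-SQL character type from actual column data:
    all values the same length -> (n)char(n), else (n)varchar(n);
    all values ASCII -> char/varchar, any non-ASCII -> nchar/nvarchar."""
    values = [v for v in values if v != ""]
    if not values:
        return None  # nothing to base a suggestion on
    lengths = {len(v) for v in values}
    ascii_only = all(v.isascii() for v in values)
    base = "char" if len(lengths) == 1 else "varchar"
    if not ascii_only:
        base = "n" + base
    return f"{base}({max(lengths)})"
```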
[ ] I note that using the Step 3: Modify Columns page is physically painful for a number of reasons:
The solid black rectangular border of the drop-down list, combined with the black gridlines, creates a lot of visual noise and makes just looking at the wizard page unpleasant.
The default font used by the data-type drop-down on Windows is too faint and not heavy enough to be pleasant to look at (especially late-at-night) compared to SSMS's default typeface. This is especially galling considering the visual disturbances caused by the excessive dark heavy gridlines.
This may be attributable to Electron or Chromium's default settings for ClearType on Windows. The rendered text (screenshot below) seems to only be using greyscale antialiasing and not full subpixel antialiasing.
The data-type drop-down is too fiddly: this is a problem when you might have hundreds of columns' data-types to correct or adjust (especially changing `date` to `datetime2(7)`, for example).
I appreciate the filter-as-you-type behaviour, but there's no "Clear text" button.
The drop-down isn't tall enough to allow users to quickly click-twice to get to a type - you need to click, scroll, then click again. This gets very tiresome when you have to do it hundreds of times.
[ ] As a better alternative, please split-up the data-type column into 3 columns:
First column: "Major type", e.g. "Textual", "Numeric", "Date/Time", "Other". I suggest you make this column contain clickable buttons instead of a drop-down, given there's only 4-5 choices; that means there are no discoverability problems and no need to ever click more than once (I'm really tired of drop-down boxes and combo-boxes by now, can you tell?).
Second column: T-SQL type e.g. nvarchar, char, etc.
Third column: Length/Precision/Scale specifiers.
This third column should have a button that lets you auto-select the smallest possible width that can fit all values in the CSV file, and maybe a button to select the next-largest common field width (e.g. 50, 100, 128, 255 (not 256), 512, 1024, 4000, 8000, max).
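The "next-largest common width" button is trivial to implement; a sketch using the widths listed above (the function name is hypothetical):

```python
# Common field widths the UI could offer; "max" is the T-SQL (max) specifier.
COMMON_WIDTHS = [50, 100, 128, 255, 512, 1024, 4000, 8000]

def next_common_width(longest_value):
    """Return the smallest common width >= the longest observed value
    length, or 'max' if nothing fits (i.e. (n)varchar(max) territory)."""
    for w in COMMON_WIDTHS:
        if longest_value <= w:
            return w
    return "max"
```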
By now I had resorted to using Notepad to remove the `NULL` literals so the ADS importer wouldn't complain about them, and I clicked the Previous button to go back to the start of the wizard.
[ ] My current MySQL Workbench exported CSV file's path was still in the file-name box, and I clicked the Browse button to choose the cleaned-up version in the same folder, but the File Open dialog was back in my Windows profile directory. It didn't think to open in the same folder as the currently selected file (argh! more pain and annoyance for no reason!).
Then I clicked the "Next" button and nothing happened. I clicked a few more times and checked with Procmon to see what was going on and nope, the data import wizard was now stuck.
I opened the ADS Dev Tools window and there wasn't anything relevant in the console output.
I'm not familiar with the ADS Import wizard code, but I think that `wizardPage.ts`'s `public async showPage` function was failing because it thought the wizard page was invalid, even though no validation messages were present on the page and all required fields had a value. I assume my movement back and forth through the wizard invalidated its state somehow.
At this point I gave up on using ADS to import a simple MySQL CSV file and wrote my own importer using LINQPad and existing NuGet packages for handling CSV files.
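For reference, the core of such a hand-rolled importer is tiny; a rough Python sketch of the idea (the table name is an example from this issue; a real importer should use parameterized batches through a database driver rather than building SQL strings):

```python
import csv
import io

def csv_to_inserts(csv_text, table):
    """Turn a CSV (header row + data rows) into T-SQL INSERT statements,
    emitting NULL for NULL literals and empty fields, and escaping quotes
    in everything else. Sketch only: parameterize queries in real code."""
    rows = csv.reader(io.StringIO(csv_text))
    columns = next(rows)
    col_list = ", ".join(f"[{c}]" for c in columns)
    statements = []
    for row in rows:
        vals = ", ".join(
            "NULL" if f in ("", "NULL") else "N'" + f.replace("'", "''") + "'"
            for f in row
        )
        statements.append(f"INSERT INTO {table} ({col_list}) VALUES ({vals});")
    return statements
```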
1/10. Won't use again until it just works as advertised.
Does this issue occur when all extensions are disabled?: Yes
In this screenshot of the data-type column, the text isn't using subpixel antialiasing and there are too many lines, which (for my eyes at least) causes visual disturbances from too much high contrast on screen.
Following the instructions in https://github.com/microsoft/azuredatastudio/issues/2090 I'm filing this as a separate GitHub issue.