pingcap / docs

TiDB database documentation. TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://docs.pingcap.com
Other
585 stars 679 forks source link

Import Example Database does not work with 'tiup client' #6647

Open morgo opened 3 years ago

morgo commented 3 years ago

Change Request

  1. Describe what you find is inappropriate or missing in the existing docs.

This error was reported by @alkaagr81

On https://docs.pingcap.com/tidb/stable/import-example-data

The step to LOAD DATA INFILE does not work with the client tiup client which is recommended on https://docs.pingcap.com/tidb/stable/quick-start-with-tidb

The following error is returned:

my:root@127.0.0.1:4000=> SET tidb_dml_batch_size = 20000;
SET
my:root@127.0.0.1:4000=> LOAD DATA LOCAL INFILE '2017Q1-capitalbikeshare-tripdata.csv' INTO TABLE trips
my:root@127.0.0.1:4000->   FIELDS TERMINATED BY ',' ENCLOSED BY '"'
my:root@127.0.0.1:4000->   LINES TERMINATED BY '\r\n'
my:root@127.0.0.1:4000->   IGNORE 1 LINES
my:root@127.0.0.1:4000-> (duration, start_date, end_date, start_station_number, start_station,
my:root@127.0.0.1:4000(> end_station_number, end_station, bike_number, member_type);
error: mysql: local file '2017Q1-capitalbikeshare-tripdata.csv' is not registered
  1. Describe your suggestion or addition.

This is because the tiup client does not support the file transfer parts of the MySQL protocol. It might be helpful to include a note on this page saying that it requires the real mysql client, or alternatively an example could be given with lightning as well.

(Ideally lightning wouldn't require so many setup steps, since that makes it hard to use as a quickstart.)

  1. Provide some reference materials (documents, websites, etc) if you could.
dveeden commented 2 years ago

Instead of LOAD DATA the usql client has \copy, however it looks like tiup client isn't compiled with support for csvq which makes it impossible to load CSV data.

dveeden commented 2 years ago

We could also modify the procedure like this:

First rename the CSV files:

i=1; for csv in *csv; do mv $csv bikeshare.trips.$(printf "%03d" $i).csv; i=$((i+1)); done

Then create the database and table

CREATE SCHEMA bikeshare;
USE bikeshare;
CREATE TABLE trips (
  `trip_id` BIGINT NOT NULL PRIMARY KEY AUTO_RANDOM,
  `duration` INT NOT NULL,
  `start date` DATETIME,
  `end date` DATETIME,
  `start station number` INT,
  `start station` VARCHAR(255),
  `end station number` INT,
  `end station` VARCHAR(255),
  `bike number` VARCHAR(255),
  `member type` VARCHAR(255)
);

Then create a tidb-lightning.toml file like this:

[tikv-importer]
backend = "tidb"

[mydumper]
no-schema = true
data-source-dir = "/home/dvaneeden/bikeshare-data"

[mydumper.csv]
header = true

[tidb]
host = "127.0.0.1"
port = 4000
user = "root"

And finally run:

tiup tidb-lightning -c tidb-lightning.toml

This:

This could also use the 'local' backend instead of 'tidb' as that could increase performance.