Open D3Hunter opened 6 months ago
a)
create a separate import into job for each target table with a thread <= thread param specified in the statement
The meaning of thread = 8 is a bit ambiguous; maybe total_thread = 8 or thread_per_table = 8?
b)
Considering the query syntax is SHOW IMPORT JOB GROUP, can we also add GROUP in IMPORT INTO? Like IMPORT INTO * GROUP FROM
c)
The * in the statement is just an indicator to mean we are importing multiple tables.
Can we move the filter here? When it's a string literal, the parser knows it's an ImportIntoGroupStmt, and when it's an identifier, it's an ImportIntoStmt.
d)
create tables if not exists.
This statement will be complex as a parent statement of CREATE TABLE 😂 I don't like this idea
The description in the PR might change after the spec is done and discussed with the PM; I just created this issue to link some preparatory PRs.
IMPORT INTO * GROUP FROM is hard to read.
The filter *.*,!mysql.*,!sys.*,!INFORMATION_SCHEMA.*,!PERFORMANCE_SCHEMA.*,!METRICS_SCHEMA.*,!INSPECTION_SCHEMA.* is long, so we have to add a special value, such as *, to mean the default and avoid typing too much.
restore xxx from xxx also needs to create tables.
@lance6716 b)
consider the query syntax is SHOW IMPORT JOB GROUP, can we also add GROUP in IMPORT INTO? like IMPORT INTO * GROUP FROM
Q: What do you mean?
c)
The * in the statement is just an indicator to mean we are importing multiple tables.
Can we move the filter here? When it's a string literal, the parser knows it's an ImportIntoGroupStmt, and when it's an identifier, it's an ImportIntoStmt.
Q: Do you mean "import into table_filter from
Feature Request
Is your feature request related to a problem? Please describe:
Describe the feature you'd like:
With import into, we can now import into a single table using physical mode, but in many cases users dump data using dumpling and then import it, and there are many tables. Those files are named in this format. It would be helpful if we could import them together using the import into statement too.
The syntax for importing multiple tables from files named in dumpling style is very similar to that for importing into a single table:
The * in the statement is just an indicator to mean we are importing multiple tables.
This statement works as syntactic sugar for the normal import into that only imports into a single table. It works like this:
walk <file-path> to get all matched files, and from the file names we can get which table they target
create a separate import into job for each target table with a thread <= thread param specified in the statement
detached mode
With the batch id, we can get the progress of this import job batch like:
You can also query the jobs contained in this batch by
To cancel the batch, use
It will cancel all jobs that haven't finished yet, i.e. pending or running.
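The step of mapping matched files to target tables can be sketched roughly as follows. This is only an illustration, not the planned TiDB implementation: it assumes dumpling's usual data file naming of `<schema>.<table>.<index>.<suffix>`, handles only uncompressed `sql` and `csv` suffixes, and the names `DATA_FILE` and `group_by_table` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed dumpling data file layout: "<schema>.<table>.<index>.<suffix>",
# e.g. "test.t1.000000000.sql". Schema files such as "test.t1-schema.sql"
# and "test-schema-create.sql", and the "metadata" file, do not match
# this pattern and are skipped.
DATA_FILE = re.compile(
    r"^(?P<schema>[^.]+)\.(?P<table>[^.]+)\.(?P<index>\d+)\.(?P<suffix>sql|csv)$"
)


def group_by_table(file_names):
    """Map (schema, table) to a sorted list of (file_name, suffix) pairs."""
    groups = defaultdict(list)
    for name in file_names:
        m = DATA_FILE.match(name)
        if m is None:
            continue  # not a data file in the assumed naming scheme
        groups[(m["schema"], m["table"])].append((name, m["suffix"]))
    return {key: sorted(members) for key, members in groups.items()}


files = [
    "metadata",
    "test-schema-create.sql",
    "test.t1-schema.sql",
    "test.t1.000000000.sql",
    "test.t1.000000001.sql",
    "test.t2.000000000.csv",
]
groups = group_by_table(files)
for (schema, table), members in sorted(groups.items()):
    print(f"{schema}.{table}: {len(members)} file(s)")
```

Each key in `groups` would correspond to one separate import into job. The suffix is kept alongside each file name so that the data format can be derived from it later.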
We only support S3/GCS as <file-path> temporarily, and tidb_enable_dist_task must be enabled to run this SQL.
Users cannot specify the FORMAT xx clause explicitly; we will walk all files in <file-path> and determine their format from the suffix, and only tables that match filter are imported.
The thread param only indicates the max CPU usage when importing each table. As all those jobs run on the TiDB Distributed eXecution Framework (DXF), the actual CPU usage depends on how many resources are managed by DXF, how many tables are being imported, and the rules used to assign threads to jobs.
Tasks
show import jobs
show import job(s), such as all options, thread, whether it's distributed, global sort or not, etc.
CREATE permission, but might not have WRITE permission to the target table.
Describe alternatives you've considered:
Lightning already supports this, but import into works as a SQL statement, is more user friendly, and is integrated with global sort, so we would like to have this feature in import into too.
Teachability, Documentation, Adoption, Migration Strategy: