Database system: software(DBMS), hardware and users. focusing on OLTP(On-Line Transaction Processing)
Data Warehouse: Store decision-support data rather than application-oriented data, focusing on OLAP (Online Analytical Processing) eg. Amazon Redshift, Hive
1NF—all underlying fields contain atomic(scalar) values only
2NF—it is 1NF and every non-key field is irreducibly dependent on the primary key
3NF—it is 2NF and all non-key fields are mutually independent.
Concurrency
The lost update problem (丢失修改)
The uncommitted dependency problem (读“脏”数据问题)
The inconsistent analysis problem (不可重复读问题)
Using lock: •Exclusive locks (X locks) / write locks 排它锁 •Shared locks (S locks) / read locks共享锁
DDL DML DCL
DDL: Data Definition Languages create、drop、alter, truncate
DML: Data Manipulation Language insert、delete、udpate、select
DCL: Data Control Language grant、revoke
TCL: Transaction Control Language commit, rollback, savepoint
DELETE, DROP and TRUNCATE
DELETE
DROP
TRUNCATE
DELETE FROM ... WHERE...
DROP TABLE...
TRUNCATE TABLE ...
delete item, table
drop table, index, view,database
truncate table
just data
data and structure
just data (increase from 1)
DML, can rollback
DDL, can’t rollback
DDL, can’t rollback
slow
fastest
fast
delete 和 truncate 仅仅删除表数据, drop 连表数据和表结构一起删除
delete 是 DML 语句,操作完以后如果没有不想提交事务还可以回滚,truncate 和 drop 是 DDL 语句,操作完马上生效,不能回滚
manipulation speed: drop>truncate>delete
JOIN
difference between cross join, left join, right join, inner join, outer join
After inner join, the row number
is more: there are some duplicate numbers in the link
is less: there is null in the link
After left join, the row number is more than a: have duplicate numbers in the link
use group by to reduce the redundancy
Attention
A: SELECT * FROM a INNER JOIN b ON a.id = b.id
and
B: SELECT * FROM a,b WHERE a.id = b.id
Have the same result, but A may be more efficient.
Some databases do not support B, so better not to use this
especially when don’t have where for B, it’s a cross-join…
Index
the detailed explanation
create an index: CREATE INDEX index_name ON table_name (column_list) or
alter table table_name add primary key (column)
MySQL will create a BTree index, and only scan the definite lines, which promotes the efficiency
add explain before select
delete the index:
DROP INDEX [indexName] ON [table_name];
alter table [table_name] drop index [index_name] ;
show the index: SHOW INDEXES FROM table
some types of INDEX: PRIMARY KEY, UNIQUE, FULLTEXT, INDEX
but takes more memory and is hard to maintain
View
a virtual table doesn’t contain data, just save it as a logical sentence
Pros:
simple, don’t need to care about the structure of the table
safe, only to the users who have the authority can visit the view
independence, when the data change, the view doesn’t change
create the view: create view (column_list) as select... with check option
check the information: show create view view_name
modify the view: create or replace view view_name as select...
for view, DML is limited: no distinct, no group by, no order by, no join tables…
delete the view: DROP VIEW [IF EXISTS] view_name [, view_name]
存储UUID值(16字节)比整数(4字节)或甚至大整数(8字节)占用更多存储空间。
调试似乎更难,想象一下表达WHERE id = 'df3b7cb7-6a95-11e7-8846-b05adad3f0ae' 而不是WHERE id = 10
使用UUID值可能会导致性能问题,因为它们的大小和没有被排序。 use uuid_binary
Tips to enhance efficiency
some tips
authority management
create a new user: CREATE USER user_account IDENTIFIED BY password;
show the grant: SHOW GRANTS FOR user_account;
grant the authority: GRANT ...(all) ON database.* TO user_account;
revoke the authority: REVOKE ... ON database.* TO user_account;
Trigger
take effect when using DELETE、INSERT、UPDATE
The difference with Assertion: may trigger one when adding a table, which reduces the efficiency.
Trigger can change the data, but Assertion can’t
Function
Common Table Expressions CTE
with cte_name as...
dual
as a dummy table name in situations where no tables are referenced
https://yancyli1117.github.io/Database-Common-Questions/
Database
Database system: software(DBMS), hardware and users. focusing on OLTP(On-Line Transaction Processing)
Data Warehouse: Store decision-support data rather than application-oriented data, focusing on OLAP (Online Analytical Processing) eg. Amazon Redshift, Hive
Entity-Relationship (ER) Model: Entities, Properties, Relationships
3NF
1NF—all underlying fields contain atomic(scalar) values only
2NF—it is 1NF and every non-key field is irreducibly dependent on the primary key
3NF—it is 2NF and all non-key fields are mutually independent.
Concurrency
The lost update problem (丢失修改) The uncommitted dependency problem (读“脏”数据问题) The inconsistent analysis problem (不可重复读问题)
Using lock: •Exclusive locks (X locks) / write locks 排它锁 •Shared locks (S locks) / read locks共享锁
DDL DML DCL
DDL: Data Definition Languages create、drop、alter, truncate
DML: Data Manipulation Language insert、delete、udpate、select
DCL: Data Control Language grant、revoke
TCL: Transaction Control Language commit, rollback, savepoint
DELETE, DROP and TRUNCATE
delete 和 truncate 仅仅删除表数据, drop 连表数据和表结构一起删除 delete 是 DML 语句,操作完以后如果没有不想提交事务还可以回滚,truncate 和 drop 是 DDL 语句,操作完马上生效,不能回滚 manipulation speed: drop>truncate>delete
JOIN
difference between cross join, left join, right join, inner join, outer join
After inner join, the row number
is more: there are some duplicate numbers in the link is less: there is null in the link
After left join, the row number is more than a: have duplicate numbers in the link
use group by to reduce the redundancy
Attention
A: SELECT * FROM a INNER JOIN b ON a.id = b.id
and B: SELECT * FROM a,b WHERE a.id = b.id Have the same result, but A may be more efficient.
Some databases do not support B, so better not to use this
especially when don’t have where for B, it’s a cross-join…
Index
the detailed explanation
create an index: CREATE INDEX index_name ON table_name (column_list) or
alter table table_name add primary key (column)
MySQL will create a BTree index, and only scan the definite lines, which promotes the efficiency
add explain before select
delete the index:
DROP INDEX [indexName] ON [table_name]; alter table [table_name] drop index [index_name] ;
show the index: SHOW INDEXES FROM table
some types of INDEX: PRIMARY KEY, UNIQUE, FULLTEXT, INDEX
but takes more memory and is hard to maintain
View
a virtual table doesn’t contain data, just save it as a logical sentence
Pros:
simple, don’t need to care about the structure of the table safe, only to the users who have the authority can visit the view independence, when the data change, the view doesn’t change
create the view: create view(column_list) as select... with check option
check the information: show create view view_name
modify the view: create or replace view view_name as select...
for view, DML is limited: no distinct, no group by, no order by, no join tables…
delete the view: DROP VIEW [IF EXISTS] view_name [, view_name]
attention: 对视图进行delete、update、insert等dml操作,原表同样会更新
Add a new row or column
CREATE
database, view, table
UPDATE
UPDATE [LOW_PRIORITY] [IGNORE] table_name SET column_name1 = expr1(or select...), column_name2 = expr2, ... [WHERE condition];
INSERT
insert…value and insert…set
INSERT INTO [ ...] ]VALUES ... ]
INSERT INTO SET = , = ...
ALTER
ALTER TABLE table_name action1[,action2,…]
can change the structure of the table: add col, drop col, modify the primary key, type, rename…
add a new column: ALTER table table_name... ADD COLUMN ....after...
delete a column: ALTER TABLE tasks DROP COLUMN description
add auto_increment: ALTER TABLE ... CHANGE COLUMN col ...type... AUTO_INCREMENT;
AUTO_INCREMENT
when the increment run out: change the type int to BigInt, or separate the database
now the id is 1, 2, 3, 4
delete 4: the id is 5 next, not 4
then restart MySQL, id is 4: the increment is in the memory, not the data log
insert id 10, the next: 11
MySQL UUID and AUTO_INCREMENT
Pros:
表,数据库甚至服务器之间的UUID值是唯一的,它们允许您合并来自不同数据库的行或跨服务器分发数据库。 UUID值不会公开有关数据的信息,因此在URL中使用它们更安全。 可以在避免往返数据库服务器的任何地方生成UUID值。例如,要将数据插入父表和子表,您必须先插入父表,获取生成的ID,然后将数据插入子表。通过使用UUID,您可以预先生成父表的主键值,并在事务中同时将行插入父表和子表。
Cons
存储UUID值(16字节)比整数(4字节)或甚至大整数(8字节)占用更多存储空间。 调试似乎更难,想象一下表达WHERE id = 'df3b7cb7-6a95-11e7-8846-b05adad3f0ae' 而不是WHERE id = 10 使用UUID值可能会导致性能问题,因为它们的大小和没有被排序。 use uuid_binary
Tips to enhance efficiency
some tips
authority management
create a new user: CREATE USER user_account IDENTIFIED BY password;
show the grant: SHOW GRANTS FOR user_account;
grant the authority: GRANT ...(all) ON database.* TO user_account;
revoke the authority: REVOKE ... ON database.* TO user_account;
Trigger
take effect when using DELETE、INSERT、UPDATE
The difference with Assertion: may trigger one when adding a table, which reduces the efficiency.
Trigger can change the data, but Assertion can’t
Function
Common Table Expressions CTE
with cte_name as...
dual
as a dummy table name in situations where no tables are referenced
Transaction
这种把多条语句作为一个整体进行操作,确保该事务范围内的所有操作都可以全部成功或者全部失败。如果事务失败,那么效果就和没有执行这些SQL一样,不会对数据库数据有任何改动。
ACID: Atomicity, Consistency, Isolation, Durability
BEGIN; .... COMMIT;//提交 ROLLBACK;//回滚 SAVEPOINT;//事务可以回滚到 savepoint 而不影响 savepoint 创建前的变化, 不需要放弃整个事务。