pengwk commented 5 years ago

记录所有查询语句

macOS

pengwk commented 5 years ago

查看数据库隔离级别

psql (9.6.12, server 9.6.11)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

sso=> show default_transaction_isolation;
 default_transaction_isolation
-------------------------------
 read committed
(1 row)

pengwk commented 5 years ago

PG 运维

平时运维
故障时

平时运维

运行 analyze
vacuum
加索引
查看慢查询
分析慢查询
分析日志

运行 analyze

PG 会收集每张表的统计数据，作为查询计划的参考。可以手动运行，收集一次，让 PG 做出正确的查询的计划。可以为一个数据库，一张表，一个字段执行这个。

vacuum

从哪些指标可以知道需要这个命令？什么时候需要运行？运行会花多久？会产生锁吗？有多少性能提升？

加索引

一个索引的成本？

查看一个索引占用的内存：

select pg_size_pretty(pg_relation_size('demo_index'));

http://postgresguide.com/tips/disk-usage.html

加索引的方式

同步：有写锁 create index demo_idx on demo_table using btree(created_at);

异步：无写锁，时间会增加 2~3 倍 create index concurrently on demo_table using btree(created_at);

分析日志

pgbadger demo: http://pgbadger.darold.net/samplev7.html#pgbsimultaneous-sessions

升级

升级需要注意什么？会有 DownTime 吗？有的话会多久？为什么会有不可用？以 PG9.6 升级 PG10 为例

故障时

常见情景

可以做什么？

查看正在运行的 SQL，Kill 掉。

https://tableplus.io/blog/2018/08/postgresql-how-to-find-and-kill-hanging-query.html

CPU 100%
死锁

pengwk commented 5 years ago

Basic Guide

http://postgresguide.com/ https://explain.depesz.com/history https://www.slideshare.net/pgconf/the-postgresql-query-planner https://www.postgresql.org/docs/current/using-explain.html

pengwk commented 5 years ago

PG 培训记录

名词

checkpoint
Cardinality

Cardinality_(SQL_statements) Why low cardinality indexes negatively impact performance 索引选择性与前缀索引 SQL 与关系代数 MySQL索引背后的数据结构及算法原理 MySQL索引与Index Condition Pushdown

loop
tuple 是什么？

A table has rows and columns, where rows represents records and columns represent the attributes. Tuple − A single row of a table, which contains a single record for that relation is called a tuple. Relation instance − A finite set of tuples in the relational database system represents relation instance.

https://stackoverflow.com/questions/49733675/what-does-loop-in-explain-analyze-statement-mean http://zbo.space/2016/08/13/join/

http://tatiyants.com/pev/#/plans

https://explain.depesz.com/

表设计

索引

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns             
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs
Round trip within same datacenter ...... 500,000 ns  = 0.5 ms
Read 1 MB sequentially from SSD* ..... 1,000,000 ns  =   1 ms
Disk seek ........................... 10,000,000 ns  =  10 ms
Read 1 MB sequentially from disk .... 20,000,000 ns  =  20 ms
Send packet CA->Netherlands->CA .... 150,000,000 ns  = 150 ms

Assuming ~1GB/sec SSD

Visual representation of latencies

Visual chart provided by ayshen

Data by Jeff Dean

Originally by Peter Norvig

问题分析

日志分析工具：pgbadger

可以分析出什么？

问题出现哪里？可以看到什么？

https://severalnines.com/database-blog/postgresql-log-analysis-pgbadger

Overview

Global Stats
SQL Traffic
Select Traffic
Write Traffic
Queries Traffic
Prepared queries ratio
General Activity

Connections

和 session 的区别是什么？

https://www.experts-exchange.com/questions/29117015/With-PostgreSQL-what-is-the-difference-between-a-session-and-a-connection.html

Established connections
Connections per database
Connections per user
Connections per host

Sessions

https://stackoverflow.com/questions/43241958/what-is-a-postgres-session

Simultaneous sessions
Histogram of sessions times
Sessions per database
Sessions per user
Sessions per host
Sessions per application

Checkpoints

Checkpoints buffers
Checkpoints files
Checkpoints distance
Checkpoints activity

Temp Files

Size of temporary files
Number of temporary files
Temporary files activity
Queries generating the most files(N)
Queries generating the largest files

Vacuums

Vacuums distribution
Vacuums activity
Analyzes per Tables
Vacuums per Tables
Tuples Removed
Page removed

Locks

Lock by type
Most frequent waiting queries(N)
Queries that waited the most

Queries

Queries by type
Queries by database
Queries by user
Duration by user
Queries by host
Queries by application
Number of cancelled queries

Top

Histogram of query times
Slowest individual queries
Time Consuming queries(N)
Most frequent queries(N)
Normalized slowest queries

Event

Log levels
Event distribution
Most frequent errors/events

参数调整

log_min_duration_statement 设置执行超过多长时间的 SQL 写入日志

AWS > RDS > Parameter Groups 中修改这个参数，保存立即生效，不会导致数据库重启（options group 会）。

线上问题处理

参考

pengwk commented 5 years ago

Join 的实现

三种基本方式：

Hash Join
Nest Loop Join
Sort Merge Join

参考

https://www.csd.uoc.gr/~hy460/pdf/p63-mishra.pdf

pengwk commented 5 years ago

Index Type

btree

hash

GIN

GiST

bitmap index

https://dba.stackexchange.com/questions/119386/understanding-bitmap-heap-scan-and-bitmap-index-scan https://stackoverflow.com/questions/33100637/understanding-bitmap-indexes-in-postgresql https://rajeevrastogi.blogspot.com/2018/02/bitmap-scan-in-postgresql.html?showComment=1518410565792#c4647352762092142586 https://www.postgresql.org/message-id/12553.1135634231@sss.pgh.pa.us

https://en.wikipedia.org/wiki/Bitmap_index

https://www.youtube.com/watch?v=sMbQW7XNUZs

https://www.geeksforgeeks.org/bitmap-indexing-in-dbms/

倒排索引 Inverted index

https://zh.wikipedia.org/wiki/%E5%80%92%E6%8E%92%E7%B4%A2%E5%BC%95

https://github.com/digoal/blog/blob/master/201706/20170627_01.md?spm=a2c4e.10696291.0.0.465a19a4Z9K7tu&file=20170627_01.md

pengwk commented 5 years ago

order by

order by 和索引有什么关系？

https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/# https://stackoverflow.com/questions/27121444/postgresql-order-by-choosing-right-index http://www.postgres.cn/docs/9.4/indexes-ordering.html https://www.postgresql.org/docs/9.6/indexes-ordering.html

pengwk commented 5 years ago

参数设置与查看

show 用于查看 set 用于设置

pengwk commented 5 years ago

范式 Normal Formal

范式是什么？
为什么要反范式？
范式在什么时候会出问题？
有哪些常见的反范式设计？

初步理解：范式有5种级别，高的包含低的。范式越高冗余数据越少，写性能高。但是实际应用时读更多，需要增加冗余数据避免 Join。

冗余和一致性

pengwk commented 4 years ago

explain 可视化

火焰图🔥

项目地址：https://github.com/mgartner/pg_flame

将 sql 保存在文件中

example.sql

explain (analyze, verbose, buffers, format json) SELECT xxx

读取文件中的 SQL 语句，不排版，不输出欢迎信息、页脚，并将结果保存在 example.json 中。

psql -qAt -f example.sql > example.json

cat example.json | ./pg_flame > flamegraph.html

-A
--no-align
    Switches to unaligned output mode. (The default output mode is otherwise aligned.) This is equivalent to \pset format unaligned.

-f filename
--file=filename
    Read commands from the file filename, rather than standard input. This option can be repeated and combined in any order with the -c option. When
    either -c or -f is specified, psql does not read commands from standard input; instead it terminates after processing all the -c and -f options in
    sequence. Except for that, this option is largely equivalent to the meta-command \i.

    If filename is - (hyphen), then standard input is read until an EOF indication or \q meta-command. This can be used to intersperse interactive input
    with input from files. Note however that Readline is not used in this case (much as if -n had been specified).

    Using this option is subtly different from writing psql < filename. In general, both will do what you expect, but using -f enables some nice features
    such as error messages with line numbers. There is also a slight chance that using this option will reduce the start-up overhead. On the other hand,
    the variant using the shell's input redirection is (in theory) guaranteed to yield exactly the same output you would have received had you entered
    everything by hand.

-q
--quiet
    Specifies that psql should do its work quietly. By default, it prints welcome messages and various informational output. If this option is used, none
    of this happens. This is useful with the -c option. This is equivalent to setting the variable QUIET to on.

-t
--tuples-only
    Turn off printing of column names and result row count footers, etc. This is equivalent to \t or \pset tuples_only.

pengwk commented 4 years ago

https://pgexercises.com/

pengwk commented 4 years ago

https://gist.github.com/Kartones/dd3ff5ec5ea238d4c546

pengwk commented 4 years ago

https://severalnines.com/database-blog/performance-cheat-sheet-postgresql

pengwk commented 4 years ago

http://coding-geek.com/how-databases-work/ https://www.reddit.com/r/Database/comments/27u6dy/how_do_you_build_a_database/ciggal8

https://docs.oracle.com/database/121/TGSQL/tgsql_optop.htm#TGSQL95355

lpdswing commented 3 years ago

PG 运维

平时运维

故障时

平时运维

运行 analyze

vacuum

加索引

查看慢查询

分析慢查询

分析日志

运行 analyze

PG 会收集每张表的统计数据，作为查询计划的参考。可以手动运行，收集一次，让 PG 做出正确的查询的计划。可以为一个数据库，一张表，一个字段执行这个。

vacuum

从哪些指标可以知道需要这个命令？什么时候需要运行？运行会花多久？会产生锁吗？有多少性能提升？

加索引

一个索引的成本？

查看一个索引占用的内存：

select pg_size_pretty(pg_relation_size('demo_index'));

http://postgresguide.com/tips/disk-usage.html

加索引的方式

同步：有写锁 create index demo_idx on demo_table using btree(created_at);

异步：无写锁，时间会增加 2~3 倍 create index concurrently on demo_table using btree(created_at);

分析日志

pgbadger demo: http://pgbadger.darold.net/samplev7.html#pgbsimultaneous-sessions

升级

升级需要注意什么？会有 DownTime 吗？有的话会多久？为什么会有不可用？以 PG9.6 升级 PG10 为例

故障时

常见情景

可以做什么？

查看正在运行的 SQL，Kill 掉。

https://tableplus.io/blog/2018/08/postgresql-how-to-find-and-kill-hanging-query.html

CPU 100%

死锁

怎么查看慢查询呢

pengwk commented 3 years ago

批量删除数据

count=0
while true
do
        echo $count
        PGPASSWORD=password psql -h xxxxx.rds.cn-north-1.amazonaws.com.cn -p 5432 -U user db_name -c "delete from table where id in (select id from table  limit 100000);"
        ((count=count+1))
        sleep 5s
done

pengwk commented 3 years ago

查看锁的情况

select
    client_addr, client_hostname, client_port, query_start, wait_event_type, query
from
    pg_stat_activity
where
        pid in (
        select pid from pg_locks l
                            join pg_class t on l.relation = t.oid
            and t.relkind = 'r'
        where t.relname = 'table'
    ) and query_start < '2020-12-09 04:00:00'
order by query_start asc;

pengwk commented 2 years ago

AWS 的教程 https://rdspg.workshop.aws/2-foundation/lab5-upgrade.html

pengwk commented 2 years ago

https://github.com/fpietka/rds-pgbadger 分析 AWS RDS 数据库的日志。

pengwk / blog

PostgreSQL #7

记录所有查询语句

macOS

查看数据库隔离级别

PG 运维

平时运维

运行 analyze

vacuum

加索引

加索引的方式

分析日志

升级

故障时

常见情景

Basic Guide

PG 培训记录

名词

表设计

索引

Latency numbers every programmer should know

问题分析

日志分析工具：pgbadger

Overview

Connections

Sessions

Checkpoints

Temp Files

Vacuums

Locks

Queries

Top

Event

参数调整

log_min_duration_statement 设置执行超过多长时间的 SQL 写入日志

线上问题处理

参考

Join 的实现

参考

Index Type

btree

hash

GIN

GiST

bitmap index

倒排索引 Inverted index

order by

参数设置与查看

范式 Normal Formal

explain 可视化

火焰图🔥

PG 运维

平时运维

运行 analyze

vacuum

加索引

加索引的方式

分析日志

升级

故障时

常见情景

批量删除数据

查看锁的情况