pengwk / blog

用 Issue 的方式写博客
0 stars 0 forks source link

PostgreSQL #7

Open pengwk opened 5 years ago

pengwk commented 5 years ago

记录所有查询语句

macOS

pengwk commented 5 years ago

查看数据库隔离级别

psql (9.6.12, server 9.6.11)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

sso=> show default_transaction_isolation;
 default_transaction_isolation
-------------------------------
 read committed
(1 row)
pengwk commented 5 years ago

PG 运维

平时运维

运行 analyze

PG 会收集每张表的统计数据,作为查询计划的参考。可以手动运行,收集一次,让 PG 做出正确的查询的计划。 可以为一个数据库,一张表,一个字段执行这个。

vacuum

从哪些指标可以知道需要这个命令?什么时候需要运行?运行会花多久?会产生锁吗?有多少性能提升?

加索引

一个索引的成本?

查看一个索引占用的内存:

select pg_size_pretty(pg_relation_size('demo_index'));

http://postgresguide.com/tips/disk-usage.html

加索引的方式

同步:有写锁 create index demo_idx on demo_table using btree(created_at);

异步:无写锁,时间会增加 2~3 倍 create index concurrently on demo_table using btree(created_at);

分析日志

pgbadger demo: http://pgbadger.darold.net/samplev7.html#pgbsimultaneous-sessions

升级

升级需要注意什么?会有 DownTime 吗?有的话会多久?为什么会有不可用?以 PG9.6 升级 PG10 为例

故障时

常见情景

可以做什么?

https://tableplus.io/blog/2018/08/postgresql-how-to-find-and-kill-hanging-query.html

pengwk commented 5 years ago

Basic Guide

http://postgresguide.com/ https://explain.depesz.com/history https://www.slideshare.net/pgconf/the-postgresql-query-planner https://www.postgresql.org/docs/current/using-explain.html

image

pengwk commented 5 years ago

PG 培训记录

名词

Cardinality_(SQL_statements) Why low cardinality indexes negatively impact performance 索引选择性与前缀索引 SQL 与关系代数 MySQL索引背后的数据结构及算法原理 MySQL索引与Index Condition Pushdown

A table has rows and columns, where rows represents records and columns represent the attributes. Tuple − A single row of a table, which contains a single record for that relation is called a tuple. Relation instance − A finite set of tuples in the relational database system represents relation instance.

https://stackoverflow.com/questions/49733675/what-does-loop-in-explain-analyze-statement-mean http://zbo.space/2016/08/13/join/

http://tatiyants.com/pev/#/plans

https://explain.depesz.com/

表设计

索引

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns             
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs
Round trip within same datacenter ...... 500,000 ns  = 0.5 ms
Read 1 MB sequentially from SSD* ..... 1,000,000 ns  =   1 ms
Disk seek ........................... 10,000,000 ns  =  10 ms
Read 1 MB sequentially from disk .... 20,000,000 ns  =  20 ms
Send packet CA->Netherlands->CA .... 150,000,000 ns  = 150 ms

Assuming ~1GB/sec SSD

Visual representation of latencies

Visual chart provided by ayshen

Data by Jeff Dean

Originally by Peter Norvig

问题分析

日志分析工具:pgbadger

可以分析出什么?

问题出现哪里? 可以看到什么?

https://severalnines.com/database-blog/postgresql-log-analysis-pgbadger

Overview

Connections

和 session 的区别是什么?

https://www.experts-exchange.com/questions/29117015/With-PostgreSQL-what-is-the-difference-between-a-session-and-a-connection.html

Sessions

https://stackoverflow.com/questions/43241958/what-is-a-postgres-session

Checkpoints

Temp Files

Vacuums

Locks

Queries

Top

Event

参数调整

log_min_duration_statement 设置执行超过多长时间的 SQL 写入日志

AWS > RDS > Parameter Groups 中修改这个参数,保存立即生效,不会导致数据库重启(options group 会)。

image

线上问题处理

参考

pengwk commented 5 years ago

Join 的实现

三种基本方式:

参考

https://www.csd.uoc.gr/~hy460/pdf/p63-mishra.pdf

pengwk commented 5 years ago

Index Type

btree

hash

GIN

GiST

bitmap index

https://dba.stackexchange.com/questions/119386/understanding-bitmap-heap-scan-and-bitmap-index-scan https://stackoverflow.com/questions/33100637/understanding-bitmap-indexes-in-postgresql https://rajeevrastogi.blogspot.com/2018/02/bitmap-scan-in-postgresql.html?showComment=1518410565792#c4647352762092142586 https://www.postgresql.org/message-id/12553.1135634231@sss.pgh.pa.us

https://en.wikipedia.org/wiki/Bitmap_index

https://www.youtube.com/watch?v=sMbQW7XNUZs

https://www.geeksforgeeks.org/bitmap-indexing-in-dbms/

倒排索引 Inverted index

https://zh.wikipedia.org/wiki/%E5%80%92%E6%8E%92%E7%B4%A2%E5%BC%95

https://github.com/digoal/blog/blob/master/201706/20170627_01.md?spm=a2c4e.10696291.0.0.465a19a4Z9K7tu&file=20170627_01.md

pengwk commented 5 years ago

order by

order by 和索引有什么关系?

https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/# https://stackoverflow.com/questions/27121444/postgresql-order-by-choosing-right-index http://www.postgres.cn/docs/9.4/indexes-ordering.html https://www.postgresql.org/docs/9.6/indexes-ordering.html

pengwk commented 5 years ago

参数设置与查看

show 用于查看 set 用于设置

pengwk commented 5 years ago

范式 Normal Formal

初步理解:范式有5种级别,高的包含低的。范式越高冗余数据越少,写性能高。但是实际应用时读更多,需要增加冗余数据避免 Join。

冗余和一致性

pengwk commented 4 years ago

explain 可视化

火焰图🔥

项目地址:https://github.com/mgartner/pg_flame

将 sql 保存在文件中

example.sql

explain (analyze, verbose, buffers, format json) SELECT xxx

读取文件中的 SQL 语句,不排版,不输出欢迎信息、页脚,并将结果保存在 example.json 中。

psql -qAt -f example.sql > example.json

cat example.json | ./pg_flame > flamegraph.html

-A
--no-align
    Switches to unaligned output mode. (The default output mode is otherwise aligned.) This is equivalent to \pset format unaligned.

-f filename
--file=filename
    Read commands from the file filename, rather than standard input. This option can be repeated and combined in any order with the -c option. When
    either -c or -f is specified, psql does not read commands from standard input; instead it terminates after processing all the -c and -f options in
    sequence. Except for that, this option is largely equivalent to the meta-command \i.

    If filename is - (hyphen), then standard input is read until an EOF indication or \q meta-command. This can be used to intersperse interactive input
    with input from files. Note however that Readline is not used in this case (much as if -n had been specified).

    Using this option is subtly different from writing psql < filename. In general, both will do what you expect, but using -f enables some nice features
    such as error messages with line numbers. There is also a slight chance that using this option will reduce the start-up overhead. On the other hand,
    the variant using the shell's input redirection is (in theory) guaranteed to yield exactly the same output you would have received had you entered
    everything by hand.

-q
--quiet
    Specifies that psql should do its work quietly. By default, it prints welcome messages and various informational output. If this option is used, none
    of this happens. This is useful with the -c option. This is equivalent to setting the variable QUIET to on.

-t
--tuples-only
    Turn off printing of column names and result row count footers, etc. This is equivalent to \t or \pset tuples_only.
pengwk commented 4 years ago

https://pgexercises.com/

pengwk commented 4 years ago

https://gist.github.com/Kartones/dd3ff5ec5ea238d4c546

pengwk commented 4 years ago

https://severalnines.com/database-blog/performance-cheat-sheet-postgresql

pengwk commented 4 years ago

http://coding-geek.com/how-databases-work/ https://www.reddit.com/r/Database/comments/27u6dy/how_do_you_build_a_database/ciggal8

https://docs.oracle.com/database/121/TGSQL/tgsql_optop.htm#TGSQL95355

lpdswing commented 3 years ago

PG 运维

  • 平时运维
  • 故障时

平时运维

  • 运行 analyze
  • vacuum
  • 加索引
  • 查看慢查询
  • 分析慢查询
  • 分析日志

运行 analyze

PG 会收集每张表的统计数据,作为查询计划的参考。可以手动运行,收集一次,让 PG 做出正确的查询的计划。 可以为一个数据库,一张表,一个字段执行这个。

vacuum

从哪些指标可以知道需要这个命令?什么时候需要运行?运行会花多久?会产生锁吗?有多少性能提升?

加索引

一个索引的成本?

查看一个索引占用的内存:

select pg_size_pretty(pg_relation_size('demo_index'));

http://postgresguide.com/tips/disk-usage.html

加索引的方式

同步:有写锁 create index demo_idx on demo_table using btree(created_at);

异步:无写锁,时间会增加 2~3 倍 create index concurrently on demo_table using btree(created_at);

分析日志

pgbadger demo: http://pgbadger.darold.net/samplev7.html#pgbsimultaneous-sessions

升级

升级需要注意什么?会有 DownTime 吗?有的话会多久?为什么会有不可用?以 PG9.6 升级 PG10 为例

故障时

常见情景

可以做什么?

  • 查看正在运行的 SQL,Kill 掉。 image

https://tableplus.io/blog/2018/08/postgresql-how-to-find-and-kill-hanging-query.html

  • CPU 100%
  • 死锁

怎么查看慢查询呢

pengwk commented 3 years ago

批量删除数据

count=0
while true
do
        echo $count
        PGPASSWORD=password psql -h xxxxx.rds.cn-north-1.amazonaws.com.cn -p 5432 -U user db_name -c "delete from table where id in (select id from table  limit 100000);"
        ((count=count+1))
        sleep 5s
done
pengwk commented 3 years ago

查看锁的情况

select
    client_addr, client_hostname, client_port, query_start, wait_event_type, query
from
    pg_stat_activity
where
        pid in (
        select pid from pg_locks l
                            join pg_class t on l.relation = t.oid
            and t.relkind = 'r'
        where t.relname = 'table'
    ) and query_start < '2020-12-09 04:00:00'
order by query_start asc;
pengwk commented 2 years ago

AWS 的教程 https://rdspg.workshop.aws/2-foundation/lab5-upgrade.html

pengwk commented 2 years ago

https://github.com/fpietka/rds-pgbadger 分析 AWS RDS 数据库的日志。