Open skilyazhnev opened 1 year ago
Thanks for the detailed report! @reshke can you please take a look too? As far as I know, Odyssey does not call mmap/munmap on its own... It looks very much like a "felt boot on the control panel" (something stuck and hammering the same call over and over), somewhere within the TLS interaction. Also, we support BoringSSL; maybe it behaves better in this case?
@x4m Thx for the answer!
About using SSL instead: this is part of my question too. I don't know how to force a change of the tls_protocol parameter for a downgrade (docs and code didn't help me :( ).
There's a tls_protocols setting, but it does not seem to be wired into the actual TLS context creation. See https://github.com/yandex/odyssey/blob/master/third_party/machinarium/sources/tls.c#L190
As far as I understand, when we use TLS we call read in a busy loop even when there is no data from the network. Flamegraphs could prove or disprove this...
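To make the busy-loop hypothesis concrete, here is a toy sketch in Python. It uses plain local sockets, not TLS and not Odyssey's code, and the function names `busy_poll` and `wait_then_read` are made up for illustration. It only shows the difference between calling read blindly on a non-blocking socket and asking the event loop first.

```python
import selectors
import socket


def busy_poll(sock, attempts):
    """Call recv() blindly on a non-blocking socket; count calls that found no data."""
    wasted = 0
    for _ in range(attempts):
        try:
            sock.recv(4096)
        except BlockingIOError:
            # Nothing to read: this syscall did no useful work.
            wasted += 1
    return wasted


def wait_then_read(sock, attempts):
    """Ask the selector first and call recv() only when the socket is readable."""
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    reads = 0
    for _ in range(attempts):
        if sel.select(timeout=0):
            sock.recv(4096)
            reads += 1
    sel.close()
    return reads


# A connected pair of local sockets; the peer `b` never sends anything.
a, b = socket.socketpair()
a.setblocking(False)
wasted = busy_poll(a, 1000)
reads = wait_then_read(a, 1000)
a.close()
b.close()
print(f"busy loop: {wasted} empty reads, selector-driven: {reads} reads")
```

If something like the first pattern were happening inside the TLS path, it would show up as a large number of read calls relative to the data actually moved, which is what a flamegraph or a full strace log could confirm or rule out.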
@skilyazhnev sorry for the long delay, I'm finally on it now. Did you use COPY or just SELECT?
@x4m only SELECT (we actually didn't try COPY)
Hi, we are facing a problem that affects query speed. To be precise, the query itself executes fast enough, but Odyssey doubles (or more) the time it takes to transfer the result through it.
We use:
* Odyssey 1.3
* PostgreSQL 11
* 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux
* GCP hosts, machine type: e2-medium (2 cores, 4 GB) or e2-standard-4 (4 cores, 16 GB)
Our infrastructure looks like ( {} <- same host ):
1) { | app | } <==> { | odyssey | <==> | HAproxy | } <==> { | PostgreSQL | }
2) { | app | } <==> { | pgbouncer | <==> | HAproxy | <==> | PostgreSQL | }
All connections use TLS. (HAproxy only forwards the port to the right host.)
We also have a table with BLOB objects (only 6 rows, but they total 650 MB when dumped to a text file).
When we select from this table with "select * from blob_table;" we get these timings:
While the data goes through an Odyssey backend, CPU usage goes to the top (100% per Odyssey backend), including a lot of both sys and user time.
During data transfer, you can see this (strace -c -p ):
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.18    6.167812         294     20949           munmap
 15.80    1.798835          50     35603      4221 write
 12.72    1.448449          19     74861           epoll_wait
 12.36    1.406748          16     84471        78 read
  3.11    0.354529          16     20945           mmap
  0.80    0.091604          59      1541           mprotect
  0.39    0.044001           6      6547           epoll_ctl
  0.38    0.043215          14      3082           close
  0.25    0.028948           6      4686           getpeername
------ ----------- ----------- --------- --------- ----------------
100.00   11.384141          45    252685      4299 total
(I also checked: the munmap calls are spread approximately evenly over the transfer.)
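For anyone who wants to slice the strace summary above, here is a small Python sketch. The rows are copied from that output; the helper names `share_of_time` and `calls_ratio` are made up for illustration. It shows that mmap and munmap together take roughly 57% of the traced syscall time, that they are called almost exactly in pairs, and that there are roughly four read calls per munmap.

```python
# Rows copied from the strace -c summary: (syscall, seconds, calls)
ROWS = [
    ("munmap", 6.167812, 20949),
    ("write", 1.798835, 35603),
    ("epoll_wait", 1.448449, 74861),
    ("read", 1.406748, 84471),
    ("mmap", 0.354529, 20945),
    ("mprotect", 0.091604, 1541),
    ("epoll_ctl", 0.044001, 6547),
    ("close", 0.043215, 3082),
    ("getpeername", 0.028948, 4686),
]


def share_of_time(rows, names):
    """Percentage of total traced syscall time spent in the given syscalls."""
    total = sum(seconds for _, seconds, _ in rows)
    picked = sum(seconds for name, seconds, _ in rows if name in names)
    return 100.0 * picked / total


def calls_ratio(rows, numerator, denominator):
    """Ratio of call counts between two syscalls."""
    calls = {name: count for name, _, count in rows}
    return calls[numerator] / calls[denominator]


mapping_share = share_of_time(ROWS, {"mmap", "munmap"})
pairing = calls_ratio(ROWS, "munmap", "mmap")
reads_per_unmap = calls_ratio(ROWS, "read", "munmap")

print(f"mmap+munmap share of syscall time: {mapping_share:.1f}%")
print(f"munmap calls per mmap call: {pairing:.4f}")
print(f"read calls per munmap call: {reads_per_unmap:.2f}")
```

The near 1:1 pairing suggests a buffer that is allocated and freed once per unit of work rather than reused, though the summary alone does not say which component is doing it.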
Much of the time the process just loops over this pattern:
Things we tried that did not speed it up:
1) Turn off TLS on the Odyssey <==> PostgreSQL connection
2) Turn off THP in Linux
3) Increase the cache option in Odyssey
4) Increase CPU cores and
And what I couldn't configure:
1) The tls_protocol option in Odyssey, because the docs do not say which values I can use.
The main question is: what kind of problem could this be?
We have been considering Odyssey as a production solution, but so far this behavior is somewhat confusing.
Odyssey config:
odyssey_sysctl.txt