ppcololo opened this issue 8 months ago
Did you try setting TORK_COORDINATOR_QUEUES_LOGS to a value greater than 1 (the default) to have multiple subscribers processing the logs queue?
Thanks for pointing this out.
I can see some values here: https://github.com/runabol/tork/blob/main/configs/sample.config.toml#L39-L45
But could you share a link to documentation that describes in more detail what those values mean?
If I set logs = x, what does that mean?
It means the number of subscribers (goroutines) that will process the logs queue in parallel.
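For example, a minimal sketch assuming the [coordinator.queues] section from the sample config linked above (the value 4 is arbitrary, and the TORK_COORDINATOR_QUEUES_LOGS environment variable mentioned earlier should map to the same setting):

[coordinator.queues]
logs = 4  # four subscribers/goroutines drain the logs queue concurrently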
Thanks @runabol, it helped a lot. Please add more info to the documentation; it will save you a lot of questions in the future.
That's fair
I have to reopen this: we really need an option to disable logging. Take a look at the option we set in the config:
[datastore.postgres]
dsn = ""
task.logs.interval = "168h"
That means a log retention of 7 days. Today is 2024-05-03, but in the DB I can see:
tork=# select min(created_at) from tasks_log_parts limit 1;
min
----------------------------
2024-04-22 12:38:07.650203
(1 row)
And our DB grows indefinitely:
tork=# select
table_name,
pg_size_pretty(pg_total_relation_size(quote_ident(table_name))),
pg_total_relation_size(quote_ident(table_name))
from information_schema.tables
where table_schema = 'public'
order by 3 desc;
table_name | pg_size_pretty | pg_total_relation_size
-----------------+----------------+------------------------
tasks_log_parts | 137 GB | 146990120960
tasks | 221 MB | 231546880
jobs | 5168 kB | 5292032
nodes | 1744 kB | 1785856
(4 rows)
So I can say that the option in the config either doesn't work or works so slowly that it can't delete all the new logs. We want to disable logging completely and use other software for this, such as an ELK stack.
If you check my message above, this option doesn't work. Today I checked the logs in the DB and I see:
tork=# select min(created_at) from tasks_log_parts;
min
----------------------------
2024-04-22 12:42:13.105112
(1 row)
and
tork=# select
table_name,
pg_size_pretty(pg_total_relation_size(quote_ident(table_name))),
pg_total_relation_size(quote_ident(table_name))
from information_schema.tables
where table_schema = 'public'
order by 3 desc;
table_name | pg_size_pretty | pg_total_relation_size
-----------------+----------------+------------------------
tasks_log_parts | 162 GB | 173817077760
tasks | 221 MB | 231948288
jobs | 5312 kB | 5439488
nodes | 1832 kB | 1875968
(4 rows)
That is +25 GB of logs since yesterday.
As you can see, tork deleted only about 4 minutes' worth of logs:
was: 2024-04-22 12:38:07.650203
now: 2024-04-22 12:42:13.105112
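A quick way to keep an eye on whether the pruner is gaining ground (using only the column already queried above) is to track how far the oldest row lags behind:

tork=# select now() - min(created_at) as retention_lag from tasks_log_parts;

With the configured task.logs.interval = "168h", this lag should settle at or below 7 days once pruning keeps up.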
Sounds like the pruning process is not catching up quickly enough with the amount of logs you're generating per day. I can make the number of records it deletes per cleaning period configurable. Right now it's hard-coded to 1000 I believe.
If I'm not mistaken, we have about 20 million rows per day in the DB.
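For anyone in the same situation, a one-off manual prune can help the built-in cleaner catch up with the backlog. This is only a sketch based on the table and column visible in the queries above (tasks_log_parts, created_at); batching by ctid avoids assuming anything about the primary key and keeps each delete short. Test on a copy of the data first.

-- delete log parts older than the 168h retention window, 100k rows at a time;
-- rerun (or wrap in a loop) until it reports 0 rows deleted
delete from tasks_log_parts
where ctid in (
    select ctid
    from tasks_log_parts
    where created_at < now() - interval '168 hours'
    limit 100000
);
-- vacuum afterwards so the freed space can be reused
-- (VACUUM FULL or pg_repack if you need to shrink the file on disk)
vacuum tasks_log_parts;

Even with the backlog cleared, the built-in pruner would still need a higher per-cycle delete limit, as mentioned above, to keep up with roughly 20 million new rows per day.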
Can you try release 0.1.73? It adds improvements to log shipping -- buffering log messages (up to one second) rather than sending each log line separately.
For now, as I understand it, we have the following flow:
This causes some problems: if we have a lot of workers and logs, I can see more than x million messages in the logs queue. The coordinator doesn't pull and save logs to the DB in time, which means that if I press the Logs button it shows nothing. From time to time I have to purge this queue to see logs.
Possible options: