destrex271 / pgwatch3_rpc_server

RPC Server Implementation for pgwatch3
BSD 3-Clause "New" or "Revised" License

TinyLlama Processing Sink #33

Closed destrex271 closed 1 month ago

destrex271 commented 2 months ago

Uses the Llama3 model (from Ollama) to derive insights on metrics for a given database and stores the results in Postgres.
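
For context, the flow inside the sink is roughly the following. This is only a minimal sketch, not the actual llama_receiver.go code: it assumes a local Ollama on the default port, the pgx driver, and a hypothetical `insights` table; the `/api/generate` call with `model`/`prompt`/`stream` fields is the standard non-streaming Ollama endpoint.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"

	"github.com/jackc/pgx/v5"
)

// ollamaGenerate sends a prompt to a local Ollama instance and returns the
// generated text (non-streaming /api/generate call).
func ollamaGenerate(model, prompt string) (string, error) {
	body, err := json.Marshal(map[string]any{
		"model":  model,
		"prompt": prompt,
		"stream": false,
	})
	if err != nil {
		return "", err
	}
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	var out struct {
		Response string `json:"response"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Response, nil
}

// storeInsight persists the generated text next to the database it refers to.
// The "insights" table is an assumption made for this sketch.
func storeInsight(ctx context.Context, conn *pgx.Conn, dbname, insight string) error {
	_, err := conn.Exec(ctx,
		"INSERT INTO insights (dbname, insight, created_time) VALUES ($1, $2, now())",
		dbname, insight)
	return err
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://postgres@localhost:5432/pgwatch")
	if err != nil {
		panic(err)
	}
	defer conn.Close(ctx)

	prompt := "Analyze these PostgreSQL health metrics and summarize any issues:\n" +
		`[{"cpu": 0.49, "checkpointer": 8}]`
	insight, err := ollamaGenerate("tinyllama", prompt)
	if err != nil {
		panic(err)
	}
	if err := storeInsight(ctx, conn, "test", insight); err != nil {
		panic(err)
	}
	fmt.Println(insight)
}
```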

References:

coveralls commented 2 months ago

Pull Request Test Coverage Report for Build 11068473163

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| cmd/llama_receiver/main.go | 0 | 7 | 0.0% |
| cmd/llama_receiver/llama_receiver.go | 204 | 286 | 71.33% |
| Total: | 204 | 293 | 69.62% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 11054500756: | 8.1% |
| Covered Lines: | 433 |
| Relevant Lines: | 750 |

💛 - Coveralls
destrex271 commented 2 months ago

Model updated to TinyLlama for ease of use.
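
If someone does have the hardware for a bigger model, the choice could also be made configurable instead of hard-coded. Just a sketch, and the PGWATCH_LLM_MODEL variable name is made up:

```go
package main

import (
	"fmt"
	"os"
)

// modelName lets users swap TinyLlama for a larger model without rebuilding
// the sink. PGWATCH_LLM_MODEL is a hypothetical variable name for this sketch.
func modelName() string {
	if m := os.Getenv("PGWATCH_LLM_MODEL"); m != "" {
		return m
	}
	return "tinyllama"
}

func main() {
	fmt.Println("using model:", modelName())
}
```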

pashagolub commented 2 months ago

Looks cool! Do you have any demo output? I'm really eager to see how it works :)

destrex271 commented 2 months ago

Yep, this is how we get the output for some dummy metrics:

Metrics (we are picking the 10 most recent measurements for now so that my laptop does not blow up :)). A rough sketch of such a query appears right after the table:

       created_time        |                        data                         | metric_name | database_id | id | dbname 
----------------------------+-----------------------------------------------------+-------------+-------------+----+--------
 2024-09-18 15:09:17.744536 | [{"cpu": 0.4931227158526041, "checkpointer": 8}]    | health      |           1 |  1 | test
 2024-09-17 05:57:23.411127 | [{"cpu": 0.9391787279884108, "checkpointer": 21}]   | health      |           1 |  1 | test
 2024-09-17 05:56:55.397735 | [{"cpu": 0.9391787279884108, "checkpointer": 21}]   | health      |           1 |  1 | test
 2024-09-17 05:56:23.33855  | [{"cpu": 0.9391787279884108, "checkpointer": 21}]   | health      |           1 |  1 | test
 2024-09-17 05:55:06.514503 | [{"cpu": 0.9391787279884108, "checkpointer": 21}]   | health      |           1 |  1 | test
 2024-09-17 05:54:19.861201 | [{"cpu": 0.9391787279884108, "checkpointer": 21}]   | health      |           1 |  1 | test
 2024-09-16 10:21:46.461051 | [{"cpu": 0.17688077186140289, "checkpointer": 99}]  | health      |           1 |  1 | test
 2024-09-16 10:20:44.369418 | [{"cpu": 0.015728886965353912, "checkpointer": 22}] | health      |           1 |  1 | test
 2024-09-16 07:33:04.968043 | [{"cpu": 0.1026947434019184, "checkpointer": 0}]    | health      |           1 |  1 | test
 2024-09-16 07:31:22.365576 | [{"cpu": 0.8076059523961426, "checkpointer": 40}]   | health      |           1 |  1 | test
 2024-09-16 07:29:27.054481 | [{"cpu": 0.034741206423261346, "checkpointer": 73}] | health      |           1 |  1 | test
 2024-09-16 07:28:28.229516 | [{"cpu": 0.9956509471402158, "checkpointer": 26}]   | health      |           1 |  1 | test
 2024-09-16 07:27:24.502777 | [{"cpu": 0.8833158321966696, "checkpointer": 40}]   | health      |           1 |  1 | test
 2024-09-16 07:26:39.713865 | [{"cpu": 0.001640287333294862, "checkpointer": 0}]  | health      |           1 |  1 | test
 2024-09-16 07:25:40.452153 | [{"cpu": 0.8729865490332787, "checkpointer": 12}]   | health      |           1 |  1 | test
 2024-09-16 07:24:37.738921 | [{"cpu": 0.8742443008720868, "checkpointer": 62}]   | health      |           1 |  1 | test
 2024-09-16 07:17:10.97184  | [{"cpu": 0.22953988467236924, "checkpointer": 38}]  | health      |           1 |  1 | test
 2024-09-16 07:15:01.259699 | [{"cpu": 1.169480258119128, "checkpointer": 81}]    | health      |           1 |  1 | test
 2024-09-16 07:14:20.124188 | [{"cpu": 1.0811733871405975, "checkpointer": 65}]   | health      |           1 |  1 | test
 2024-09-16 07:13:19.391432 | [{"cpu": 1.3516906697077604, "checkpointer": 38}]   | health      |           1 |  1 | test
 2024-09-16 07:12:55.786994 | [{"cpu": 1.5481504692957349, "checkpointer": 26}]   | health      |           1 |  1 | test
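
For reference, a selection like the one above could come from something like the sketch below. The "measurements" table name and the connection string are assumptions made for this sketch; the columns are the ones shown in the dump.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"time"

	"github.com/jackc/pgx/v5"
)

// recentMeasurements pulls the latest n "health" rows for one database and
// flattens them into a plain-text block that can be dropped into the prompt.
// The "measurements" table name is an assumption made for this sketch.
func recentMeasurements(ctx context.Context, conn *pgx.Conn, dbname string, n int) (string, error) {
	rows, err := conn.Query(ctx, `
		SELECT created_time, data
		  FROM measurements
		 WHERE dbname = $1 AND metric_name = 'health'
		 ORDER BY created_time DESC
		 LIMIT $2`, dbname, n)
	if err != nil {
		return "", err
	}
	defer rows.Close()

	var b strings.Builder
	for rows.Next() {
		var createdAt time.Time
		var data []byte // raw JSON as stored in the data column
		if err := rows.Scan(&createdAt, &data); err != nil {
			return "", err
		}
		fmt.Fprintf(&b, "%s %s\n", createdAt.Format(time.RFC3339), data)
	}
	return b.String(), rows.Err()
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://postgres@localhost:5432/pgwatch")
	if err != nil {
		panic(err)
	}
	defer conn.Close(ctx)

	text, err := recentMeasurements(ctx, conn, "test", 10)
	if err != nil {
		panic(err)
	}
	fmt.Print(text)
}
```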

Model Output:


Introduction:

In this report, we will analyze the health metrics of a single PostgreSQL database. These metrics, which include CPU usage, checkpointer frequency, and number of open files, are crucial indicators of database performance that can help identify potential issues before they become significant problems. By providing insight into these metrics, we hope to provide recommendations for maintaining optimal database performance.

Metric Name: Health:

1. CPU Usage (0.4931227158526041)
   Metric: CPU usage is the percentage of time that the database server is running on a CPU. A high CPU usage can indicate that the database is being overloaded, leading to slower performance and increased memory usage. To mitigate this issue, we recommend tuning the database configuration settings, such as increasing the number of threads or reducing the maximum connections allowed by the database server.
   Insight: This metric indicates that there are likely more than two CPUs being used, indicating that the database may be experiencing a high workload. To mitigate this issue, it may be beneficial to adjust the number of threads in the database server configuration and/or increase the maximum connections allowed by the server.

2. Checkpointer Frequency (8)
   Metric: The frequency with which the PostgreSQL database checkpointer is run. A low checkpointer frequency indicates that there may be issues with the database or storage, such as data corruption or inefficient indexes. To address this issue, we recommend regularly updating and tuning the database's storage engine, particularly for large datasets or if there are known performance issues associated with in-memory databases.
   Insight: This metric indicates that the database is performing poorly, which may be due to data corruption or an inefficient index configuration. To mitigate this issue, we recommend regularly updating and tuning the database's storage engine.

3. Number of Open Files (21)
   Metric: The number of file descriptors used by the PostgreSQL database server. A high number of open files can indicate that the database is experiencing a memory leak or other issues with the file system, which may lead to slower performance and increased disk I/O. To address this issue, we recommend regularly monitoring and tuning the database's storage and file system configuration settings.
   Insight: This metric indicates that there are likely more than two open files being used by the database server, indicating a memory leak or other issues with the file system. To mitigate this issue, it may be beneficial to regularly monitor and tune the database's storage and file system configuration settings.

Conclusion:

In conclusion, we have analyzed the health metrics of a single PostgreSQL database. We have observed that CPU usage is high, checkpointer frequency is low, and number of open files are high. To address these issues, we recommend regularly monitoring and tuning the database's storage and file system configuration settings, particularly for large datasets or if there are known performance issues associated with in-memory databases. By following these recommendations, we hope to provide optimal database performance for our target application

Need to adjust the prompt a little so that we don't get weird outputs.
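
One way to tighten it could be a more constrained prompt, roughly along these lines. This is only a sketch, not the prompt the sink uses today:

```go
package main

import "fmt"

// buildPrompt wraps the raw measurements in explicit instructions so the model
// stays terse and only talks about metrics it was actually given (the output
// above invented a "number of open files" metric, for example). The wording is
// only an example, not the prompt the sink currently ships with.
func buildPrompt(metrics string) string {
	return fmt.Sprintf(`You are analyzing PostgreSQL health metrics.
Rules:
- Only mention metrics that appear in the data below.
- Use at most three sentences per metric.
- Do not invent metrics that are not present in the data.
- Finish with a one-sentence overall recommendation.

Data:
%s`, metrics)
}

func main() {
	fmt.Println(buildPrompt(`[{"cpu": 0.49, "checkpointer": 8}]`))
}
```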

destrex271 commented 2 months ago

This is one more example of what is being generated with the demo database in pgwatch3:

The given data indicates that the PostgreSQL server is experiencing high write activity, with the majority of these writes being for index updates and inserts. The time taken to perform a read operation has increased significantly since the beginning of the month, indicating a possible bottleneck in the system. Average wait times for select queries have also increased, suggesting that query planning is becoming more difficult as well.

The data shows that the majority of writes are being done by indices, with 74% of write operations involving index updates and 12% involving inserts. This indicates a high degree of reliance on indexes, particularly in relation to updates.

Analyzing the time taken for read operations, it appears that some queries may be taking longer than they did previously due to increased wait times caused by bottlenecks in the system. Additionally, the number of reads occurring within the database has also increased significantly, with 35% of all reads being performed during the specified period. This suggests a possible issue with the underlying storage system or database design.

Regarding query planning, there appears to be a higher degree of difficulty in this area compared to previous months. The average wait time for select queries has increased by 20%, suggesting that query optimization is becoming more challenging as well. The data also shows that some queries are being performed without a physical table or index being used for the table. This indicates a potential issue with the query planning process and suggests that further investigation may be necessary to improve query performance.

Overall, these observations provide insights into the current state of the PostgreSQL server, highlighting areas for improvement as well as identifying potential bottlenecks or issues in the system. Further analysis and investigation are necessary to understand and resolve any issues that may be present.

Although the grammar seems to be a little off.

edit: added output over a larger number of metrics

pashagolub commented 2 months ago

Looks really cool! We need to make it less chatty, IMHO, otherwise really cool!

destrex271 commented 2 months ago

> Looks really cool! We need to make it less chatty, IMHO, otherwise really cool!

Sure! Will do that.

destrex271 commented 2 months ago

@pashagolub I think this is it for this PR. Would love to get your reviews.

destrex271 commented 1 month ago

@pashagolub fixed the SQL formatting

destrex271 commented 1 month ago

Merging!