import sys; print('Python %s on %s' % (sys.version, sys.platform))
/root/miniconda3/envs/D-Bot/bin/python /root/.pycharm_helpers/pydev/pydevd.py --multiprocess --qt-support=auto --client localhost --port 37745 --file /home/workspace/YYG/FromS/DB-GPT/main.py
Connected to pydev debugger (build 232.8660.197)
Successfully started the Java JVM virtual environment
12/14/2023 19:25:09 - ERROR - root - obtain_historical_queries_statistics Fails!
12/14/2023 19:25:09 - WARNING - root - Unused arguments: {'model': 'diag-llama'}
12/14/2023 19:25:09 - WARNING - root - Unused arguments: {'model': 'diag-llama'}
12/14/2023 19:25:09 - WARNING - root - Unused arguments: {'model': 'diag-llama'}
12/14/2023 19:25:09 - WARNING - root - Unused arguments: {'model': 'diag-llama'}
12/14/2023 19:25:09 - WARNING - root - Unused arguments: {'model': 'diag-llama'}
12/14/2023 19:25:09 - WARNING - root - Unused arguments: {'model': 'diag-llama'}
Report Initialization!
0%| | 0/1
====================== Initialization ======================
rank : 0
local_rank : 0
world_size : 1
local_size : 1
master : localhost:10010
device : 0
cpus : [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1
3, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 2
4, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 3
5, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 4
6, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 5
7, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 6
8, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 7
9, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 9
0, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100,
101, 102, 103, 104, 105, 106, 107, 108, 109,
110, 111, 112, 113, 114, 115, 116, 117, 118
, 119, 120, 121, 122, 123, 124, 125, 126, 12
7, 128, 129, 130, 131, 132, 133, 134, 135, 1
36, 137, 138, 139, 140, 141, 142, 143, 144,
145, 146, 147, 148, 149, 150, 151, 152, 153,
154, 155, 156, 157, 158, 159, 160, 161, 162
, 163, 164, 165, 166, 167, 168, 169, 170, 17
1, 172, 173, 174, 175]
/root/miniconda3/envs/D-Bot/lib/python3.10/site-packages/bmtrain/synchronize.py:15: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
nccl.allReduce(barrier.storage(), barrier.storage(), 'sum', config['comm'])
args.load is not None, start to load checkpoints /home/workspace/YYG/YYG/D-Bot/DiagLlama/DiagLlama.pt
[INFO][2023-12-14 19:25:39][jeeves-hpc-gpu00][inference.py:33:105510] - load model in 21.73s
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
[INFO][2023-12-14 19:25:40][jeeves-hpc-gpu00][inference.py:38:105510] - load tokenizer in 1.27s
finish loading
100%|██████████████████████████████████████████████████████████████████████| 1/1
Role Assignment!
100%|██████████████████████████████████████████████████████████████████████| 1/1
12/14/2023 19:26:07 - INFO - sentence_transformers.SentenceTransformer - Load pretrained SentenceTransformer: ./localized_llms/sentence_embedding/sentence-transformer/
12/14/2023 19:26:10 - INFO - sentence_transformers.SentenceTransformer - Use pytorch device: cuda
Batches: 100%|████████████████████████████████████| 1/1 [00:00<00:00, 1.46it/s]
CpuExpert Diagnosis!
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Reflecting ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Reflexion: Reflection: From the previous steps, it's clear that CPU usage was indeed abnormal. Upon diagnosing using match_diagnose_knowledge tool, we observed some potential root causes like high disk I/O and increased number of processes running simultaneously. Key indicators such as node_ins_stdload1[ins=] = 1.75 > 100% were identified which point towards an anomaly in the system. Understanding this knowledge allows us to better diagnose and find solutions for similar issues in the future. However, more analysis is needed to pinpoint the exact cause of the high CPU usage. Moving forward, we should delve deeper into analyzing the provided information before jumping to conclusions too quickly.
Voting ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
12/14/2023 19:41:20 - INFO - sentence_transformers.SentenceTransformer - Load pretrained SentenceTransformer: ./localized_llms/sentence_embedding/sentence-transformer/
12/14/2023 19:41:22 - INFO - sentence_transformers.SentenceTransformer - Use pytorch device: cuda
Batches: 100%|████████████████████████████████████| 1/1 [00:00<00:00, 80.18it/s]
MemoryExpert Diagnosis!
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Reflecting ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
Reflexion: Reflection: From the previous attempt, it's clear that sudden surges in memory usage can cause significant performance degradation. The specific query insert into table1 select generate_series... was identified as potentially problematic due to its large scale data insertion within a short timeframe. This type of operations is resource-intensive and could significantly impact system performance. Understanding this knowledge allows us to better diagnose and find solutions for such issues in the future.
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
12/14/2023 20:04:46 - INFO - sentence_transformers.SentenceTransformer - Load pretrained SentenceTransformer: ./localized_llms/sentence_embedding/sentence-transformer/
12/14/2023 20:04:47 - INFO - sentence_transformers.SentenceTransformer - Use pytorch device: cuda
Batches: 100%|████████████████████████████████████| 1/1 [00:00<00:00, 38.65it/s]
MemoryExpert Diagnosis!
Analyzing with tools ...
100%|██████████████████████████████████████████████████████████████████████| 1/1
python-BaseException
Traceback (most recent call last):
File "/root/miniconda3/envs/D-Bot/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/root/miniconda3/envs/D-Bot/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/home/workspace/YYG/FromS/DB-GPT/main.py", line 14, in main
report, records = await multi_agents.run(args)
File "/home/workspace/YYG/FromS/DB-GPT/multiagents/multiagents.py", line 65, in run
report, records = await self.environment.step(args)
File "/home/workspace/YYG/FromS/DB-GPT/multiagents/environments/dba.py", line 252, in step
report = await self.decision_making(selected_experts, None, previous_plan, advice) # plans: the list of diagnosis messages
File "/home/workspace/YYG/FromS/DB-GPT/multiagents/environments/dba.py", line 312, in decision_making
initial_diags = await self.decision_maker.astep(
File "/home/workspace/YYG/FromS/DB-GPT/multiagents/environments/decision_maker/vertical.py", line 33, in astep
results = await asyncio.gather(
File "/home/workspace/YYG/FromS/DB-GPT/multiagents/agents/solver.py", line 200, in step
result_node, top_abnormal_metric_values = chain.start(simulation_count=1,epsilon_new_node=0.3,choice_count=1,vote_candidates=2,vote_count=1,single_chain_max_step=24)
File "/home/workspace/YYG/FromS/DB-GPT/multiagents/reasoning_algorithms/tree_of_thought/UCT_vote_function.py", line 187, in start
end_node, top_abnormal_metric_values = self.default_policy(now_node,this_simulation,single_chain_max_step)
File "/home/workspace/YYG/FromS/DB-GPT/multiagents/reasoning_algorithms/tree_of_thought/UCT_vote_function.py", line 579, in default_policy
result = temp_node.env.tool.call_function(parsed_response.tool, parameters)
File "/home/workspace/YYG/FromS/DB-GPT/multiagents/tools/api_retrieval.py", line 30, in call_function
return func(*args, **kwargs)
File "/home/workspace/YYG/FromS/DB-GPT/multiagents/tools/metric_monitor/api.py", line 54, in whether_is_abnormal_metric
with open(f"./alert_results/{current_diag_time}/{metric_name}.html", "w") as f:
FileNotFoundError: [Errno 2] No such file or directory: './alert_results/2023-12-14-19:24:58/cpu_usage.html'
a = open(f"./alert_results/{current_diag_time}/{metric_name}.html", "w")
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.18.1 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 8.18.1
Traceback (most recent call last):
File "/root/miniconda3/envs/D-Bot/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3550, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
a = open(f"./alert_results/{current_diag_time}/{metric_name}.html", "w")
File "/root/miniconda3/envs/D-Bot/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 310, in _modified_open
return io_open(file, *args, **kwargs)
FileNotFoundError: [Errno 2] No such file or directory: './alert_results/2023-12-14-19:24:58/cpu_usage.html'
a = open(f"/home/workspace/YYG/FromS/DB-GPT/alert_results/alert_results/{current_diag_time}/{metric_name}.html", "w")
Traceback (most recent call last):
File "/root/miniconda3/envs/D-Bot/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3550, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
a = open(f"/home/workspace/YYG/FromS/DB-GPT/alert_results/alert_results/{current_diag_time}/{metric_name}.html", "w")
File "/root/miniconda3/envs/D-Bot/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 310, in _modified_open
return io_open(file, *args, **kwargs)
FileNotFoundError: [Errno 2] No such file or directory: '/home/workspace/YYG/FromS/DB-GPT/alert_results/alert_results/2023-12-14-19:24:58/cpu_usage.html'
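
Both tracebacks above fail inside `open(..., "w")`. Write mode creates the file itself but not any missing parent directory, so the likely cause is that the per-run folder `alert_results/2023-12-14-19:24:58/` was never created. Below is a minimal sketch of a workaround under that assumption. The helper name `open_alert_report` and its `base_dir` parameter are hypothetical and not part of DB-GPT; only `current_diag_time` and `metric_name` come from the traceback.

```python
from pathlib import Path


def open_alert_report(base_dir, current_diag_time, metric_name):
    """Open <base_dir>/<current_diag_time>/<metric_name>.html for writing.

    Hypothetical helper (not part of DB-GPT): it creates the per-run
    directory first, because open(..., "w") does not create parent folders.
    """
    report_dir = Path(base_dir) / current_diag_time
    # parents=True also creates base_dir itself if it is missing;
    # exist_ok=True makes repeated calls for the same run harmless.
    report_dir.mkdir(parents=True, exist_ok=True)
    return open(report_dir / f"{metric_name}.html", "w")
```

If this assumption holds, the `with open(...)` call in `whether_is_abnormal_metric` could be swapped for something like `with open_alert_report("./alert_results", current_diag_time, metric_name) as f:`. Note also that the second manual attempt used a path containing `alert_results/alert_results/`, which looks like an accidental duplication and would fail for the same reason.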