ai4co / LLM-as-HH

Large Language Models as Hyper-Heuristics for Combinatorial Optimization (CO)
MIT License

RuntimeError due to Invalid Seed Function Evaluation #5

Closed lgy0404 closed 4 months ago

lgy0404 commented 4 months ago

When running ReEvo with the command python main.py problem=tsp_aco, an error occurs during the evaluation of the seed function, leading to a RuntimeError. The error message suggests that the seed function is invalid, but it doesn't provide specific details about what caused the invalid evaluation.

Steps to Reproduce:

  1. Set up the environment as described in the README.
  2. Run the command python main.py problem=tsp_aco.
  3. Observe the RuntimeError related to the invalid seed function evaluation.

Error Logs:

[2024-05-16 12:38:34,259][root][INFO] - Workspace: /remote-home/iot_liuguangyi/llm/LLM-as-HH/outputs/main/2024-05-16_12-38-34
[2024-05-16 12:38:34,259][root][INFO] - Project Root: /remote-home/iot_liuguangyi/llm/LLM-as-HH
[2024-05-16 12:38:34,259][root][INFO] - Using LLM: gpt-3.5-turbo
[2024-05-16 12:38:34,259][root][INFO] - Using Algorithm: reevo
[2024-05-16 12:38:34,668][root][INFO] - Problem: tsp_aco
[2024-05-16 12:38:34,668][root][INFO] - Problem description: Solving Traveling Salesman Problem (TSP) via stochastic solution sampling following "heuristics". TSP requires finding the shortest path that visits all given nodes and returns to the starting node.
[2024-05-16 12:38:34,668][root][INFO] - Function name: heuristics
[2024-05-16 12:38:34,668][root][INFO] - Evaluating seed function...
[2024-05-16 12:38:34,669][root][INFO] - Seed function code:
import numpy as np

def heuristics_v2(distance_matrix: np.ndarray) -> np.ndarray:
    return 1 / distance_matrix
[2024-05-16 12:38:34,669][root][INFO] - Iteration 0: Running Code 0
[2024-05-16 12:38:36,349][root][INFO] - Iteration 0: Code Run 0 successful!
[2024-05-16 12:38:56,351][root][INFO] - Error for response_id 0: Command '['python', '-u', '/remote-home/iot_liuguangyi/llm/LLM-as-HH/problems/tsp_aco/eval.py', '50', '/remote-home/iot_liuguangyi/llm/LLM-as-HH', 'train']' timed out after 19.99986495007761 seconds

Please let us know if you need further information or assistance in resolving this issue. Thank you!

henry-yeh commented 4 months ago

Hi there. Thanks for your interest! It seems the execution timed out after 20 seconds. You may try to set a longer allowed execution time, e.g.:

python main.py problem=tsp_aco timeout=60

lgy0404 commented 4 months ago

Hi there, thank you for your patience and detailed information. Despite my attempts to increase the execution time to 60 seconds and 200 seconds, the problem remains unresolved.

Error Logs with timeout 60:

[2024-05-17 01:54:54,061][root][INFO] - Workspace: /remote-home/iot_liuguangyi/llm/LLM-as-HH/outputs/main/2024-05-17_01-54-54
[2024-05-17 01:54:54,061][root][INFO] - Project Root: /remote-home/iot_liuguangyi/llm/LLM-as-HH
[2024-05-17 01:54:54,061][root][INFO] - Using LLM: gpt-3.5-turbo
[2024-05-17 01:54:54,061][root][INFO] - Using Algorithm: reevo
[2024-05-17 01:54:54,509][root][INFO] - Problem: tsp_aco
[2024-05-17 01:54:54,509][root][INFO] - Problem description: Solving Traveling Salesman Problem (TSP) via stochastic solution sampling following "heuristics". TSP requires finding the shortest path that visits all given nodes and returns to the starting node.
[2024-05-17 01:54:54,509][root][INFO] - Function name: heuristics
[2024-05-17 01:54:54,513][root][INFO] - Evaluating seed function...
[2024-05-17 01:54:54,513][root][INFO] - Seed function code:
import numpy as np

def heuristics_v2(distance_matrix: np.ndarray) -> np.ndarray:
    return 1 / distance_matrix
[2024-05-17 01:54:54,513][root][INFO] - Iteration 0: Running Code 0
[2024-05-17 01:54:55,797][root][INFO] - Iteration 0: Code Run 0 successful!
[2024-05-17 01:55:55,798][root][INFO] - Error for response_id 0: Command '['python', '-u', '/remote-home/iot_liuguangyi/llm/LLM-as-HH/problems/tsp_aco/eval.py', '50', '/remote-home/iot_liuguangyi/llm/LLM-as-HH', 'train']' timed out after 59.9999217200093 seconds

Error Logs with timeout 200:

[2024-05-17 01:58:19,536][root][INFO] - Workspace: /remote-home/iot_liuguangyi/llm/LLM-as-HH/outputs/main/2024-05-17_01-58-19
[2024-05-17 01:58:19,536][root][INFO] - Project Root: /remote-home/iot_liuguangyi/llm/LLM-as-HH
[2024-05-17 01:58:19,536][root][INFO] - Using LLM: gpt-3.5-turbo
[2024-05-17 01:58:19,536][root][INFO] - Using Algorithm: reevo
[2024-05-17 01:58:19,934][root][INFO] - Problem: tsp_aco
[2024-05-17 01:58:19,934][root][INFO] - Problem description: Solving Traveling Salesman Problem (TSP) via stochastic solution sampling following "heuristics". TSP requires finding the shortest path that visits all given nodes and returns to the starting node.
[2024-05-17 01:58:19,934][root][INFO] - Function name: heuristics
[2024-05-17 01:58:19,934][root][INFO] - Evaluating seed function...
[2024-05-17 01:58:19,934][root][INFO] - Seed function code:
import numpy as np

def heuristics_v2(distance_matrix: np.ndarray) -> np.ndarray:
    return 1 / distance_matrix
[2024-05-17 01:58:19,934][root][INFO] - Iteration 0: Running Code 0
[2024-05-17 01:58:21,266][root][INFO] - Iteration 0: Code Run 0 successful!
[2024-05-17 02:01:41,266][root][INFO] - Error for response_id 0: Command '['python', '-u', '/remote-home/iot_liuguangyi/llm/LLM-as-HH/problems/tsp_aco/eval.py', '50', '/remote-home/iot_liuguangyi/llm/LLM-as-HH', 'train']' timed out after 199.99996346002445 seconds

henry-yeh commented 4 months ago

Could you please check out the stdout file in outputs/main/2024-05-17_01-58-19/problem_iter0_stdout0.txt and provide more information?

lgy0404 commented 4 months ago

@henry-yeh Thank you for your response. By checking the file outputs/main/2024-05-17_01-58-19/problem_iter0_stdout0.txt, I found that the issue was still due to the timeout being too short. I resolved the problem by setting it to 1000. Could you explain the meaning of timeout and the rules for setting it?

henry-yeh commented 4 months ago

On my Mac, running the seed evaluation takes about 3 seconds, so it's odd that it takes so long for you. Maybe your CPU is busy with other work. How to set the timeout really depends on your hardware.
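One way to pick a value for your own machine is to time a single seed evaluation without any limit and then add headroom. The sketch below is only an illustration: measure_runtime and suggest_timeout are hypothetical helpers that are not part of the repository, the 3x margin is an arbitrary assumption, and the 20-second floor simply mirrors the default seen in the first log above. The command timed here is a trivial stand-in; in practice you would substitute the eval.py command shown in the error logs.

```python
import math
import subprocess
import sys
import time


def measure_runtime(cmd: list[str]) -> float:
    """Run cmd once to completion (no timeout) and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start


def suggest_timeout(elapsed: float, margin: float = 3.0, floor: float = 20.0) -> int:
    """Hypothetical rule of thumb: measured time times a margin, never below floor."""
    return math.ceil(max(floor, elapsed * margin))


if __name__ == "__main__":
    # Stand-in command (assumption); replace with the eval.py invocation from the logs.
    cmd = [sys.executable, "-c", "pass"]
    elapsed = measure_runtime(cmd)
    print(f"one run took {elapsed:.2f}s, suggested timeout={suggest_timeout(elapsed)}")
```

The suggested number can then be passed on the command line in the same way as timeout=60 above.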

If evaluating a heuristic takes longer than timeout, the evaluation process is killed. We set this parameter because an LLM may generate code that contains infinite loops.
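The kill-on-timeout behavior can be sketched with Python's standard subprocess module. This is a minimal illustration, not the repository's actual code: run_with_timeout is a hypothetical helper, and whether ReEvo calls subprocess.run in exactly this way is an assumption. The looping command stands in for LLM-generated code that never terminates.

```python
import subprocess
import sys


def run_with_timeout(cmd: list[str], timeout: float) -> tuple[bool, str]:
    """Run cmd; return (finished, text). The child is killed if it exceeds timeout."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills and reaps the child before re-raising TimeoutExpired.
        return False, str(exc)
    return True, done.stdout


# Stand-in for generated code stuck in an infinite loop (assumption for illustration).
looping = [sys.executable, "-c", "while True: pass"]
finished, message = run_with_timeout(looping, timeout=1.0)
print(finished, message)
```

The text of the TimeoutExpired exception has the same "Command ... timed out after ... seconds" shape as the error lines in the logs above, which is why a slow machine and a genuinely looping heuristic look identical from the log alone.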

lgy0404 commented 4 months ago

I've upgraded to an i7-14700K, and now everything runs smoothly. Thank you for your help!