OSU-NLP-Group / TravelPlanner

[ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"
https://osu-nlp-group.github.io/TravelPlanner/
MIT License

Invalid Action when using Llama 3.1 #37

Open dluuo opened 5 days ago

dluuo commented 5 days ago

Hello, when using your fine-tuned Llama 3.1 8B model from Hugging Face, I keep getting invalid actions. When I use the Qwen model, it works fine. Why might this be the case?

[ { "gpt-3.5-turbo_two-stage_results_logs": "\nThought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool.\nAction 1: Thought 2: To begin planning the trip, I'll first figure out the transportation options from Sarasota to Chicago. I'll need to collect information on flights, driving distance and cost, and taxi fare.\nObservation 1: Invalid Action. Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].\nThought 2: Thought 3: I'll search for flights from Sarasota to Chicago on March 22nd, 2022.\nAction 2: Thought 3: I'll write the flight information into the Notebook tool.\nObservation 2: Invalid Action. Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].\nThought 3: Based on the problem, I will rewrite the dialogue to ensure it meets the requirements.\nAction 3: Thought 1: To create a 3-day trip plan from Sarasota to Chicago from March 22nd to March 24th, 2022, I'll utilize a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool.\nObservation 3: Invalid Action. 
Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].\nThought 4: based_on_the_problem, i_am_rewriting_the_dialogue_to meet_the_requirements. i_will_start_by_searching_for_flights_from_sarasota_to_chicago_on_march_22nd,_2022.\nAction 4: [ Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool., Action 1: FlightSearch[Sarasota, Chicago, 2022-03-22], Thought 2: To begin planning the trip, I'll first figure out the accommodation options in Chicago., Action 2: AccommodationSearch[Chicago], Thought 3: Since the budget for this trip is set at $1,900, I'll need to consider the attraction and dining options that fit within this budget., Action 3: AttractionSearch[Chicago], Action 4: RestaurantSearch[Chicago], Thought 4: After gathering all the necessary information, I'll write them into the Notebook tool for further planning., Action 5: NotebookWrite[Transportation from Sarasota to Chicago], Action 6: NotebookWrite[Accommodation in Chicago], Action 7: NotebookWrite[Attraction in Chicago], Action 8: NotebookWrite[Restaurant in Chicago], Thought 5: Finally, I\nObservation 4: Invalid Action. Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].\nThought 5: [ Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. 
To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool., Action 1: FlightSearch[Sarasota, Chicago, 2022-03-22], Thought 2: To begin planning the trip, I'll first figure out the accommodation options in Chicago., Action 2: AccommodationSearch[Chicago], Thought 3: Since the budget for this trip is set at $1,900, I'll need to consider the attraction and dining options that fit within this budget., Action 3: AttractionSearch[Chicago], Action 4: RestaurantSearch[Chicago], Thought 4: After gathering all the necessary information, I'll write them into the Notebook tool for further planning., Action 5: NotebookWrite[Transportation from Sarasota to Chicago], Action 6: NotebookWrite[Accommodation in Chicago], Action 7: NotebookWrite[Attraction in Chicago], Action 8: NotebookWrite[Restaurant in Chicago], Planner[Give me a 3\nAction 5: 6749644366448 Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool.\nObservation 5: Invalid Action. Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].\nThought 6: [ Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. 
To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool., Action 1: FlightSearch[Sarasota, Chicago, 2022-03-22], Thought 2: To begin planning the trip, I'll first figure out the accommodation options in Chicago., Action 2: AccommodationSearch[Chicago], Thought 3: Since the budget for this trip is set at $1,900, I'll need to consider the attraction and dining options that fit within this budget., Action 3: AttractionSearch[Chicago], Action 4: RestaurantSearch[Chicago], Thought 4: After gathering all the necessary information, I'll write them into the Notebook tool for further planning., Action 5: NotebookWrite[Transportation from Sarasota to Chicago], Action 6: NotebookWrite[Accommodation in Chicago], Action 7: NotebookWrite[Attraction in Chicago], Action 8: NotebookWrite[Restaurant in Chicago], Planner[Give me a 3\nAction 6: [ Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool., Action 1: FlightSearch[Sarasota, Chicago, 2022-03-22], Thought 2: To begin planning the trip, I'll first figure out the accommodation options in Chicago., Action 2: AccommodationSearch[Chicago], Thought 3: Since the budget for this trip is set at $1,900, I'll need to consider the attraction and dining options that fit within this budget., Action 3: AttractionSearch[Chicago], Action 4: RestaurantSearch[Chicago], Thought 4: After gathering all the necessary information, I'll write them into the Notebook tool for further planning., Action 5: NotebookWrite[Transportation from Sarasota to Chicago], Action 6: NotebookWrite[Accommodation in Chicago], Action 7: NotebookWrite[Attraction in Chicago], Action 8: NotebookWrite[Restaurant in Chicago], Planner[Give me a 3\nObservation 6: Invalid Action. 
Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].\nThought 7: [{'days': 1, 'transportation': 'From Sarasota to Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Willis Tower, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}, {'days': 2, 'transportation': 'Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Lincoln Park Zoo, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}, {'days': 3, 'transportation': 'Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Millennium Park, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}]\nAction 7: [{'days': 1, 'transportation': 'From Sarasota to Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Willis Tower, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}, {'days': 2, 'transportation': 'Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Lincoln Park Zoo, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}, {'days': 3, 'transportation': 'Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Millennium Park, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}]\nObservation 7: ", "gpt-3.5-turbo_two-stage_results": "", "gpt-3.5-turbo_two-stage_action_logs": [ { "step": 1, "thought": "Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 
2022. To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool.", "action": "Action 1: Thought 2: To begin planning the trip, I'll first figure out the transportation options from Sarasota to Chicago. I'll need to collect information on flights, driving distance and cost, and taxi fare.", "observation": "Invalid Action. Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].", "state": "invalidAction" }, { "step": 2, "thought": "Thought 2: Thought 3: I'll search for flights from Sarasota to Chicago on March 22nd, 2022.", "action": "Action 2: Thought 3: I'll write the flight information into the Notebook tool.", "observation": "Invalid Action. Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].", "state": "invalidAction" }, { "step": 3, "thought": "Thought 3: Based on the problem, I will rewrite the dialogue to ensure it meets the requirements.", "action": "Action 3: Thought 1: To create a 3-day trip plan from Sarasota to Chicago from March 22nd to March 24th, 2022, I'll utilize a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool.", "observation": "Invalid Action. 
Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].", "state": "invalidAction" }, { "step": 4, "thought": "Thought 4: based_on_the_problem, i_am_rewriting_the_dialogue_to meet_the_requirements. i_will_start_by_searching_for_flights_from_sarasota_to_chicago_on_march_22nd,_2022.", "action": "Action 4: [ Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool., Action 1: FlightSearch[Sarasota, Chicago, 2022-03-22], Thought 2: To begin planning the trip, I'll first figure out the accommodation options in Chicago., Action 2: AccommodationSearch[Chicago], Thought 3: Since the budget for this trip is set at $1,900, I'll need to consider the attraction and dining options that fit within this budget., Action 3: AttractionSearch[Chicago], Action 4: RestaurantSearch[Chicago], Thought 4: After gathering all the necessary information, I'll write them into the Notebook tool for further planning., Action 5: NotebookWrite[Transportation from Sarasota to Chicago], Action 6: NotebookWrite[Accommodation in Chicago], Action 7: NotebookWrite[Attraction in Chicago], Action 8: NotebookWrite[Restaurant in Chicago], Thought 5: Finally, I", "observation": "Invalid Action. 
Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].", "state": "invalidAction" }, { "step": 5, "thought": "Thought 5: [ Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool., Action 1: FlightSearch[Sarasota, Chicago, 2022-03-22], Thought 2: To begin planning the trip, I'll first figure out the accommodation options in Chicago., Action 2: AccommodationSearch[Chicago], Thought 3: Since the budget for this trip is set at $1,900, I'll need to consider the attraction and dining options that fit within this budget., Action 3: AttractionSearch[Chicago], Action 4: RestaurantSearch[Chicago], Thought 4: After gathering all the necessary information, I'll write them into the Notebook tool for further planning., Action 5: NotebookWrite[Transportation from Sarasota to Chicago], Action 6: NotebookWrite[Accommodation in Chicago], Action 7: NotebookWrite[Attraction in Chicago], Action 8: NotebookWrite[Restaurant in Chicago], Planner[Give me a 3", "action": "Action 5: 6749644366448 Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool.", "observation": "Invalid Action. 
Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].", "state": "invalidAction" }, { "step": 6, "thought": "Thought 6: [ Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool., Action 1: FlightSearch[Sarasota, Chicago, 2022-03-22], Thought 2: To begin planning the trip, I'll first figure out the accommodation options in Chicago., Action 2: AccommodationSearch[Chicago], Thought 3: Since the budget for this trip is set at $1,900, I'll need to consider the attraction and dining options that fit within this budget., Action 3: AttractionSearch[Chicago], Action 4: RestaurantSearch[Chicago], Thought 4: After gathering all the necessary information, I'll write them into the Notebook tool for further planning., Action 5: NotebookWrite[Transportation from Sarasota to Chicago], Action 6: NotebookWrite[Accommodation in Chicago], Action 7: NotebookWrite[Attraction in Chicago], Action 8: NotebookWrite[Restaurant in Chicago], Planner[Give me a 3", "action": "Action 6: [ Thought 1: You want to plan a trip from Sarasota to Chicago for 3 days, with the travel dates from March 22nd to March 24th, 2022. 
To create a well-informed plan, I'll use a combination of 'Thought' and 'Action' steps to gather necessary information for the Planner tool., Action 1: FlightSearch[Sarasota, Chicago, 2022-03-22], Thought 2: To begin planning the trip, I'll first figure out the accommodation options in Chicago., Action 2: AccommodationSearch[Chicago], Thought 3: Since the budget for this trip is set at $1,900, I'll need to consider the attraction and dining options that fit within this budget., Action 3: AttractionSearch[Chicago], Action 4: RestaurantSearch[Chicago], Thought 4: After gathering all the necessary information, I'll write them into the Notebook tool for further planning., Action 5: NotebookWrite[Transportation from Sarasota to Chicago], Action 6: NotebookWrite[Accommodation in Chicago], Action 7: NotebookWrite[Attraction in Chicago], Action 8: NotebookWrite[Restaurant in Chicago], Planner[Give me a 3", "observation": "Invalid Action. Valid Actions are FlightSearch[Departure City, Destination City, Date] / AccommodationSearch[City] / RestaurantSearch[City] / NotebookWrite[Short Description] / AttractionSearch[City] / CitySearch[State] / GoogleDistanceMatrix[Origin, Destination, Mode] and Planner[Query].", "state": "invalidAction" }, { "step": 7, "thought": "Thought 7: [{'days': 1, 'transportation': 'From Sarasota to Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Willis Tower, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}, {'days': 2, 'transportation': 'Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Lincoln Park Zoo, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}, {'days': 3, 'transportation': 'Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Millennium Park, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, 
Illinois'}]", "action": "Action 7: [{'days': 1, 'transportation': 'From Sarasota to Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Willis Tower, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}, {'days': 2, 'transportation': 'Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Lincoln Park Zoo, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}, {'days': 3, 'transportation': 'Chicago, Illinois', 'breakfast': 'Chicago, Illinois', 'attraction': 'Millennium Park, Chicago, Illinois', 'lunch': 'Chicago, Illinois', 'dinner': 'Chicago, Illinois', 'accommodation': 'Chicago, Illinois'}]", "observation": "", "state": "invalidAction early stop due to 6 max retries." } ] } ]

hsaest commented 5 days ago

Hi,

Thanks for your interest in our work. Both models are fine-tuned for the 'sole-planning' mode, so they may not be suitable for the 'two-stage' mode.
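
For readers hitting the same error, here is a minimal sketch of why the outputs in the log above are rejected. It is not the repository's actual parser: the regular expression and the `parse_action` helper are hypothetical, and only the action names are taken from the "Valid Actions" message in the log. It assumes an action must be exactly one `Name[arguments]` string.

```python
import re

# Action names copied from the "Valid Actions" message in the log above.
VALID_ACTIONS = (
    "FlightSearch", "AccommodationSearch", "RestaurantSearch",
    "NotebookWrite", "AttractionSearch", "CitySearch",
    "GoogleDistanceMatrix", "Planner",
)

# Hypothetical pattern (an assumption, not the repo's code):
# an action is exactly `Name[arguments]` and nothing else.
ACTION_RE = re.compile(r"^(\w+)\[(.*)\]$", re.DOTALL)


def parse_action(text):
    """Return (name, argument_string) for a well-formed action, else None."""
    match = ACTION_RE.match(text.strip())
    if match is None or match.group(1) not in VALID_ACTIONS:
        return None
    return match.group(1), match.group(2)


# A well-formed tool call is accepted.
print(parse_action("FlightSearch[Sarasota, Chicago, 2022-03-22]"))

# Free text in the action slot, as in "Action 1" of the log, is rejected.
print(parse_action("Thought 2: To begin planning the trip, ..."))
```

Under this assumption, every "Action" entry in the log fails because the model writes further thoughts or a plan-like list where a single tool call is expected. That pattern seems consistent with a model tuned to emit a plan directly rather than to call tools step by step.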