verba-ex-machina / Mila

Mila: The Mindful, Interactive Lifestyle Assistant
MIT License

FFI: Attention to detail, task completion. #42

Open haxys opened 11 months ago

haxys commented 11 months ago

Mila's not taking adequate steps to complete the given tasks.

Here I ask for the headlines:

2023-11-27 07:46:08,459 - Mila - INFO - Message received from 012345678909876543 (Kristen): What are today’s news headlines?
2023-11-27 07:46:11,884 - Mila - INFO - Function called: search_duckduckgo(query='today's news headlines')

Mila's response:

Today's news headlines can be found on reputable news websites such as:

CNN: View the latest news and breaking news today for U.S., world, weather, entertainment, politics, and health at CNN.com.
NBC News: Get breaking news and top stories on the latest world, US, and local news.
ABC News: Stay updated with breaking news, latest news, and videos on ABC News.
NPR: Access U.S. and world news headlines on NPR.
Google News: Explore the latest news on Google News.

Feel free to visit these websites to read the latest news and stay informed. If there's anything else I can assist you with, please let me know!

Essentially, Mila replied: LMGTFY.

I asked again, after clarifying that I didn't just want search results:

2023-11-27 07:46:55,573 - Mila - INFO - Message received from 012345678909876543 (Kristen): I didn't ask for links to news websites. What are today's headlines?
2023-11-27 07:46:59,686 - Mila - INFO - Function called: scrape_url(url='https://www.cnn.com/')

This time Mila gave a reasonable answer:

I've retrieved the latest headlines from CNN:

Israel and Hamas express interest in extending truce
Suspect arrested in shooting of 3 Palestinian college students in Vermont.
Israel’s president: Hamas’ ‘passive-aggressive’ approach is an attempt to drive Israel ‘crazy’.
Here’s Merriam-Webster’s 2023 word of the year.
15-year-old Pennsylvania boy faces first-degree murder charge in shooting of a homeless man.
Jalen Hurts has five-TD game in OT thriller while Ravens keep soaring: Everything to know about Sunday’s NFL games.

If there's anything else I can assist you with, please let me know!

It would have been nice for Mila to link to each of the stories, but this was at least what I'd asked for.
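One way the links could be preserved is for the scraper to return each anchor's text together with its absolute URL, so the model has real story URLs to cite. The following is only a rough sketch: I'm assuming the scraper has access to the raw HTML, and `LinkCollector` and `extract_links` are hypothetical names, not anything that exists in Mila today.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects (link text, absolute URL) pairs from <a href=...> tags.

    Hypothetical helper for illustration; not part of Mila's codebase.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            text = " ".join("".join(self._text).split())
            if text:
                self.links.append((text, self._href))
            self._href = None


def extract_links(html, base_url):
    """Return a list of (text, url) tuples for every linked anchor in html."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

Returning the pairs alongside the page text would let Mila answer with "headline (URL)" lines instead of bare headlines.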

Next I asked for more specific information:

2023-11-27 07:48:23,625 - Mila - INFO - Message received from 012345678909876543 (Kristen): Please summarize the latest developments in the Israel/Hamas conflict.
2023-11-27 07:48:27,087 - Mila - INFO - Function called: search_duckduckgo(query='latest developments in Israel Hamas conflict')

Mila did another LMGTFY, but this one included relevant information:

The latest developments in the Israel/Hamas conflict include the release of hostages held by Hamas, ongoing battles between Israel and Hamas, and the rising death toll and casualties on both sides. For more detailed information, you can read live updates from CNBC and other news sources. You can find the latest updates on the Israel-Hamas conflict on CNBC's live blog tracking developments on the conflict.

I asked Mila to retrieve and summarize that live blog:

2023-11-27 07:49:13,259 - Mila - INFO - Message received from 012345678909876543 (Kristen): Retrieve the blog and summarize it.
2023-11-27 07:49:15,883 - Mila - INFO - Function called: scrape_url(url='https://www.cnbc.com/israel-hamas-conflict-live-updates/')

This is where things went south. Instead of finding the correct URL for the page, Mila just hallucinated a URL that doesn't exist, then replied:

I encountered an issue retrieving the live blog from CNBC regarding the Israel-Hamas conflict. I recommend visiting the CNBC website directly to access and summarize the latest live updates on the conflict. If you have any other requests or if you encounter any issues, feel free to let me know.

I gave Mila another shot.

2023-11-27 07:49:36,366 - Mila - INFO - Message received from 012345678909876543 (Kristen): Please try again.
[...]
2023-11-27 07:49:39,932 - Mila - INFO - Function called: scrape_url(url='https://www.cnbc.com/2021/05/18/israel-hamas-conflict-updates.html')

Mila hallucinated another URL, and gave the same reply:

I encountered an issue retrieving the live blog from CNBC regarding the Israel-Hamas conflict. I recommend visiting the CNBC website directly to access and summarize the latest live updates on the conflict. If you have any other requests or if you encounter any issues, feel free to let me know.

It's clear that Mila's instructions are inadequate; we need to emphasize taking whatever steps are necessary to complete the implied task, rather than making a "minimum effort" attempt.
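Separately from the prompt wording, the hallucinated URLs might be catchable in code. Here is a minimal sketch of one possible guard: it remembers every URL that has appeared in earlier tool output and refuses to scrape anything else, returning a message that nudges the model to search again. `UrlGuard` and `guarded_scrape` are hypothetical names, and the `scrape` argument stands in for whatever function actually fetches the page; none of this reflects how Mila is currently implemented.

```python
import re

# Matches http(s) URLs up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")

# Punctuation that commonly trails a URL in prose and is not part of it.
TRAILING_PUNCTUATION = ".,;:!?)]}'\""


class UrlGuard:
    """Tracks URLs seen in tool output (hypothetical helper, not Mila's API)."""

    def __init__(self):
        self.seen = set()

    def record(self, tool_output):
        """Remember every URL that appears in a tool's text output."""
        for url in URL_PATTERN.findall(tool_output):
            self.seen.add(url.rstrip(TRAILING_PUNCTUATION))

    def check(self, url):
        """Return True only if this exact URL was seen earlier."""
        return url in self.seen


def guarded_scrape(guard, url, scrape):
    """Call scrape(url) only for URLs the guard has already seen.

    `scrape` is an assumed stand-in for the real page-fetching function.
    """
    if not guard.check(url):
        return ("Refused: this URL did not appear in any earlier search result. "
                "Search again and pick a URL from the results.")
    return scrape(url)
```

In practice the guard would probably also need to record URLs from the user's own messages, so that a link the user pastes in is still allowed.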

haxys commented 11 months ago

My wife just said "Mila's 'quiet quitting.'" 🤣