Keep us posted on this work. I have a couple of projects that have really long FAQs and still don't answer all the questions. This would help. (Which gets me to thinking - Many people don't read carefully. The answers are on the sites. Will this reduce the importance of reading?)
@grantdunham -- i think it has great potential for software docs.
i use chatgpt now for that, when a product is popular enough that it has all the data about it.
i've become a much better MySQL developer as a result. i wish it had been around when I was developing ThinkTank or Frontier. I want to get FeedLand into this flow.
Other questions I've asked, honestly, it got less right. For example, I asked what I thought was a very easy question -- who did Dave support for president in 2016? It didn't have any idea. How is that possible? I wrote so much about it.
Interesting. I tried multiple variations on that question to see what the different results were:
Of these, the "Did Dave vote for Donald Trump or Hillary Clinton in the 2016 Presidential Election?" provided the best answer IMO:
There is sufficient evidence from Dave Winer's writings that he expressed his support for Hillary Clinton in the 2016 U.S. Presidential Election. On October 29, 2016 he clearly stated, "In any case, I'm voting for Hillary. If they had early voting in New York, I would have already voted. Nothing is going to change my vote." Furthermore, in a post on November 5, 2016, he confirmed his support for Clinton and argued that it was the "sane thing to do". He ended this post revealing that he intended to vote on Election Day and had donated to Clinton’s campaign. Throughout the election period, Dave also consistently criticized Trump and voiced his concern about the potential consequences if Trump won.
I bet this has to do with this part of the pre-prompt behind the scenes:
If the provided content doesn't answer the initial question, explain that you don't know.
Combining that instruction with some of the shorter questions might have introduced enough uncertainty for it to decline to give a definitive answer.
I am curious why each of my questions did not include a link to your October 30, 2016, post where you clearly state that you voted for Hillary Clinton: http://scripting.com/2016/10/30/imOkayWithTrumpWinning.html
The way the current implementation works is:

1. The question is used for a keyword search in Elasticsearch against the indexed Scripting News archive.
2. The full content of the matching posts is passed to the OpenAI API, along with a pre-prompt and the original question.
3. OpenAI generates the answer based only on that provided content.
So it is clear that something about these questions did not return the relevant posts in Elasticsearch to be provided to OpenAI. I'll take a look and see if I can figure out why.
@cagrimmett -- thanks for taking the time to post such a detailed answer. for me the best part is where you explain how the current implementation works. i gather from the response that you're hooking into OpenAI and are not hosting the software and database at Automattic? For me this stuff is a black box, i have no idea what's going on behind the UI of ChatGPT, which is kind of nice, but want to know how to think about this.
anyway, i don't have a record of what question i asked initially. i'm going to be more careful about that.
i did the query just now as i think i would have asked it. here's the result in a screenshot.
The way the current implementation works is:…
Interesting, so basically you've got a keyword search whose results are used for prompt context for the LLM along with the pre-prompt to generate the response? How do you extract keywords from the prompt to do the search on Elasticsearch? Are you passing the text content of the search results to OpenAI? Or the URLs for OpenAI to pull content from itself?
Assuming you can share, what else does the pre-prompt contain besides "If the provided content doesn't answer the initial question, explain that you don't know."? (For Dave: The pre-prompt can have a big influence on both the content and the tone of the chat response.)
Hi, @jsavin!
Interesting, so basically you've got a keyword search whose results are used for prompt context for the LLM along with the pre-prompt to generate the response? How do you extract keywords from the prompt to do the search on Elasticsearch? Are you passing the text content of the search results to OpenAI? Or the URLs for OpenAI to pull content from itself?
One step I forgot to add above: The user input is submitted to OpenAI first, which generates 3 search terms from the input, and those are used for the keyword search in Elasticsearch. Elasticsearch passes back the full content for 5 URLs that match the search terms, then we concatenate the content and pass it to the OpenAI API (using the gpt-4 model) along with a prompt and the original user-submitted question.
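To make the flow concrete, here is a rough Python sketch of those steps. The endpoint, index name, field names, and the exact wording of the search-term prompt are placeholders for illustration, not our actual code:

```python
# Rough sketch of the flow above. The endpoint, index name, field names, and
# the wording of the search-term prompt are placeholders, not our actual code.
from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch("http://localhost:9200")  # hypothetical endpoint
ai = OpenAI()

PRE_PROMPT = "..."  # the full prompt is quoted below

def answer(question: str) -> str:
    # 1. Ask the model to turn the user's input into three search terms.
    terms = ai.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"Generate 3 comma-separated search terms for: {question}",
        }],
    ).choices[0].message.content

    # 2. Keyword search in Elasticsearch, keeping the top 5 matching posts.
    hits = es.search(
        index="scripting-news",  # placeholder index name
        query={"multi_match": {"query": terms, "fields": ["title", "content"]}},
        size=5,
    )["hits"]["hits"]

    # 3. Concatenate the full content (and URL) of each matching post.
    content = "\n\n".join(
        f"{h['_source']['url']}\n{h['_source']['content']}" for h in hits
    )

    # 4. Send the pre-prompt, the concatenated content, and the original
    #    question to gpt-4 for the final answer.
    return ai.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": PRE_PROMPT},
            {"role": "user", "content": f"{content}\n\nQuestion: {question}"},
        ],
    ).choices[0].message.content
```

The interesting knob is the search-term generation in step 1, since everything downstream is constrained by what those terms match.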
Assuming you can share, what else does the pre-prompt contain besides "If the provided content doesn't answer the initial question, explain that you don't know."? (For Dave: The pre-prompt can have a big influence on both the content and the tone of the chat response.)
Here is the prompt:
You are a search bot on the site of software developer and writer Dave Winer.
Your job is to answer the question or query based on the initial question, content, and URLs provided below, all from Dave's writings.
Do this by reporting on and summarizing Dave's writings, in markdown format, without any additional information not provided in the content.
Link to the provided URLs, and only to those URLs, where appropriate, within the answer.
Choose only appropriate words or phrases within the answer for link text, and don't use document names as link text.
Never use superscript citations, superscript referencing, footnote-style, or superscript notations.
Avoid lists.
Answer in one to three paragraphs.
If the provided content doesn't answer the initial question, explain that you don't know.
Answer in Dave's style and tone, but never answer as Dave himself.
We are very much open to improving this prompt if you have suggestions!
Among the improvements we are considering: testing embeddings/semantic search in place of the plain keyword search, so retrieval matches on meaning rather than exact terms.
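As a minimal sketch of that direction (the model name, data layout, and in-memory scoring here are illustrative, not a settled design):

```python
# Illustrative sketch of embeddings-based retrieval. The model name, data
# layout, and in-memory scoring are placeholders, not a settled design.
import numpy as np
from openai import OpenAI

ai = OpenAI()

def embed(text: str) -> np.ndarray:
    resp = ai.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(resp.data[0].embedding)

def top_posts(question: str, posts: list[dict], k: int = 5) -> list[dict]:
    # Rank posts by cosine similarity between each post's embedding and the
    # question's. In practice the post embeddings would be precomputed once
    # and stored in an index, not recomputed per query.
    q = embed(question)

    def score(post: dict) -> float:
        v = np.array(post["embedding"])
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))

    return sorted(posts, key=score, reverse=True)[:k]
```

The appeal is that a question like "who did Dave support for president in 2016?" could match posts that never use those exact words.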
Thanks for the details, @cagrimmett :-)
I'll admit I'm not an expert when it comes to the capabilities of the specific models, but I've done some experimentation with ChatGPT using both the GPT-3.5 and GPT-4 models, with and without plugins enabled, so I have a general feel for the types of things they do really well and the things they struggle with.
My general impression with just a little experimentation is that the result tends towards "I don't know" or "I can't be certain" type responses more than I would expect. The first thing I would look at is what search terms are being generated by the LLM in the first call to OpenAI, which are then passed to Elasticsearch. The summaries you get will be largely constrained by the results of that search, and therefore by the search terms themselves, given that you're instructing the model to generate summaries based only on the content from the search results against Dave's indexed writing.
It's too bad we don't know what data is in the model training corpus for either LLM. ;-) I've been surprised by how much the models already "know" about some well-known people and their artifacts without any constraints provided. It may be that GPT knows more about Dave than it can express when its responses are restricted to "reporting on and summarizing Dave's writings, … without any additional information not provided in the content". I wonder if that prompt is over-constraining the responses. It might be worth experimenting with asking GPT to take a multi-step approach: first generate a summary based on whatever is known about Dave generally, then as a second step edit its own summary against Dave's own writings.
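To sketch what I mean (the prompts and helper here are hypothetical, just to show the shape of the two calls):

```python
# Hypothetical sketch of the two-step idea -- not the current implementation.
from openai import OpenAI

ai = OpenAI()

def two_step_answer(question: str, daves_content: str) -> str:
    # Step 1: draft an answer from whatever the model already "knows"
    # about Dave from its training data, with no content constraint.
    draft = ai.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"From what you know about Dave Winer generally, answer: {question}",
        }],
    ).choices[0].message.content

    # Step 2: have the model edit its own draft against the retrieved
    # excerpts from Dave's writings, so the indexed content corrects or
    # confirms the general-knowledge answer rather than being the only source.
    return ai.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (
                f"Here is a draft answer to the question \"{question}\":\n\n{draft}\n\n"
                "Revise the draft so it is consistent with these excerpts from "
                f"Dave's writings, and note anything they contradict:\n\n{daves_content}"
            ),
        }],
    ).choices[0].message.content
```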
The idea about testing embeddings/semantic search seems promising, but is beyond my expertise. Definitely interested to hear what you learn with that approach.
Background
As you may know, a few months ago we tried creating a chatbot just for Scripting News stuff, and the results were less than amazing, so I decided to wait for another chance to try it out, and one came along a few weeks ago.
I've been working on a project with Automattic involving feeds and social media, and in a casual conversation a few weeks ago, it came up that they had been working with Seth Godin on a chatbot for his blog. By coincidence, I was talking with Seth about exactly this topic and was aware that he had worked out something, but I didn't know with whom.
Everyone was very excited! So we decided to give it a whirl with the Scripting News archives in OPML, which they were able to download from GitHub.
And now we have something to try out. Here it is --

https://a8c.scripting.com/
An example
Here's an example of a question I asked: What does dave think about Tesla?
This is what came back.
My comments
I have a lot to say about this.
Other questions I've asked, honestly, it got less right. For example, I asked what I thought was a very easy question -- who did Dave support for president in 2016? It didn't have any idea. How is that possible? I wrote so much about it.
But -- Automattic is an excellent development company, and this is just the beginning of what I hope will be a long experiment. I want this to work, as much as I wanted personal computer networks to work in the 80s and 90s, and now they do. We will get there. Will it be in time to save humanity? Not so sure about that. ;-)
I encourage everyone to try this, and give feedback. Let's play around, have fun, see if you can get me saying really stupid things (I'm sure that's no problem).
And thanks to the good folk at Automattic for coming out to play. ;-)