Open lastrosade opened 5 months ago
In case like me anyone else isn't familiar with that acronym: https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
I experimented with this by allowing for parameters to be part of all URLs, and then telling llama3 to append two of them. It works very well:
Put this into the prompt for page generation (and for search results too, or make it a system prompt). Of course, you will also have to alter the URL parsing a little so that the parameters are extracted, kept for further use, and stripped from the URL itself.
Generate a webpage from the fictional site of '{url}' at the resource path of '{path}' with parameters: '{params}'. Make sure all links generated either link to an external website, or if they link to another resource on the current website, they have the current url prepended ({url}) to them. Append the parameter '?&description=(short summary of the linked webpage here that describes the content or purpose)' to all generated URLs. Also append the parameter '&previous-webpage=(short summary of the current website that the link appears on)' as final parameter. These parameters help you to figure out what to generate, so you must generate them on each link. If there are other parameters needed, make sure to combine them. Here is an example of a finished link: '<a href=\"http://www.flower-website.com/?parameter1=cart&description=Shopping cart of the town's best flower shop website&previous-webpage=Merchant directory, flower shop subpage\">Link title here</a>' Update the previous-webpage parameter to match the currently generated webpage.
That way you also get pages generated that roughly match what the fake search engine spits out, and they are thematically grouped.
Probably only a band-aid and could be implemented in a more elegant way, but it is easy enough to do like this.
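For anyone wondering what the URL parsing change might look like, here is a minimal sketch using only the standard library. The helper name `split_hint_params` is made up for illustration and is not part of the project; only the two parameter names come from the prompt above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameter names taken from the prompt above; everything else is hypothetical.
HINT_PARAMS = ("description", "previous-webpage")

def split_hint_params(raw_url):
    """Return (clean_url, hints): the URL without the hint parameters,
    plus a dict holding the hint values for use in the next prompt."""
    parts = urlsplit(raw_url)
    hints = {}
    kept = []
    # parse_qsl skips empty pieces, so a leading "?&" is harmless.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key in HINT_PARAMS:
            hints[key] = value
        else:
            kept.append((key, value))
    clean_url = urlunsplit(parts._replace(query=urlencode(kept)))
    return clean_url, hints
```

The hints can then be passed into the '{params}' slot of the prompt, while the cleaned URL is used for routing and caching.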
Great idea. I added it and it helps a lot with coherence. However, I now see even more of the problem where the generated links on the following pages do not have 127.0.0.1:5000 prepended to them. Is there a way to fix this so that all links get that prefix?
@scalar27 🤔 I just serve the thing on localhost on port 80 and thereby no port is required and links just all work without any hassle.
You can tell Flask which port to use in main.py, like so:
if __name__ == "__main__":
    app.run(host='127.0.0.1', port=80, debug=False)
    print(engine.export_internet())
Alternatively, I suppose you could append the port to all hrefs by modifying the _format_page function in the ReaperEngine.py file to include that.
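A rough sketch of that href rewrite, as a standalone function rather than a patch to _format_page (whose actual signature I have not checked). The function name `add_local_prefix` and the assumed link layout `http://host:port/<site>/<path>` are guesses for illustration, so adjust them to whatever the app's routes really expect.

```python
import re

def add_local_prefix(html, host="127.0.0.1", port=5000):
    """Prefix every href that lacks the local server address with
    http://host:port/ (assumed layout, see the note above)."""
    prefix = f"http://{host}:{port}/"

    def fix(match):
        quote, target = match.group(1), match.group(2)
        # Leave links that are already local, plus anchors and mail links.
        if target.startswith(prefix) or target.startswith(("#", "mailto:")):
            return match.group(0)
        # Drop the scheme so "http://example.com/x" becomes "example.com/x".
        target = re.sub(r"^https?://", "", target).lstrip("/")
        return f"href={quote}{prefix}{target}{quote}"

    # Match href="..." or href='...' and rewrite the target in place.
    return re.sub(r'href=(["\'])(.*?)\1', fix, html)
```

Note that a bare relative link such as "/about" would lose its site name with this approach, which is why the prompt asks the model to prepend the current URL to internal links.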
When generating the URLs, generate a website description and use that description to guide the generation of the web page. Consider using GBNF for this.
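To make the GBNF suggestion a bit more concrete, here is a minimal sketch of a grammar that forces the model to emit each link as a JSON object with a URL and a description. The field names "url" and "description" and the helper `parse_link` are assumptions for illustration, not something the project defines, and how the grammar is handed to the model depends on the backend in use.

```python
import json

# GBNF grammar (llama.cpp format) that constrains the output to a single
# JSON object with exactly two string fields. Field names are assumptions.
LINK_GRAMMAR = r'''
root   ::= "{" ws "\"url\":" ws string "," ws "\"description\":" ws string ws "}"
string ::= "\"" [^"\\\x7F\x00-\x1F]* "\""
ws     ::= [ \t\n]*
'''

def parse_link(output):
    """Parse one grammar-constrained model output into (url, description)."""
    data = json.loads(output)
    return data["url"], data["description"]
```

The description returned here could then be fed into the page-generation prompt for that URL, which is the behavior this issue asks for.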