Closed giubaru closed 7 months ago
Am I missing something or is the desired format and actual format identical?
No, because it is literally returning the JSON with the single quotes.
Ah so the ` marks are in the response. Got it. Probably an easy thing to check for.
gpt.py already has a failsafe that takes care of incorrectly formatted replies from chatgpt.
print(colored("[*] GPT returned an unformatted response. Attempting to clean...", "yellow"))
# Attempt to extract list-like string and convert to list
match = re.search(r'\["(?:[^"\\]|\\.)*"(?:,\s*"[^"\\]*")*\]', response)
if match:
    try:
        search_terms = json.loads(match.group())
    except json.JSONDecodeError:
        print(colored("[-] Could not parse response.", "red"))
        return []
Unless this "bug" prevents the script from running successfully, it's not an issue.
Yes, but in this case the regular expression is not working:
Take a look here:
import json
import re

def get_search_terms() -> list[str]:
    response = '''\
```json
[
"Supreme Court ruling",
"Presidential immunity",
"January 6 insurrection",
"Judicial scrutiny",
"High court decision",
"Oval Office"
]```'''
    # Parse response into a list of search terms
    search_terms = []
    try:
        search_terms = json.loads(response)
        if not isinstance(search_terms, list) or not all(isinstance(term, str) for term in search_terms):
            raise ValueError("Response is not a list of strings.")
    except (json.JSONDecodeError, ValueError):
        print("[*] GPT returned an unformatted response. Attempting to clean...")
        # Attempt to extract list-like string and convert to list
        match = re.search(r'\["(?:[^"\\]|\\.)*"(?:,\s*"[^"\\]*")*\]', response)
        if match:
            try:
                search_terms = json.loads(match.group())
            except json.JSONDecodeError:
                print("[-] Could not parse response.")
                return []
    # Let user know
    print(f"\nGenerated {len(search_terms)} search terms: {', '.join(search_terms)}")
    # Return search terms
    return search_terms

print(get_search_terms())
This regex should fix it in this particular case:
match = re.search(r'\[\s*"(?:[^"\\]|\\.)*"(?:,\s*"[^"\\]*")*\s*\]', response)
I don't know whether it breaks the handling of other formatting issues, though. It also keeps the whitespace inside the match; not sure if that causes trouble further down.
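A quick way to compare the two patterns outside the script is sketched below. It is only a check against a shortened version of the sample response from this thread, not a test of gpt.py itself; the names OLD_PATTERN, NEW_PATTERN, SAMPLE, and extract_terms are hypothetical and exist only for this illustration.

```python
import json
import re

# Pattern currently used in the failsafe (copied from this thread).
OLD_PATTERN = r'\["(?:[^"\\]|\\.)*"(?:,\s*"[^"\\]*")*\]'
# Proposed pattern: additionally allows whitespace after "[" and before "]".
NEW_PATTERN = r'\[\s*"(?:[^"\\]|\\.)*"(?:,\s*"[^"\\]*")*\s*\]'

FENCE = "`" * 3  # three backticks, as seen around the GPT reply
# Shortened, hypothetical stand-in for the fenced reply shown above.
SAMPLE = (
    FENCE + "json\n"
    '[\n"Supreme Court ruling",\n"Presidential immunity",\n"Oval Office"\n]'
    + FENCE
)


def extract_terms(pattern: str, response: str) -> list[str]:
    """Return the list matched by `pattern`, or [] if nothing matches or parses."""
    match = re.search(pattern, response)
    if not match:
        return []
    try:
        return json.loads(match.group())
    except json.JSONDecodeError:
        return []


print(extract_terms(OLD_PATTERN, SAMPLE))  # old pattern needs '["' with no gap
print(extract_terms(NEW_PATTERN, SAMPLE))
```

The whitespace kept inside the match should be harmless for this step, since json.loads ignores whitespace between JSON tokens.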
Couldn't one just hardcode a .replace("```","")? Or would that also cause more problems?
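For what it's worth, replacing only the triple backticks would leave the json language tag in front of the list, so json.loads would still fail on the sample response above. Below is a minimal sketch of stripping the whole fence instead, assuming the reply is wrapped in a single Markdown code fence. strip_code_fence is a hypothetical helper and not something that exists in gpt.py.

```python
import json
import re

FENCE = "`" * 3  # three backticks


def strip_code_fence(response: str) -> str:
    """Remove one leading fence (with optional language tag) and one trailing fence."""
    text = response.strip()
    # Leading fence, optional language tag such as "json", then any whitespace.
    text = re.sub(r"^`{3}[A-Za-z0-9_-]*\s*", "", text)
    # Trailing fence, including any whitespace just before it.
    text = re.sub(r"\s*`{3}$", "", text)
    return text


# Shortened, hypothetical stand-in for the fenced reply from this thread.
sample = FENCE + 'json\n[\n"Supreme Court ruling",\n"Oval Office"\n]' + FENCE

# A plain replace leaves the "json" tag behind, so parsing still fails.
try:
    json.loads(sample.replace(FENCE, ""))
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

print(naive_ok)
print(json.loads(strip_code_fence(sample)))
```

A reply without any fence passes through unchanged, so this could run before the existing json.loads attempt. Whether it covers the other malformed replies GPT produces is untested.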
The problem is the additional whitespace. The regex I posted takes care of that. Please test it.
Dude what is with closing stuff and not commenting on it
Most changes do not make sense, and here, I thought the issue was resolved, as radry mentioned his Regex works.
The issue is only resolved when it was tested and added to the code. I didn't open a pull request and you didn't add it either. I only tested the regex with an online regex tool, not the code itself.
Fair enough, I forgot to add to the README that I will not be responding to any issues anymore (already added regarding PRs).
So this project is abandoned?
probably. i already have a fork that's ready ;)
Yes, at least for now.
Describe the bug
Sometimes the get_search_terms function is returning something like this:
To Reproduce
Just try using more complex subjects.
Expected behavior
Should return only in this format: ["search term 1", "search term 2", "search term 3"]
Additional context
Changing the prompt is enough to fix it.