Closed tarzanboy76 closed 4 years ago
The current behaviour is to display the current query (as passed to TPDB) alongside the site, date, and titles returned from TPDB. We can't always add site/date/title to the query, because we can't assume those are set within Stash. If those fields are set, I recommend you disable the "scrape_with_filenames" flag, which causes the query to include the date/site/title. Alternatively, if you have scenes sorted into folders, try the "test" branch build, which supports the "dirs_in_query" flag to add parent folder names to the query.
Including performers in the query (for non-filename scraping) and in the ambiguous results list is possible, but a bit more tricky. I haven't seen a use case that needs it yet, but let me know if the above doesn't resolve your issue.
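To illustrate the folder-name idea mentioned above, here is a rough sketch of how parent folder names could be prepended to a filename-based query. It is not the "test" branch's actual code: the function name `query_with_dirs` and the `depth` parameter are made up for the example, and it assumes a path starting with a drive letter is a Windows path.

```python
import re
from pathlib import PurePosixPath, PureWindowsPath


def query_with_dirs(path, depth=1):
    """Return '<parent folders> <file stem>' for a scene path.

    Hypothetical helper: 'depth' is how many parent folders to include.
    """
    # Assumption: a leading drive letter means a Windows-style path.
    is_windows = re.match(r"^[A-Za-z]:\\", path) is not None
    p = PureWindowsPath(path) if is_windows else PurePosixPath(path)
    # Drop the root ('/' or 'C:\\') so only real folder names remain.
    dirs = [d for d in p.parent.parts if d != p.anchor]
    chosen = dirs[-depth:] if depth > 0 else []
    return " ".join(chosen + [p.stem])


print(query_with_dirs("/media/SiteX/Hardcore.mp4"))
```

With scenes sorted into one folder per site, a generic file name like "Hardcore" would then be queried together with its site folder.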
Most of my files have pretty good quality titles / studios / dates... I'm mostly scraping to confirm things, plus add descriptions and tags. So I did disable scrape_with_filenames. The problem is that searching by title when it is something generic like 'Hardcore' makes it difficult to know what entry was actually being searched... so when you get a response back from TPDB, I can't confirm which is the correct match.
I'm not a Python coder, but I made the following tweak on a local copy to help me:
if parse_with_filename:
    try:
        if re.search(r'^[A-Z]:\\', scene['path']):  # If we have Windows-like paths
            file_name = re.search(r'^[A-Z]:\\(.+\\)*(.+)\.(.+)$', scene['path']).group(2)
        else:  # Else assume Unix-like paths
            file_name = re.search(r'^\/(.+\/)*(.+)\.(.+)$', scene['path']).group(2)
    except Exception:
        print("Error when parsing filename: "+scene['path'])
        return
    if clean_filename:
        file_name = scrubFileName(file_name)
    scrape_query = file_name
    print("Grabbing Data For: "+scrape_query)
else:
    scrape_query = scene_data['title']
    # Show the studio and date next to the title so generic titles can be told apart
    print("Grabbing Data For: " + scene['title'] +' ['+scene['studio']['name']+'] ('+scene['date']+')')
scraped_data = scrapeMetadataAPI(scrape_query)
This definitely helped.
Ah! I see the issue now. The "Grabbing Data For" line is printed before the built-in disambiguation for titles (where the script adds the studio and date to the query). It should be printed after, so that it reflects the updated scrape_query. The end result should be the same as your edit, just a bit cleaner code-wise.
I will edit to reflect this. Thanks for the feedback!
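Roughly, the reordering could look like the sketch below. It assumes the disambiguation simply appends the studio name and date to the title when they are set; the helper names `build_scrape_query` and `announce_query` and the exact query format are hypothetical, not the script's actual code.

```python
def build_scrape_query(scene):
    """Build the query from the title, appending studio and date when set.

    Hypothetical helper: the scene fields mirror the snippet above,
    but the real script's disambiguation may differ.
    """
    parts = [scene.get("title") or ""]
    studio = scene.get("studio") or {}  # studio may be missing or None
    if studio.get("name"):
        parts.append(studio["name"])
    if scene.get("date"):
        parts.append(scene["date"])
    return " ".join(p for p in parts if p)


def announce_query(scene):
    # Build the final query first, then log it, so the log matches what is sent.
    scrape_query = build_scrape_query(scene)
    print("Grabbing Data For: " + scrape_query)
    return scrape_query


announce_query({"title": "Hardcore", "studio": {"name": "SiteX"}, "date": "2020-01-01"})
```

Because the message is printed from the finished query, a scene with no studio or date still logs cleanly instead of failing on a missing field.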
At the moment, the interface displays the title of the current scene being searched. If it has a fairly generic title (e.g., "Hardcore"), it can be difficult to match it against the options returned from TPDB. It would help to also display the site/studio name, the date, and ideally the performers.
Similarly, the list returned from TPDB doesn't include the performers. Including them would make matching easier.
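As an illustration of the kind of output this would give, here is a small sketch that renders one result line with the site, date, and performers. The function name `format_result` and the keys 'title', 'site', 'date', and 'performers' are assumptions made for the example; the real TPDB response may be shaped differently.

```python
def format_result(result):
    """Render one search result as 'title [site] (date) - performers'.

    Hypothetical helper: the dictionary keys are assumed for illustration.
    """
    line = result.get("title") or "Unknown title"
    if result.get("site"):
        line += " [" + result["site"] + "]"
    if result.get("date"):
        line += " (" + result["date"] + ")"
    performers = result.get("performers") or []
    if performers:
        line += " - " + ", ".join(performers)
    return line


print(format_result({
    "title": "Hardcore",
    "site": "SiteX",
    "date": "2020-01-01",
    "performers": ["Performer A", "Performer B"],
}))
```

Fields that are missing are simply left out, so a result with only a title still prints as a single readable line.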