JohnSmithDev / ISFDB-Tools

Tools to query a local copy of the ISFDB database

Many titles are not picked up by #4

Open JohnSmithDev opened 5 years ago

JohnSmithDev commented 5 years ago

get_title_details() - and its close relation get_title_id(), which should ultimately be refactored to use the same code - fails to find certain books in the titles table. I'm pretty sure there are multiple issues for this:

Here are some examples (including invocations that pick them up) for reference:

isfdb_tools $ ./award_series.py -w clarke -y 1950-2019 | grep Failed
WARNING:root:Failed to get details for George Turner/The Sea and the Summer
WARNING:root:Failed to get details for Stanislaw Lem/Fiasco
WARNING:root:Failed to get details for Richard Grant/Rumours of Spring
WARNING:root:Failed to get details for Michael Bishop/The Secret Ascension^Philip K. Dick Is Dead, Alas
WARNING:root:Failed to get details for Ian Watson/The Whores of Babylon
WARNING:root:Failed to get details for Mike Resnick/Ivory: A Legend of Past and Future
WARNING:root:Failed to get details for Mischa/Red Spider, White Web
WARNING:root:Failed to get details for Marge Piercy/Body of Glass
WARNING:root:Failed to get details for Stephen Baxter/Time: Manifold 1
WARNING:root:Failed to get details for Mary Gentle/Ash: A Secret History
WARNING:root:Failed to get details for David Brin/Kil'n People
WARNING:root:Failed to get details for Greg Bear/Darwin’s Children
WARNING:root:Failed to get details for Paul Kincaid/untitled
WARNING:root:Failed to get details for Marcel Theroux/Far North: A Novel
WARNING:root:Failed to get details for Gwyneth Jones/Spirit
WARNING:root:Failed to get details for M. R. Carey/The Girl with All the Gifts
WARNING:root:Failed to get details for Emmi Itäranta/Memory of Water
WARNING:root:Failed to get details for Claire North/The First Fifteen Lives of Harry August

isfdb_tools $ ./award_series.py -W "Nebula Award" -C "Novel" -y 1950-2019 | grep Fail
WARNING:root:Failed to get details for Philip K. Dick/Dr. Bloodmoney^Dr. Bloodmoney, or How We Got Along After the Bomb
WARNING:root:Failed to get details for Theodore L. Thomas+Kate Wilhelm/The Clone
WARNING:root:Failed to get details for James White/Open Prison^The Escape Orbit
WARNING:root:Failed to get details for Philip K. Dick/Do Androids Dream of Electric Sheep?^Blade Runner
WARNING:root:Failed to get details for Kurt Vonnegut, Jr./Slaughterhouse-Five or, The Children's Crusade
WARNING:root:Failed to get details for D. G. Compton/The Steel Crocodile^The Electric Crocodile (UK)
WARNING:root:Failed to get details for Ursula K. Le Guin/The Dispossessed
WARNING:root:Failed to get details for Larry Niven+Jerry Pournelle/The Mote in God's Eye
WARNING:root:Failed to get details for Alfred Bester/The Computer Connection^The Indian Giver
WARNING:root:Failed to get details for Italo Calvino/Invisible Cities
WARNING:root:Failed to get details for Larry Niven+Jerry Pournelle/Inferno
WARNING:root:Failed to get details for Jack Vance/Lyonesse: Suldrun's Garden^Lyonesse (1984 UK)
WARNING:root:Failed to get details for George Turner/Drowning Towers^The Sea and the Summer
WARNING:root:Failed to get details for Mike Resnick/Ivory: A Legend of Past and Future
WARNING:root:Failed to get details for William Gibson+Bruce Sterling/The Difference Engine
WARNING:root:Failed to get details for Kevin J. Anderson+Doug Beason/Assemblers of Infinity
WARNING:root:Failed to get details for Robert J. Sawyer/The Terminal Experiment^Hobson's Choice
WARNING:root:Failed to get details for Nancy Kress/Beggars and Choosers
WARNING:root:Failed to get details for Paul Park/Celestis^Coelestis (UK 1993)
WARNING:root:Failed to get details for Connie Willis/To Say Nothing of the Dog
WARNING:root:Failed to get details for Susanna Clarke/Jonathan Strange & Mr. Norrell
WARNING:root:Failed to get details for Katherine Addison/The Goblin Emperor
WARNING:root:Failed to get details for Cixin Liu/The Three-Body Problem

isfdb_tools $ ./award_series.py -w "british sci" -C "Best Novel" -y 1950-2019 | grep Fail
WARNING:root:Failed to get details for Christopher Priest/Inverted World^The Inverted World
WARNING:root:Failed to get details for William Gibson+Bruce Sterling/The Difference Engine
WARNING:root:Failed to get details for Ian McDonald/Hearts, Hands and Voices^The Broken Land
WARNING:root:Failed to get details for Ian McDonald/Necroville^Terminal Café
WARNING:root:Failed to get details for Ian McDonald/Chaga^Evolution's Shore (US 1995)
WARNING:root:Failed to get details for Neal Stephenson+J. Frederick George/Interface
WARNING:root:Failed to get details for Mary Gentle/Ash: A Secret History
WARNING:root:Failed to get details for Jon Courtenay Grimwood/Effendi: The Second Arabesk
WARNING:root:Failed to get details for Susanna Clarke/Jonathan Strange & Mr. Norrell
WARNING:root:Failed to get details for Claire North/The First Fifteen Lives of Harry August

isfdb_tools $ ./award_series.py -W "Hugo Award" -C "Best Novel" -y 1950-2019 | grep Fail
WARNING:root:Failed to get details for Mark Clifton+Frank Riley/They'd Rather Be Right
WARNING:root:Failed to get details for Poul Anderson/We Have Fed Our Seas^The Enemy Stars
WARNING:root:Failed to get details for Robert A. Heinlein/Have Spacesuit -- Will Travel
WARNING:root:Failed to get details for Robert Sheckley/Time Killer^Immortality, Inc. (Delivered)
WARNING:root:Failed to get details for Gordon R. Dickson/Dorsai!^The Genetic General
WARNING:root:Failed to get details for Murray Leinster/The Pirates of Ersatz^The Pirates of Zan
WARNING:root:Failed to get details for (Randall Garrett+Laurence M. Janifer)^Mark Phillips/That Sweet Little Old Lady^Brain Twister
WARNING:root:Failed to get details for Harry Harrison/Sense of Obligation^Planet of the Damned
WARNING:root:Failed to get details for Marion Zimmer Bradley/Sword of Aldones
WARNING:root:Failed to get details for Clifford D. Simak/Way Station^Here Gather the Stars
WARNING:root:Failed to get details for Cordwainer Smith/The Planet Buyer^The Boy Who Bought Old Earth
WARNING:root:Failed to get details for John Brunner/The Whole Man^The Telepathist
WARNING:root:Failed to get details for Roger Zelazny/...And Call Me Conrad^This Immortal
WARNING:root:Failed to get details for Kurt Vonnegut, Jr./Slaughterhouse-Five or, The Children's Crusade
WARNING:root:Failed to get details for Ursula K. Le Guin/The Dispossessed
WARNING:root:Failed to get details for Larry Niven+Jerry Pournelle/The Mote in God's Eye
WARNING:root:Failed to get details for Christopher Priest/Inverted World^The Inverted World
WARNING:root:Failed to get details for Larry Niven+Jerry Pournelle/Inferno
WARNING:root:Failed to get details for Alfred Bester/The Computer Connection^The Indian Giver
WARNING:root:Failed to get details for Larry Niven+Jerry Pournelle/Lucifer's Hammer
WARNING:root:Failed to get details for Larry Niven+Jerry Pournelle/Footfall
WARNING:root:Failed to get details for (****)/No Award
WARNING:root:Failed to get details for Nancy Kress/Beggars and Choosers
WARNING:root:Failed to get details for Robert J. Sawyer/The Terminal Experiment^Hobson's Choice
WARNING:root:Failed to get details for Connie Willis/To Say Nothing of the Dog
WARNING:root:Failed to get details for David Brin/Kiln People^Kil'n People
WARNING:root:Failed to get details for Susanna Clarke/Jonathan Strange & Mr. Norrell
WARNING:root:Failed to get details for John Scalzi/Old Man’s War
WARNING:root:Failed to get details for Michael Flynn/Eifelheim
WARNING:root:Failed to get details for Mira Grant/Feed
WARNING:root:Failed to get details for James S. A. Corey/Leviathan Wakes
WARNING:root:Failed to get details for Mira Grant/Deadline
WARNING:root:Failed to get details for Mira Grant/Blackout
WARNING:root:Failed to get details for Mira Grant/Parasite
WARNING:root:Failed to get details for Robert Jordan+Brandon Sanderson/The Wheel of Time (series)
WARNING:root:Failed to get details for Cixin Liu/The Three-Body Problem
WARNING:root:Failed to get details for Katherine Addison/The Goblin Emperor
WARNING:root:Failed to get details for Cixin Liu/Death's End

May be easier to split this into multiple issues...
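
A few patterns recur in the failures above: "^" between alternative titles or author names, "+" between co-authors, and typographic apostrophes (e.g. "Darwin’s Children"). As a rough sketch only, and assuming those separators mean what they appear to mean, each warning entry could be expanded into simpler author/title pairs to try one at a time. candidate_lookups() is a hypothetical helper for illustration, not code from this repo:

```python
import itertools

# Assumption: curly apostrophes should be matched as plain ASCII ones.
APOSTROPHES = str.maketrans({"\u2019": "'", "\u2018": "'"})


def candidate_lookups(entry):
    """Expand an 'Authors/Titles' string into (author, title) pairs.

    Assumes '^' separates variants and '+' separates co-authors,
    as the warning output seems to suggest.
    """
    authors_part, _, titles_part = entry.translate(APOSTROPHES).partition("/")
    authors = []
    for variant in authors_part.split("^"):
        # Drop the grouping parentheses seen around some co-author sets.
        for name in variant.strip("()").split("+"):
            authors.append(name.strip())
    titles = [title.strip() for title in titles_part.split("^")]
    return list(itertools.product(authors, titles))


if __name__ == "__main__":
    for pair in candidate_lookups("James White/Open Prison^The Escape Orbit"):
        print(pair)
```

Each resulting pair could then be passed to the existing lookup until one succeeds; whether a single-author search is enough to find a co-authored title is not something this sketch settles.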

NB: Relevant code hasn't yet been pushed to GitHub in case anyone else is looking at this and wondering what I'm talking about...

JohnSmithDev commented 5 years ago

Some of these are now fixed, but I just encountered a couple of examples of what I think might be a new problem:

publication_history.AmbiguousResultsError: Search for Stephen Baxter/Raft had 2 matches
publication_history.AmbiguousResultsError: Search for Neal Stephenson/Quicksilver had 2 matches

This is after adding in code that should filter for NOVEL (or other relevant type). Possibly it's just reporting a similar known issue in a different way, will have to look into it.
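
For anyone following along, the idea of the type filter is roughly the following. This is a minimal sketch, not the actual unpushed code: the exception class is a local stand-in for publication_history.AmbiguousResultsError, pick_single_title() is a hypothetical name, and it assumes each candidate row is a dict with a title_ttype key:

```python
class AmbiguousResultsError(Exception):
    """Local stand-in for publication_history.AmbiguousResultsError."""


def pick_single_title(author, title, rows, wanted_types=("NOVEL",)):
    """Return the one row of a wanted type, or None if there is none.

    rows is assumed to be a list of dicts with a 'title_ttype' key.
    Raises AmbiguousResultsError if several rows survive the filter.
    """
    matches = [row for row in rows if row["title_ttype"] in wanted_types]
    if len(matches) > 1:
        raise AmbiguousResultsError(
            "Search for %s/%s had %d matches" % (author, title, len(matches))
        )
    return matches[0] if matches else None


if __name__ == "__main__":
    rows = [
        {"title_id": 1, "title_ttype": "NOVEL"},
        {"title_id": 2, "title_ttype": "SHORTFICTION"},
    ]
    print(pick_single_title("Some Author", "Some Title", rows))
```

If two rows of the wanted type remain after filtering, the error is still raised, which matches the symptom reported above.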

JohnSmithDev commented 5 years ago

Aargh - turns out award_titles_report maps between award_titles and titles.

Much of the work I've done has thus been a waste of time for my immediate requirements, but it should still be of use in other contexts e.g. when I have to turn external references to authors+books into database refs.

JohnSmithDev commented 5 years ago

BTW, Baxter/Raft is an issue with Clarke Award weirdness causing the NOVEL filter not to be applied - now fixed via config.

Stephenson/Quicksilver is a weird one, with 2 versions of the novel: "Volume Presentations" vs "Book Presentations" http://www.isfdb.org/cgi-bin/ea.cgi?429 - I don't see how that could be fixed as things stand, but obv. the table mentioned in the previous comment means we should be able to circumvent that issue for now.

JohnSmithDev commented 5 years ago

I think all issues are resolved now except for "The Wheel of Time (series)/Robert Jordan+Brandon Sanderson" - Hugos 2015ish. I think that may be best solved by stripping off "(series)" and any other known parenthesized crap.
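
A minimal sketch of that stripping, assuming it is acceptable to drop any single trailing parenthesized group rather than only a list of known ones (strip_parenthesized_suffix() is a hypothetical name, not an existing function in this repo):

```python
import re

# Assumption: any one trailing "(...)" group is noise, e.g. "(series)".
TRAILING_PARENS = re.compile(r"\s*\([^()]*\)\s*$")


def strip_parenthesized_suffix(title):
    """Remove one trailing parenthesized group from a title, if present."""
    stripped = TRAILING_PARENS.sub("", title)
    # Keep the original if stripping would leave nothing to search for.
    return stripped or title


if __name__ == "__main__":
    print(strip_parenthesized_suffix("The Wheel of Time (series)"))
```

Restricting the pattern to an explicit list of suffixes would be safer if any real titles end in parentheses.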