JohnSmithDev / ISFDB-Tools

Tools to query a local copy of the ISFDB database

bulk_author_gender.py blows up on Gordon Dickson's Spacepaw #17

Closed JohnSmithDev closed 5 years ago

JohnSmithDev commented 5 years ago

pdb3 ./bulk_author_gender.py -y 1969 ...

      1. M : Spacepaw (title_id=431031) written by author #1 AuthorIdAndName(id=24, name='Gordon R. Dickson') [1969-02-01] (source:wikipedia:american male short story writers)
    Traceback (most recent call last):
      File "./author_gender.py", line 153, in get_author_gender_from_ids_and_then_name_cached
        return gagfiatn_cache[cache_key]
    KeyError: (None, None)
    ...
    -> (type(author_id)))
    (Pdb) author_id
    (Pdb) name
    (Pdb) print(name)
    None
    (Pdb) u
    /proj/isfdb_tools/author_gender.py(155)get_author_gender_from_ids_and_then_name_cached()
    -> x = get_author_gender_from_id_and_then_name(conn, author_ids, name)
    (Pdb) u
    /proj/isfdb_tools/bulk_author_gender.py(3)<module>()
    -> from collections import Counter
    (Pdb) row
    (431031, 'Spacepaw', '1969-02-00')

I don't see anything unusual about the book/author in question: http://www.isfdb.org/cgi-bin/title.cgi?431031

What does confuse me is that the last line before the exception implies this book was processed correctly. Is there maybe a second, bogus author record?

JohnSmithDev commented 5 years ago

Aha, my suspicion was correct:

    (Pdb) authors
    [AuthorIdAndName(id=24, name='Gordon R. Dickson'), AuthorIdAndName(id=None, name=None)]

Still dunno where that second one is coming from. (This occurred after processing every novel up to 1969, and probably most of those after that.)

Aha #2

    MariaDB [isfdb]> select * from canonical_author where title_id = 431031;
    +--------+----------+-----------+-----------+
    | ca_id  | title_id | author_id | ca_status |
    +--------+----------+-----------+-----------+
    | 505761 |   431031 |     68421 |         1 |
    | 502481 |   431031 |     19035 |         1 |
    +--------+----------+-----------+-----------+
    2 rows in set (0.00 sec)

    MariaDB [isfdb]> select * from authors where author_id in (68421, 19035);
    +-----------+------------------+------------------+-------------------+------------------+------------------+---------+------------------+--------------+-------------+---------------+--------------+--------------------+-----------------+-----------------+-------------+
    | author_id | author_canonical | author_legalname | author_birthplace | author_birthdate | author_deathdate | note_id | author_wikipedia | author_views | author_imdb | author_marque | author_image | author_annualviews | author_lastname | author_language | author_note |
    +-----------+------------------+------------------+-------------------+------------------+------------------+---------+------------------+--------------+-------------+---------------+--------------+--------------------+-----------------+-----------------+-------------+
    |     68421 | Gordon Dickson   | NULL             | NULL              | NULL             | NULL             |    NULL | NULL             |         5527 | NULL        |             1 | NULL         |               4914 | Dickson         |              17 | NULL        |
    +-----------+------------------+------------------+-------------------+------------------+------------------+---------+------------------+--------------+-------------+---------------+--------------+--------------------+-----------------+-----------------+-------------+
    1 row in set (0.00 sec)
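A LEFT JOIN from canonical_author to authors would produce exactly this kind of all-None entry when an author_id has no matching row. Here is a minimal sketch using an in-memory SQLite stand-in for the two tables. The schema is cut down to the columns visible in the query output, the function names are made up for illustration, and whether the tool's real query uses a LEFT JOIN is an assumption.

```python
import sqlite3

# In-memory stand-in for the two ISFDB tables, reduced to the columns
# visible in the query output. This is NOT the real ISFDB schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, author_canonical TEXT);
    CREATE TABLE canonical_author (
        ca_id INTEGER PRIMARY KEY, title_id INTEGER,
        author_id INTEGER, ca_status INTEGER);
    INSERT INTO authors VALUES (68421, 'Gordon Dickson');
    INSERT INTO canonical_author VALUES (505761, 431031, 68421, 1);
    INSERT INTO canonical_author VALUES (502481, 431031, 19035, 1);
""")


def find_dangling_author_refs(conn):
    """Return canonical_author rows whose author_id has no authors row."""
    sql = """
        SELECT ca.ca_id, ca.title_id, ca.author_id
        FROM canonical_author ca
        LEFT JOIN authors a ON a.author_id = ca.author_id
        WHERE a.author_id IS NULL
        ORDER BY ca.ca_id
    """
    return conn.execute(sql).fetchall()


def authors_for_title(conn, title_id):
    """Hypothetical lookup: unmatched authors come back as (None, None)."""
    sql = """
        SELECT a.author_id, a.author_canonical
        FROM canonical_author ca
        LEFT JOIN authors a ON a.author_id = ca.author_id
        WHERE ca.title_id = ?
        ORDER BY ca.ca_id
    """
    return conn.execute(sql, (title_id,)).fetchall()


dangling = find_dangling_author_refs(conn)
looked_up = authors_for_title(conn, 431031)
print(dangling)
print(looked_up)
```

Running the first query against the real database (minus the SQLite setup) should list any other titles with the same problem, which might be useful detail for the ISFDB bug report.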

So canonical_author points at author_id 19035, which has no row in authors. I'll put a "sticking plaster" bodge on this in my code, and log a low-level bug in ISFDB.
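The sticking plaster could be as small as filtering out the all-None entries before the gender lookup. This is a rough sketch rather than the actual fix: `drop_bogus_authors` is a made-up name, and the `AuthorIdAndName` definition is guessed from the pdb output earlier in this thread.

```python
from collections import namedtuple

# Guessed from the pdb output; the tool's real definition may differ.
AuthorIdAndName = namedtuple("AuthorIdAndName", "id, name")


def drop_bogus_authors(authors):
    """Drop entries where both id and name are None (dangling references)."""
    return [a for a in authors if not (a.id is None and a.name is None)]


authors = [AuthorIdAndName(id=24, name='Gordon R. Dickson'),
           AuthorIdAndName(id=None, name=None)]
cleaned = drop_bogus_authors(authors)
print(cleaned)
```

Entries with only one of the two fields missing are deliberately kept, since a name-only lookup may still succeed.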