PennyDreadfulMTG / Penny-Dreadful-Tools

A suite of tools for the Penny Dreadful MTGO community
https://pennydreadfulmagic.com
MIT License

Another Gatherling deck has changed ID since initial scrape. #7181

Closed bakert closed 4 years ago

bakert commented 4 years ago
Error running task ['run.py', 'scraper', 'all']

Reported on CLI by CLI

InvalidDataException Unable to find deck with gatherling id '86198'
Stack Trace:

Python traceback

  File "run.py", line 129, in <module>
    run()
  File "run.py", line 71, in task
    run_all_tasks(module)
  File "run.py", line 116, in run_all_tasks
    s.scrape() # type: ignore
  File "/home/discord/decksite/decksite/scrapers/gatherling.py", line 28, in scrape
    i = tournament(url, name)
  File "/home/discord/decksite/decksite/scrapers/gatherling.py", line 56, in tournament
    n = add_decks(dt, competition_id, final, s)
  File "/home/discord/decksite/decksite/scrapers/gatherling.py", line 103, in add_decks
    add_ids(matches, ds)
  File "/home/discord/decksite/decksite/scrapers/gatherling.py", line 263, in add_ids
    m['right_id'] = lookup(m['right_identifier']).id if m['right_identifier'] else None
  File "/home/discord/decksite/decksite/scrapers/gatherling.py", line 260, in lookup
    raise InvalidDataException("Unable to find deck with gatherling id '{0}'".format(gatherling_id))

Exception_hash: a265ba757d625959b159676d3e157a6519303317

Labels: CLI; InvalidDataException
bakert commented 4 years ago

This happens enough that I'm looking to put a permanent fix in place rather than just patch up.

bakert commented 4 years ago

ok this might be something different.

MariaDB [decksite]> SELECT * FROM deck WHERE identifier = '86198';
+-------+-----------+-----------+------------+-----------------------------+--------------+--------------+----------------+----------------------------------------------------+--------------+--------------+---------------+-------+---------------+---------------------+--------+------------------------------------------+---------+----------+
| id    | person_id | source_id | identifier | name                        | created_date | updated_date | competition_id | url                                                | archetype_id | resource_uri | featured_card | score | thumbnail_url | small_thumbnail_url | finish | decklist_hash                            | retired | reviewed |
+-------+-----------+-----------+------------+-----------------------------+--------------+--------------+----------------+----------------------------------------------------+--------------+--------------+---------------+-------+---------------+---------------------+--------+------------------------------------------+---------+----------+
| 70841 |      3976 |         2 | 86198      | Deck - Mono Blue Flash Back |   1586070000 |   1586086104 |           3175 | https://gatherling.com/deck.php?mode=view&id=86198 |          317 | NULL         | NULL          |  NULL | NULL          | NULL                |      9 | a608b560c572cb1407ce4a02f6fb71e4f6f06179 |       0 |        1 |
+-------+-----------+-----------+------------+-----------------------------+--------------+--------------+----------------+----------------------------------------------------+--------------+--------------+---------------+-------+---------------+---------------------+--------+------------------------------------------+---------+----------+
1 row in set (0.05 sec)
MariaDB [gatherli_gatherling]> SELECT * FROM decks WHERE id = 86198;
+--------------+-------+-----------------------------+------------+-------------+----------------+-------+-------+------------------------------------------+------------------------------------------+------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+
| archetype    | id    | name                        | playername | deck_colors | format         | tribe | notes | deck_hash                                | sideboard_hash                           | whole_hash                               | deck_contents_cache                                                                                                                                                                                                                                                               | created_date        |
+--------------+-------+-----------------------------+------------+-------------+----------------+-------+-------+------------------------------------------+------------------------------------------+------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+
| Unclassified | 86198 | Deck - Mono Blue Flash Back | Catacuio   | u           | Penny Dreadful | NULL  |       | 2323ff3ede6046ffc1e655d503d03340d774643c | 36fadf61ed3592d078ee13770c7b15de89999f03 | 8ba6b45746c3adb341c05c804ab641cea8a6a82a | Augur of Bolas|Brineborn Cutthroat|Disperse|Dissipate|Essence Capture|Hypnotic Sprite|Island|Mana Leak|Mystic Sanctuary|Negate|Omen of the Sea|Opt|Pteramander|Syncopate|Word of Undoing|Cerulean Drake|Flashfreeze|Gainsay|Plumeveil|Saheeli, Sublime Artificer|Scrabbling Claws | 2020-04-05 02:54:21 |
+--------------+-------+-----------------------------+------------+-------------+----------------+-------+-------+------------------------------------------+------------------------------------------+------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+
1 row in set (0.00 sec)
bakert commented 4 years ago

I can't repro this and it isn't happening any more. It's confusing.

Possibly temporary network conditions.

bakert commented 4 years ago

Well whatever is happening here is actually still happening every hour.

bakert commented 4 years ago

It turns out Gatherling changed the id of the deck after our scrape, and this time it even re-used the old ID for a different deck. Yikes. I fixed up the PD database, but we should really fix this in Gatherling.
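
One way a scraper could guard against an id that has moved or been re-used is to trust an id match only when other fields agree, and otherwise fall back to matching on those fields. The sketch below is illustrative only: `StoredDeck`, `find_deck`, the ids, and the player names are all hypothetical, and matching on player plus deck name is an assumption that could be ambiguous in practice. It is not the code in `gatherling.py`.

```python
from typing import Dict, NamedTuple, Optional


class StoredDeck(NamedTuple):
    """Hypothetical record of a deck saved at initial scrape time."""
    identifier: str  # Gatherling id as seen when first scraped
    player: str
    name: str


def find_deck(stored: Dict[str, StoredDeck], gatherling_id: str,
              player: str, name: str) -> Optional[StoredDeck]:
    """Trust the id only if player and name also match; otherwise fall back."""
    by_id = stored.get(gatherling_id)
    if by_id is not None and (by_id.player, by_id.name) == (player, name):
        return by_id
    # The id is unknown or now points at a different deck: search by content.
    for deck in stored.values():
        if (deck.player, deck.name) == (player, name):
            return deck
    return None


# Made-up data: the deck was scraped as id "100", then Gatherling moved it to "107".
stored = {
    "100": StoredDeck("100", "player_a", "Mono Blue"),
    "101": StoredDeck("101", "player_b", "Red Aggro"),
}
moved = find_deck(stored, "107", "player_a", "Mono Blue")
# Id "101" re-used for a deck that was never scraped: no match is returned.
reused = find_deck(stored, "101", "player_c", "Green Ramp")
```

A fallback like this only hides the symptom on the scraper side; the id changing in Gatherling is still the underlying problem.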

stash86 commented 4 years ago

86195 created on 2020-04-05 02:47:07 86206 created on 2020-04-05 07:34:24

86205 created on 2020-04-05 06:29:38 86207 created on 2020-04-05 08:36:01

86205 and 86207 are PD Sunday decks. Just checked with spe3. They didn't register for PD Sunday.

So the only way this can happen is if the system creates a new instance of 86195, then somehow sets its id to 0, then calls the save function.
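
The hypothesis above can be sketched in a few lines. This is not Gatherling's code: the `save` function, the table layout, and the "id 0 means insert" convention are assumptions made only to show how re-saving a loaded deck with its id reset would hand it a fresh auto-increment id.

```python
import sqlite3


def save(conn: sqlite3.Connection, deck: dict) -> int:
    """Hypothetical save(): insert when id is 0, otherwise update in place."""
    if deck["id"] == 0:
        cur = conn.execute("INSERT INTO decks (name) VALUES (?)", (deck["name"],))
        deck["id"] = cur.lastrowid  # database assigns the next free id
    else:
        conn.execute("UPDATE decks SET name = ? WHERE id = ?",
                     (deck["name"], deck["id"]))
    return deck["id"]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE decks (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")

# First save: the deck is new, so it gets an id from the database.
original = {"id": 0, "name": "Some deck"}
first_id = save(conn, original)

# Later, a copy of the same deck has its id reset to 0 before save() is called.
reloaded = dict(original)
reloaded["id"] = 0
second_id = save(conn, reloaded)

row_count = conn.execute("SELECT COUNT(*) FROM decks").fetchone()[0]
```

Here `second_id` differs from `first_id` and the table holds two rows for what was meant to be one deck, so anything that stored the first id externally would no longer point at the current copy.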

jgabrielygalan commented 4 years ago

2:47 is 13 minutes before APAC 15.11 started. 7:34 is around 10 minutes after it ended. Is there any processing of the decks after or around the end of the tournament?