PennyDreadfulMTG / Penny-Dreadful-Tools

A suite of tools for the Penny Dreadful MTGO community
https://pennydreadfulmagic.com
MIT License

tappedout scraper dies with an error about something not being iterable #5665

Closed: vorpal-buildbot closed this issue 5 years ago

vorpal-buildbot commented 6 years ago

Reported on Discord by bakert#2193

bakert commented 6 years ago
Fetching https://tappedout.net/api/collection/collection:deck/pd-goblin-tribal/ (cache ok)
Traceback (most recent call last):
  File "run.py", line 120, in <module>
    run()
  File "run.py", line 34, in run
    task(sys.argv)
  File "run.py", line 80, in task
    s.scrape() # type: ignore
  File "/home/discord/decksite/decksite/scrapers/tappedout.py", line 24, in scrape
    raw_deck.update(fetch_deck_details(raw_deck)) # type: ignore
TypeError: 'NoneType' object is not iterable
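
For context, dict.update accepts a mapping or an iterable of key/value pairs, so the traceback above is exactly what raw_deck.update(None) produces, i.e. what happens if fetch_deck_details returns None. A minimal repro (the dict contents here are made up):

raw_deck = {'slug': 'pd-goblin-tribal'}
raw_deck.update(None)  # TypeError: 'NoneType' object is not iterable
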
bakert commented 6 years ago

taylors was trying to link his tappedout account, so it's worth giving them a heads-up when this is fixed.

bakert commented 6 years ago

On local I get a different issue. We never commit our transaction because failures in add_cards leave us sinking ever deeper into BEGINs without COMMITs or ROLLBACKs.
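
A toy simulation of that failure mode, assuming a db layer that tracks open BEGINs by label (begin, commit, and the label stack here are stand-ins suggested by the log output below, not the project's real db module): each deck opens a transaction in add_cards, the bad-data error is caught and logged, but nothing ever closes the BEGIN, so the next deck starts one level deeper.

transaction_stack = []  # labels of BEGINs that were never committed or rolled back

def begin(label):
    print('Before BEGIN ({})'.format(transaction_stack))
    transaction_stack.append(label)
    print('After BEGIN ({})'.format(transaction_stack))

def commit():
    transaction_stack.pop()

def add_cards(deck):
    begin('add_cards')
    if not deck['cards']:
        raise ValueError('Did not find any cards looking for `Wax`')
    commit()

for deck in [{'cards': []}, {'cards': ['Forest']}]:
    try:
        add_cards(deck)
    except ValueError as error:
        # The error is caught and logged, but the open BEGIN is never closed,
        # so the stack keeps growing: the same pattern as the log below.
        print('Skipping because of', error)

Run as-is, this prints the same Before/After BEGIN progression shown in the next comment.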

bakert commented 6 years ago
Fetching http://tappedout.net/mtg-decks/05-06-18-pd-infect/?fmt=txt (cache ok)
Before BEGIN ([])
After BEGIN (['add_cards'])
[2018-10-31 06:56:53,066] WARNING in logger: Skipping 05-06-18-pd-infect because of Did not find any cards looking for `Wax`
Fetching http://tappedout.net/mtg-decks/gw-populate-aggro/?fmt=txt (cache ok)
Before BEGIN (['add_cards'])
After BEGIN (['add_cards', 'add_cards'])
…
bakert commented 6 years ago

Handled the bad data with a rollback in 8a754cdc. The tappedout scraper now runs to completion on local, albeit skipping decks with split cards and repeatedly emitting "Cache entry deserialization failed, entry ignored".
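
For illustration, the shape of that fix is a rollback in the error path so a skipped deck can never leave a BEGIN open. This sketch uses hypothetical stand-ins for the project's db helpers and exception type, not the actual code from 8a754cdc:

class InvalidDataException(Exception):
    pass

class FakeDB:
    # Stand-in that only tracks transaction depth.
    def __init__(self):
        self.depth = 0
    def begin(self, label):
        self.depth += 1
    def commit(self):
        self.depth -= 1
    def rollback(self):
        self.depth -= 1

db = FakeDB()

def add_cards(deck):
    db.begin('add_cards')
    if not deck['cards']:
        raise InvalidDataException('Did not find any cards looking for `Wax`')
    db.commit()

def scrape(decks):
    for deck in decks:
        try:
            add_cards(deck)
        except InvalidDataException as error:
            db.rollback()  # close the BEGIN opened by add_cards before moving on
            print('Skipping {} because of {}'.format(deck['slug'], error))

scrape([{'slug': '05-06-18-pd-infect', 'cards': []},
        {'slug': 'gw-populate-aggro', 'cards': ['Forest']}])
assert db.depth == 0  # every deck, skipped or not, leaves the transaction closed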

bakert commented 6 years ago

https://tappedout.net/api/collection/collection:deck/pd-goblin-tribal/ is a 404, which might explain the error seen on prod.
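
If the API request 404s and the fetch helper falls through to returning None, that None is what blows up in raw_deck.update above, so the two symptoms fit together. A hedged sketch of a guard for that case (the helper names, the slug field, and the requests-based fetch are assumptions, not the actual decksite code):

from typing import Dict, Optional

import requests

def fetch_deck_details(raw_deck: Dict) -> Optional[Dict]:
    url = 'https://tappedout.net/api/collection/collection:deck/{}/'.format(raw_deck['slug'])
    response = requests.get(url)
    if response.status_code != 200:
        return None  # deck deleted or renamed on TappedOut
    return response.json()

def scrape_one(raw_deck: Dict) -> None:
    details = fetch_deck_details(raw_deck)
    if details is None:
        print('Skipping {}: TappedOut returned no deck details'.format(raw_deck['slug']))
        return
    raw_deck.update(details)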

bakert commented 5 years ago

This works now on local and prod, so closing.