TechAndCheck / zenodotus

MIT License

Error thrown while scraping Facebook #165

Closed reefdog closed 3 weeks ago

reefdog commented 2 years ago

I ran a Facebook scrape locally, and kept getting this error from Sidekiq:

2022-03-31T21:34:35.915Z pid=35137 tid=1azl class=ScraperJob jid=6fb4a8cf175812665fbebe99 ERROR: Error performing ScraperJob (Job ID: 71dee47c-654b-4803-8092-0db835c6ccdd) from Sidekiq(default) in 734.38ms: TypeError (Return value: Expected type T::Array[T::Hash[T.untyped, T.untyped]], got type TrueClass
Caller: /Users/justin/Projects/duke/zenodotus/app/media_sources/facebook_media_source.rb:26
Definition: /Users/justin/Projects/duke/zenodotus/app/media_sources/facebook_media_source.rb:61):

Each time it threw, the job was duplicated, and thus appeared again in the Active Scrapes table (on the Jobs Status page).
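
For illustration only: the TypeError above is the kind of error a runtime return-type check raises when a method is declared to return an array of hashes but its body actually returns a boolean. The sketch below mimics that check in plain Ruby, without sorbet-runtime. All method names here (`assert_array_of_hashes!`, `buggy_extract`, `fixed_extract`) are hypothetical and not taken from facebook_media_source.rb; the real cause may well be different.

```ruby
# Hypothetical stand-in for a runtime return-type check.
# This is NOT Sorbet's implementation, only an imitation of its effect.
def assert_array_of_hashes!(value)
  ok = value.is_a?(Array) && value.all? { |item| item.is_a?(Hash) }
  raise TypeError, "Expected Array of Hash, got type #{value.class}" unless ok

  value
end

# Hypothetical buggy method: the last expression is a predicate,
# so the method returns true/false instead of the collected hashes.
def buggy_extract(posts)
  results = posts.map { |post| { text: post } }
  results.any?
end

# Hypothetical fix: make the array of hashes the final expression.
def fixed_extract(posts)
  posts.map { |post| { text: post } }
end
```

Passing `buggy_extract(["a"])` through `assert_array_of_hashes!` raises a TypeError that mentions TrueClass, while `fixed_extract(["a"])` passes through unchanged. A useful first check might be what the method defined at line 61 returns on its last line.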

reefdog commented 2 years ago

Several minutes later, it's still adding new scrapes:

[image]

cguess commented 2 years ago

Just thinking out loud. I have retry-on-failure enabled for jobs. I'm guessing this bug is causing the job to be re-queued every time it fails.

reefdog commented 2 years ago

I sort of reported two things in this one issue: "Facebook scraping failed" and "Facebook scraping seems to be duplicating". The former is probably an actual issue to suss out; the latter is probably me not realizing that retry was enabled, and so the "duplicate" scrapes were the retries.

I do think this means we should find some way to indicate this in the Active Scrapes table. For instance, rather than showing failed tries and currently-running retries as siblings in the table, maybe we recognize retries and nest/group them all together? So you get a single row for that scrape, showing just the most recent attempt, with a link that's something like "See previous attempts" that spawns a modal (or expands the table)?
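
A rough sketch of that grouping, assuming each attempt row carries some identifier shared by all tries of the same scrape. The `Attempt` struct, its fields, and `group_attempts` are hypothetical and only illustrate the shape of the data; they are not the app's actual models.

```ruby
# Hypothetical row shape: every try of the same scrape shares a scrape_id.
Attempt = Struct.new(:scrape_id, :status, :started_at, keyword_init: true)

# Collapse sibling attempts into one entry per scrape:
# the most recent attempt is shown, earlier ones are kept for
# a "See previous attempts" link.
def group_attempts(attempts)
  attempts.group_by(&:scrape_id).map do |scrape_id, tries|
    ordered = tries.sort_by(&:started_at)
    { scrape_id: scrape_id, current: ordered.last, previous: ordered[0...-1] }
  end
end

rows = [
  Attempt.new(scrape_id: "a", status: "failed",  started_at: 1),
  Attempt.new(scrape_id: "a", status: "running", started_at: 2),
  Attempt.new(scrape_id: "b", status: "running", started_at: 1)
]
grouped = group_attempts(rows)
```

Here `grouped` has one entry per scrape, so the table would render two rows instead of three, with the failed try for scrape "a" tucked under `previous`.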

reefdog commented 2 years ago

Regardless, the retry conversation should probably be a second issue and my apologies for mixing it into this bug report.