JupiterBroadcasting / show-scraper

Scraper written in Python to convert episodes hosted on Fireside or jupiterbroadcasting.com into Hugo Markdown files

People profiles update logic - if updated on Fireside, how are they scraped or not? #13

Open gerbrent opened 2 years ago

gerbrent commented 2 years ago

I conducted an experiment that led to some questions:

Given a host/guest profile is updated on Fireside..

Currently:

Steps to reproduce:

Expected Behaviour

Example: Alecks Gates

A typo in his description was fixed on Fireside just now, from `The "Offical" Podcasting 2.0 consultant.` to `The "Official" Podcasting 2.0 consultant.` (i.e. Offical -> Official).

The Hugo profile still contains the typo: `"bio": "The \"Offical\" Podcasting 2.0 consultant.",`

https://github.com/JupiterBroadcasting/jupiterbroadcasting.com/blob/e061ece2275ce0b3010223ba0be8eba0986e615d/data/people/agates.json#L5
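For illustration, the behaviour I would expect could look roughly like the sketch below. This is not the scraper's actual code: `sync_person`, its parameters, and the one-file-per-person layout are assumptions based only on the `data/people/agates.json` path above. The idea is to compare the freshly scraped profile with the JSON already on disk and rewrite the file only when something differs.

```python
import json
from pathlib import Path


def sync_person(data_dir: Path, username: str, scraped: dict) -> bool:
    """Write the scraped profile to <data_dir>/<username>.json if it differs.

    Hypothetical helper, not part of show-scraper.
    Returns True when the file was created or overwritten.
    """
    path = data_dir / f"{username}.json"
    if path.exists():
        existing = json.loads(path.read_text(encoding="utf-8"))
        if existing == scraped:
            return False  # nothing changed on Fireside, leave the file alone
    # Create or overwrite the profile with the freshly scraped data.
    path.write_text(
        json.dumps(scraped, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return True
```

With something like this, the corrected bio would replace the stale one on the next scraper run, while unchanged profiles would produce no diff in the Hugo repository.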

kbondarev commented 2 years ago

Okay. I'll modify the scraper to overwrite hosts, guests, and sponsors. Currently it also overwrites the latest episode of each show. Is it safe to assume older episodes won't be modified? Should it maybe overwrite the 3 or even 10 latest episodes of each show? I recently saw a case where the link to the meetup page was modified after an episode had already been published.

Maybe it would be good to run it over all the shows one last time before the WordPress shutdown.
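A rough sketch of the "latest N episodes" idea, to make the question concrete. The function name `episodes_to_overwrite` and the default window of 3 are made up for this example and are not in the scraper today; it assumes each show's episodes can be identified by an integer episode number.

```python
def episodes_to_overwrite(episode_numbers: list[int], window: int = 3) -> list[int]:
    """Return the `window` highest episode numbers, newest first.

    Hypothetical helper: episodes outside this window would be left untouched.
    """
    if window <= 0:
        return []
    # De-duplicate, sort newest first, and keep only the most recent ones.
    return sorted(set(episode_numbers), reverse=True)[:window]
```

For the one-off full run before the shutdown, the window could simply be set to the total number of episodes in the show.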

kbondarev commented 2 years ago

Also, what about avatar files? Should they be overwritten as well, or not?
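One option, sketched under assumptions: overwrite an avatar only when the downloaded bytes differ from the file already on disk, so unchanged images don't show up as changes. `write_avatar_if_changed` is a hypothetical name, and the download step is left out.

```python
from pathlib import Path


def write_avatar_if_changed(path: Path, new_bytes: bytes) -> bool:
    """Overwrite the avatar at `path` only if its content differs.

    Hypothetical helper, not part of show-scraper.
    Returns True when the file was created or overwritten.
    """
    if path.exists() and path.read_bytes() == new_bytes:
        return False  # identical image, nothing to do
    # Make sure the target directory exists before writing.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(new_bytes)
    return True
```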