TryGhost / migrate

MIT License

Substack - does not migrate/download podcast episodes #971

Status: Closed (jonaharagon closed this issue 8 months ago)

jonaharagon commented 10 months ago

I'm using the CLI migrate tool on my Substack and podcast audio doesn't get scraped and added to posts. I see code which indicates that audio cards should be generated for podcast episodes, but I don't see that happening.

Is there a way to get the scraper to download podcast MP3 files, and also set the og_description field to the MP3 location on the imported posts, to make the importer compatible with the Wave official podcast theme's RSS template?

PaulAdamDavis commented 8 months ago

Hi @jonaharagon, thanks for reporting. The posts.csv file in a Substack export has a podcast_url column. Since mid-2022, Substack has left this column blank, even when the post is a podcast. Substack recently updated the HTML for these pages to include an <audio> element, which we now scrape and use when available. The AssetScraper tasks will download these files.
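The fallback described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in @tryghost/migrate: the helper names (`AudioSrcParser`, `find_audio_url`) are hypothetical, and it assumes the CSV row is available as a dict and the post page as an HTML string.

```python
from html.parser import HTMLParser


class AudioSrcParser(HTMLParser):
    """Collects the first audio URL found in an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.src = None
        self._in_audio = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "audio":
            self._in_audio = True
            # <audio src="..."> carries the URL directly.
            if self.src is None and attrs.get("src"):
                self.src = attrs["src"]
        elif tag == "source" and self._in_audio and self.src is None:
            # <audio><source src="..."></audio> carries it on a child element.
            if attrs.get("src"):
                self.src = attrs["src"]

    def handle_endtag(self, tag):
        if tag == "audio":
            self._in_audio = False


def find_audio_url(csv_row, page_html):
    """Prefer the CSV's podcast_url; fall back to an <audio> element in the page."""
    url = (csv_row.get("podcast_url") or "").strip()
    if url:
        return url
    parser = AudioSrcParser()
    parser.feed(page_html)
    return parser.src  # None when the page has no audio element
```

Checking the CSV column first keeps older exports (where podcast_url is still populated) working without scraping the page.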

This is fixed and released in @tryghost/migrate@0.40.0.

If you want these URLs added to og_description to use with Wave, you'll have to manually edit these tools. While Wave is an official theme, making adjustments to the tools to support one theme is not a route I want to go down.
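One way to do this without touching the tools is to post-process the converted posts before importing them. The sketch below is a hedged illustration, not part of @tryghost/migrate: it assumes each post is a dict with an `html` string and that the audio URL appears as a double-quoted `src` on an <audio> or nested <source> tag. The function name and the post structure are hypothetical and may not match the actual import file layout.

```python
import re

# Matches src on an <audio> tag, or on a <source> directly inside one.
# Assumes double-quoted attribute values (an assumption about the generated HTML).
AUDIO_SRC = re.compile(
    r'<audio\b[^>]*?\ssrc="([^"]+)"'
    r'|<audio\b[^>]*>\s*<source\b[^>]*?\ssrc="([^"]+)"',
    re.IGNORECASE,
)


def add_audio_to_og_description(posts):
    """Return copies of posts with og_description set to the first audio URL found."""
    updated = []
    for post in posts:
        post = dict(post)  # shallow copy so the input list is left untouched
        match = AUDIO_SRC.search(post.get("html") or "")
        if match:
            post["og_description"] = match.group(1) or match.group(2)
        updated.append(post)
    return updated
```

Posts without an audio element are passed through unchanged, so any existing og_description on text-only posts is preserved.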

jonaharagon commented 8 months ago

This isn't specific to Wave; it's your official recommendation for adding podcast support to any theme: https://ghost.org/tutorials/custom-rss-feed/

But I understand, thank you for fixing this functionality 👍