janw / podcast-archiver

Archive all your favorite podcasts
MIT License

What exactly is the difference between running with and without `--update`? #118

Closed mainrs closed 1 month ago

mainrs commented 8 months ago

From a user's point of view (mine), running without `--update` should just download non-existing files. The presence of `--update` makes me assume that this is not the case. The program actually downloads all files when the flag is not passed.

Passing the flag would only download non-existing files.

It would be nice if you could explain the behavior! Thank you!

janw commented 8 months ago

Hey! Indeed, the option is not super useful with most podcast feeds. It's mostly geared towards those with large numbers of episodes that are spread across several pages: if, for example, all but the latest episode have been downloaded already, the archiver would only fetch the latest page of the feed and skip fetching any further pages upon reaching the first existing file. So essentially it attempts to avoid making more requests than are necessary to update the archive.

I hope that makes sense. In any case, thank you very much for the feedback. I have been contemplating whether that option is still worth keeping around; back when I first implemented it, I might have overestimated the number of paginated feeds out there.

mainrs commented 8 months ago

What I noticed is that, when not using the flag, my sync process takes over 30 minutes. I have downloaded 10 feeds in total, 4 of which have over 1000 episodes each.

When using the flag, the update process takes less than one minute, and I get only the "difference".

Any idea what is happening? Does the CLI check whether every single file exists locally when not using the flag, then?

janw commented 8 months ago

That sounds like paginated feeds to me, and it's exactly what the option was made for. And you're right: without `--update`, each episode is checked for existence, all the way back through the entire feed.
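
As a rough illustration of the two modes described in this thread, here is a minimal Python sketch. It is not the archiver's actual code: `sync`, `pages`, and `existing` are hypothetical names, and each feed page is modeled as a plain list of episode filenames, newest first. It only shows why stopping at the first existing file saves page requests.

```python
from typing import Iterable, List, Set, Tuple

Page = List[str]  # hypothetical model of one feed page: filenames, newest first


def sync(pages: Iterable[Page], existing: Set[str], update: bool) -> Tuple[List[str], int]:
    """Return (episodes to download, number of feed pages fetched)."""
    to_download: List[str] = []
    pages_fetched = 0
    for page in pages:  # each iteration stands in for one feed page request
        pages_fetched += 1
        for episode in page:
            if episode in existing:
                if update:
                    # Update mode: the first existing file ends the run,
                    # so no further pages are requested.
                    return to_download, pages_fetched
                continue  # Default mode: skip it, but keep checking older episodes.
            to_download.append(episode)
    return to_download, pages_fetched


# Three pages, with only the newest episode missing from the local archive.
feed = [["ep5.mp3", "ep4.mp3"], ["ep3.mp3", "ep2.mp3"], ["ep1.mp3"]]
archive = {"ep4.mp3", "ep3.mp3", "ep2.mp3", "ep1.mp3"}

with_update = sync(iter(feed), archive, update=True)
without_update = sync(iter(feed), archive, update=False)
```

In this toy example both calls select the same single episode, but the update-mode call touches one page while the default call walks all three. On feeds with many pages, that difference in requests would be consistent with the timings reported above.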

Do you mind sharing any of the very large feeds so I can include them for testing?