aajanki / yle-dl

Download videos from Yle servers
https://aajanki.github.io/yle-dl/index-en.html
GNU General Public License v3.0
309 stars, 51 forks

Series and title misinterpreted when program doesn't have a title specified in Areena #331

Status: Closed (maxminstr closed 2 years ago)

maxminstr commented 2 years ago

It seems that yle-dl misinterprets the ${series} and ${title} variables when an audio program episode doesn't have a title specified in Areena.

In these instances yle-dl interprets the series name as the title and leaves the ${series} variable empty.

Here is an example using an audio program series called Iltasoitto:

https://areena.yle.fi/podcastit/1-63391641

https://areena.yle.fi/podcastit/1-63135148

Because the ${series} variable is empty, output templates don't work correctly (I've included the full command):

yle-dl https://areena.yle.fi/podcastit/1-63135148 --vfat --create-dirs --no-overwrite --output-template "YLE Radio 1 - ${series}/${timestamp} - YLE Radio 1 - ${series} - ${title} - ${program_id}"

On Windows this results in a failed download:

Output file: YLE Radio 1 - \2022-09-09T22_20 - YLE Radio 1 - - Iltasoitto - 1-63135148.mp3
file:YLE Radio 1 - \2022-09-09T22_20 - YLE Radio 1 - - Iltasoitto - 1-63135148.mp3: No such file or directory

On Linux the file gets saved, but it has practically no name at all ("Yle Radio 1 -/- YLE Radio 1 - - -.mp3").

My suggestions:

  1. Please interpret series and title correctly when there is no title in Areena. Maybe use a default title as Areena does, i.e. the date?

  2. Please consider adding a placeholder value for unavailable metadata fields. Even if this missing-variable issue is resolved, a variable might still end up empty in other scenarios. youtube-dl uses "NA" as the default value in such cases (it can be changed with the --output-na-placeholder flag). This would help ensure that output templates don't cause failed downloads.

aajanki commented 2 years ago

The series/episode title detection is currently a bit of a hack. I'll see if I can make it more robust. Some way to optionally specify the placeholder value for unavailable fields also sounds like a good idea.

aajanki commented 2 years ago

I have implemented an --output-na-placeholder NA option for setting a replacement value for missing metadata fields. The default is still to replace missing fields with an empty string.
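The effect of such a placeholder on an output template can be sketched in Python. This is only an illustration, not yle-dl's actual code: `render_template`, `_FieldsWithPlaceholder` and the metadata dictionary are hypothetical names, and the standard library's `string.Template` is assumed here simply because it uses the same `${name}` syntax.

```python
from string import Template


class _FieldsWithPlaceholder(dict):
    """Mapping that returns a placeholder for missing or empty fields.

    Hypothetical helper for illustration; not taken from yle-dl.
    """

    def __init__(self, metadata, placeholder):
        # Drop empty values so they are treated the same as missing keys.
        super().__init__({k: v for k, v in metadata.items() if v})
        self.placeholder = placeholder

    def __missing__(self, key):
        return self.placeholder


def render_template(template, metadata, na_placeholder=""):
    """Fill ${name} fields, substituting na_placeholder where data is absent."""
    fields = _FieldsWithPlaceholder(metadata, na_placeholder)
    return Template(template).substitute(fields)


metadata = {"series": "", "title": "Iltasoitto", "program_id": "1-63135148"}
template = "${series} - ${title} - ${program_id}"

# An empty placeholder mirrors the default: the missing field simply vanishes.
print(render_template(template, metadata))
# With "NA" the missing field stays visible in the resulting name.
print(render_template(template, metadata, na_placeholder="NA"))
```

With the empty default, a missing ${series} leaves a dangling separator in the name, which is how a directory component can end up blank. A non-empty placeholder keeps every path component non-empty.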

I also modified the title detection. Now the series title should get set correctly, and the date is used as the episode title, as in Areena.

Note that you should use single quotes for the output template at least on Linux. Otherwise $foo gets interpreted as a shell variable and that messes up the template.
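To make the quoting problem concrete, here is a rough Python imitation of what a POSIX shell does to `${name}` inside double quotes when the variable is unset. It is a simplified sketch, not a real shell: `expand_like_double_quotes` is a hypothetical function, it handles only the `${name}` form, and the template is shortened for the example.

```python
import re


def expand_like_double_quotes(text, shell_vars=None):
    """Imitate shell expansion of ${name}: unset variables become empty strings.

    Simplified illustration only; a real shell supports many more forms.
    """
    shell_vars = shell_vars or {}
    return re.sub(
        r"\$\{(\w+)\}",
        lambda match: shell_vars.get(match.group(1), ""),
        text,
    )


template = "YLE Radio 1 - ${series}/${timestamp} - ${title}"

# Double quotes: the shell expands the variables before yle-dl ever runs.
double_quoted = expand_like_double_quotes(template)
# Single quotes: the text is passed through unchanged.
single_quoted = template

print(repr(double_quoted))
print(repr(single_quoted))
```

In the double-quoted case all field names are gone by the time the program receives the argument, so no metadata can be filled in, whatever the title detection does.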

After the changes, this command will download a file named YLE Radio 1 - Iltasoitto/2022-09-09T22_20 - YLE Radio 1 - Iltasoitto - pe 9.9.2022 - 1-63135148.mp3:

yle-dl https://areena.yle.fi/podcastit/1-63135148 --vfat --create-dirs --no-overwrite --output-template 'YLE Radio 1 - ${series}/${timestamp} - YLE Radio 1 - ${series} - ${title} - ${program_id}' --output-na-placeholder NA