jgm / pandoc

Universal markup converter
https://pandoc.org

Add flag to not include audio/video resources in epub #5644

Closed: NightMachinery closed this issue 5 years ago

NightMachinery commented 5 years ago

I use EPUBs for reading on a Kindle. The time spent downloading the audio/video resources linked from the input HTML, and the file size they add to the EPUB, are not acceptable for me. A flag to disable this (perhaps --no-audio and --no-video) would be very welcome. See also: https://github.com/jgm/pandoc/issues/2473

NightMachinery commented 5 years ago

I've since noticed that my problem arose because my HTML files didn't have any extension; adding -f html automatically disabled audio inclusion. I still think a warning when the input has no extension would be a good idea, though.

PS: My code used process substitution which left out the extension:

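html2epub() {
    # Usage: html2epub TITLE AUTHOR FILE...
    # Note: map, strip, and ec are helpers not shown in this snippet.
    pandoc --toc -s -f html <(map '
<h1>$(strip $1 ".html")</h1>
$(cat $1)' "${@:3}") --epub-metadata <(ec "<dc:title>$1</dc:title> <dc:creator> $2 </dc:creator>") -o "$1.epub"
}
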
mb21 commented 5 years ago

From the MANUAL:

If no input file is specified (so that input comes from stdin), or if the input files’ extensions are unknown, the input format will be assumed to be Markdown.

This is well documented, and adding a warning would probably annoy people using pandoc that way.
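
To make the quoted rule concrete, here is a rough shell sketch of the decision it describes. The function name guess_format is hypothetical, and the extension table is only a small illustrative subset; this is an imitation of the documented behaviour, not pandoc's actual code. The /dev/fd/63 path is an assumption about what process substitution typically yields on Linux.

```shell
# Hypothetical helper: a simplified imitation of the rule quoted from the manual.
# The extension table is a small illustrative subset, not pandoc's full list.
guess_format() {
    # No argument models "no input file is specified" (input from stdin).
    if [ "$#" -eq 0 ]; then
        echo markdown
        return
    fi
    case "$1" in
        *.html|*.htm)    echo html ;;
        *.md|*.markdown) echo markdown ;;
        *.tex)           echo latex ;;
        *.docx)          echo docx ;;
        *)               echo markdown ;;   # unknown or missing extension
    esac
}

guess_format page.html    # known extension
guess_format /dev/fd/63   # typical process-substitution path: no extension
guess_format              # stdin
```

Under this sketch, an input supplied through process substitution falls into the last branch, which is why passing -f html explicitly changes the result.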

jgm commented 5 years ago

I've added a warning for the case where a filename is given but the extension is unknown. But I agree, when input is from stdin, we don't want a warning.
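
The distinction drawn here (warn for a named file with an unknown extension, stay silent for stdin) could be sketched in shell roughly as follows. The function name warn_if_unknown_extension, the list of known extensions, and the message wording are all assumptions made for illustration; none of them is taken from pandoc's source or output.

```shell
# Hypothetical sketch of the behaviour described above; the message text is
# illustrative and is not pandoc's actual output.
warn_if_unknown_extension() {
    # No argument models input from stdin: stay silent.
    [ "$#" -eq 0 ] && return 0
    case "$1" in
        *.html|*.htm|*.md|*.markdown|*.tex|*.docx)
            ;;   # known extension (illustrative subset): no warning
        *)
            echo "warning: cannot deduce format of '$1', assuming markdown" >&2
            ;;
    esac
}

warn_if_unknown_extension page.html   # silent
warn_if_unknown_extension page        # warns on stderr
warn_if_unknown_extension             # silent (stdin)
```

Writing the warning to stderr keeps it out of any converted output sent to stdout.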