rnkn / fountain-mode

Emacs major mode for screenwriting in Fountain plain-text markup
https://fountain-mode.org
GNU General Public License v3.0

export to epub with embedded reference pagination #103

Closed sten0 closed 5 years ago

sten0 commented 5 years ago

Continuing from https://github.com/rnkn/fountain-mode/issues/96#issuecomment-517298094

> Proper pagination is essential for production, and really even for sending out a script, because people will only ever say "on page 43, where James comes in...".

Yup :-)

> I have a Kobo too, and spent some time trying to reflow scripts into ePub or HTML or RTF... but in the end the best thing I found was to use pdfcrop to just crop down to the page's text body. The text was still small, but not uncomfortably so.

Sadly I've found pdfcrop's results to be too small to work with...plus no absolute pagination references.

> While I think I have an issue here for ePub export, it probably won't be something I'll implement. The package is FSF now, so there's more chance someone with a better understanding of the ePub format will implement something.

Well, the whole point of proposing RST in #96 was that you don't need to worry about ePub internals... We already have Pandoc, and Sphinx, and rst2epub, et al. for that. I'm not familiar with Pandoc as a markup language or intermediary format, but it might be more suitable than RST.

The export-to-epub function sequence could be:

  1. Easy: Convert between markup formats (Fountain to RST)
  2. Moderate: Generate an absolute page count using a yet-to-be-determined method (other screenwriting applications can predict page count and reading time, so this is definitely possible). If the PostScript-exported result is the most reliable, then for each page: get the page # from the PS file, the last line of the page, and a preceding block of text of a yet-to-be-determined size (maybe 20% of the page?).
  3. Easy: Pattern search for this large block of text in the RST and insert an RST PageBreak (see #96 for links) after the last word of the block, splitting the line in the RST file if necessary. The large search pattern is there to guard against false positives (e.g. a character named Jaime says "Oh No!" on a single line many times on one page). It might also be necessary to embed something like page ###Page### right before the PageBreak.
  4. Call an external program to do the rst2epub conversion and print the epub filename location to the minibuffer and *Messages* as a side-effect. From what I've read, most of the better examples of these programs will respect the PageBreak and canonical page numbering, and any that don't should have a bug filed against them.
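
Steps 2 and 3 above could be sketched roughly as follows, in Python rather than Emacs Lisp purely for illustration. The sketch assumes the paginated export has already been converted to plain text with a form feed between pages (`pdftotext` output looks like this, for instance). It uses an RST comment as a stand-in for whichever PageBreak directive the chosen rst2epub tool understands. The names `page_tails` and `insert_page_markers`, the 20% default, and the marker text are all hypothetical.

```python
import re

FORM_FEED = "\f"  # assumed page separator in the paginated plain-text export


def page_tails(paged_text, fraction=0.2):
    """Return (page_number, tail) pairs, where tail is roughly the last
    `fraction` of the words on that page. The final page is skipped
    because nothing needs to break after it."""
    pages = paged_text.rstrip(FORM_FEED).split(FORM_FEED)
    tails = []
    for number, page in enumerate(pages[:-1], start=1):
        words = page.split()
        if not words:
            continue  # blank page: nothing to anchor on
        keep = max(1, round(len(words) * fraction))
        tails.append((number, " ".join(words[-keep:])))
    return tails


def insert_page_markers(rst_text, tails, marker="\n\n.. end of page {n} (placeholder for a PageBreak)\n\n"):
    """Insert `marker` after the first occurrence of each tail, searching
    forward from the previous match. Whitespace is matched loosely because
    line wrapping differs between the paginated export and the RST."""
    out = []
    pos = 0
    for number, tail in tails:
        words = tail.split()
        if not words:
            continue
        pattern = re.compile(r"\s+".join(re.escape(w) for w in words))
        match = pattern.search(rst_text, pos)
        if match is None:
            continue  # tail not found: leave this page unmarked
        out.append(rst_text[pos:match.end()])
        out.append(marker.format(n=number))
        pos = match.end()
    out.append(rst_text[pos:])
    return "".join(out)
```

Anything the paginated export adds at the bottom of a page, such as (MORE) or (CONT'D) markers, would have to be stripped from the tails before matching, since that text does not exist in the RST.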

sten0 commented 5 years ago

P.S. This could also be solved from the Pandoc side if someone filed a bug requesting Fountain markup support. [edit: well...mostly... I'm not sure if/how Pandoc could/would handle embedded absolute pagination references]

rnkn commented 5 years ago

Hmm, I think we have different ideas of what an easy path entails... going via RST and Pandoc... seems difficult. (Tbh, once the PostScript export is solid I'd like to remove the LaTeX export, because it is both too complex and doesn't work well enough.)

Fountain Mode has an internal export engine that would be able to export to ePub pretty easily; it's just a matter of someone writing a working template for it. Given that ePub is just an augmentation of HTML, some of the work is already done. I can see the value in reading scripts in ePub, but it's going to be at least 2 years down the track before I'd be able to spend any time on it.

The HTML template is mobile responsive, so in theory it should work on a Kobo, although I think I tested it and found the single HTML page too big for the Kobo's memory.

What do you mean about pdfcrop not having page references? It's just the PDF, so the page = the page. So I guess I've misunderstood what you mean by absolute page reference.

rnkn commented 5 years ago

I just tested exporting from Fountain to HTML, then converting to ePub via Pandoc. I needed to save an external CSS stylesheet and remove a couple of white-space: pre-wrap lines, but the results work fine. Here's the stylesheet: https://gist.github.com/rnkn/9e7e57c37f0f49979d2e7028b2ef8a4b
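
For anyone wanting to repeat this, the conversion might be scripted along these lines. This is only a sketch: it is not clear from the above whether the pre-wrap lines sit in the HTML or in the stylesheet, so it cleans copies of both, and it assumes each such declaration is on a line of its own. The function names and temporary file names are hypothetical. The Pandoc options used are the standard `--css` and `-o`.

```python
import re
import subprocess
import tempfile
from pathlib import Path

PRE_WRAP = re.compile(r"white-space\s*:\s*pre-wrap")


def strip_pre_wrap(text):
    """Drop every line that sets white-space: pre-wrap."""
    lines = text.splitlines(keepends=True)
    return "".join(line for line in lines if not PRE_WRAP.search(line))


def pandoc_command(html_path, css_path, epub_path):
    """Build the Pandoc call; --css embeds the stylesheet in the ePub."""
    return ["pandoc", str(html_path), "--css", str(css_path), "-o", str(epub_path)]


def html_to_epub(html_path, css_path, epub_path):
    """Clean copies of the exported HTML and the stylesheet, then let
    Pandoc convert them. Requires pandoc on PATH."""
    with tempfile.TemporaryDirectory() as tmp:
        clean_html = Path(tmp) / "script.html"
        clean_css = Path(tmp) / "script.css"
        clean_html.write_text(
            strip_pre_wrap(Path(html_path).read_text(encoding="utf-8")),
            encoding="utf-8",
        )
        clean_css.write_text(
            strip_pre_wrap(Path(css_path).read_text(encoding="utf-8")),
            encoding="utf-8",
        )
        subprocess.run(pandoc_command(clean_html, clean_css, epub_path), check=True)
    return Path(epub_path)
```

Pandoc infers the input and output formats from the `.html` and `.epub` extensions, so no explicit format flags are needed.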

As for having some sort of absolute page count in ePub, I can't see the value in that over just using PDFs so I'll class that as outside the scope of the project.