macmillanpublishers / bookmaker

Macmillan's Bookmaker tool
ISC License

urls with bracket characters [ ] cause epubcheck to fail #154

Closed: mattretzer closed this 7 years ago

mattretzer commented 7 years ago

I was able to work around this in a manuscript by replacing the brackets with their respective URL-encoded characters: %5B and %5D. With these, the links still work and the epub passes epubcheck. But we should add this fix/workaround to bookmaker, either in epubmaker.rb or epubmaker_preprocessing. Wacky file: https://www.dropbox.com/sh/v8qk02keyg7xrdr/AAByKlcb890TPXap3mpBOnx6a?dl=0
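For reference, the manual workaround could be sketched in Ruby roughly like this. The helper name `encode_brackets` and the example URL are made up for illustration; nothing here is existing bookmaker code.

```ruby
# Hypothetical helper, not part of bookmaker: percent-encode the two
# bracket characters that make epubcheck reject a URL.
BRACKET_ENCODINGS = { "[" => "%5B", "]" => "%5D" }.freeze

def encode_brackets(url)
  # String#gsub accepts a Hash as the replacement: each matched
  # character is looked up and swapped for its encoded form.
  url.gsub(/[\[\]]/, BRACKET_ENCODINGS)
end

puts encode_brackets("http://example.com/list[0]/item")
# prints http://example.com/list%5B0%5D/item
```

Keeping the mapping in a hash means other problem characters can be added later without touching the method itself.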

nelliemckesson commented 7 years ago

Yep. This should actually happen as part of htmlmaker, at the end of the process. We should add a step to fix urls that are improperly encoded. This can be either a Ruby method, or a standalone JavaScript function (similar to parts.js, inlines.js, etc.), that we can add to as needed, when we encounter other possible encoding problems in URLs. We'll want to grab every span.spanhyperlinkurl and replace bad characters in the link src and in the span text with the correct encoding.
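A rough Ruby sketch of that step is below. It assumes the hyperlink markup looks like `<span class="spanhyperlinkurl"><a href="...">...</a></span>` with no nested spans, which may not match what htmlmaker actually emits. The names `URL_ENCODINGS`, `HYPERLINK_SPAN`, and `fix_hyperlink_spans` are invented for the example. A real version would probably use an HTML parser rather than a regex.

```ruby
# Hypothetical sketch, not bookmaker code. Extend this table as other
# URL encoding problems turn up.
URL_ENCODINGS = { "[" => "%5B", "]" => "%5D" }.freeze

# ASSUMPTION: each hyperlink is wrapped in exactly this opening tag and
# contains no nested <span>, so a non-greedy match up to the first
# closing </span> captures the whole element.
HYPERLINK_SPAN = %r{<span class="spanhyperlinkurl">.*?</span>}m

def fix_hyperlink_spans(html)
  # Rewrite only the matched spans; brackets elsewhere in the document
  # (ordinary prose, for instance) are left alone.
  html.gsub(HYPERLINK_SPAN) do |span|
    # Encodes brackets in both the link attribute and the visible text.
    span.gsub(/[\[\]]/, URL_ENCODINGS)
  end
end

html = '<p>See <span class="spanhyperlinkurl">' \
       '<a href="http://example.com/a[1]">http://example.com/a[1]</a>' \
       '</span> and [note].</p>'
puts fix_hyperlink_spans(html)
```

In this example the bracketed `[note]` in the running text stays as-is, while both copies of the URL inside the span are encoded.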

mattretzer commented 7 years ago

This issue came up again with the wacky file below. Notably, the pdf was messed up too: the resulting pdf has 'missing.jpg' scattered throughout, likely because the brackets are being interpreted as part of a regex. https://www.dropbox.com/sh/k6qmsbzulfodlta/AACXx8-JaJ9Ez9nB1kauXDCba?dl=0
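If the regex guess is right, the failure mode would look something like the Ruby sketch below. This only illustrates the mechanism. Where (or whether) bookmaker builds a pattern from a path or URL is not confirmed here, and the method names and sample strings are made up.

```ruby
# Hypothetical illustration, not bookmaker code.

# Builds a pattern straight from the string, so "[1]" is read as a
# character class matching the single character "1".
def regex_match?(needle, haystack)
  Regexp.new(needle).match?(haystack)
end

# Regexp.escape backslash-escapes regex metacharacters, so the brackets
# (and the dot) are matched literally.
def literal_match?(needle, haystack)
  Regexp.new(Regexp.escape(needle)).match?(haystack)
end

src  = "images/fig[1].jpg"
html = '<img src="images/fig[1].jpg"/>'

puts regex_match?(src, html)   # prints false
puts literal_match?(src, html) # prints true
```

When no regex features are needed at all, a plain `String#include?` check avoids the problem entirely.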