olivierkes / manuskript

An open-source tool for writers
http://www.theologeek.ch/manuskript
GNU General Public License v3.0
1.74k stars 228 forks

Compiling over pandoc in OpenDocument, Epub or Docx Format CR missing #868

Open Johnny-English-007 opened 3 years ago

Johnny-English-007 commented 3 years ago

When I compile my book with pandoc in the above-mentioned formats, the compiler does not respect my carriage returns (new lines) within the text. I tried changing the separator settings in the dialog box, but the compiler doesn't seem to look at them at all. The text comes out in one block until it generates a new heading, which is accurate.

raptor commented 3 years ago

I am having this same issue. My system is Linux x86_64, Manuskript 0.12.0

raptor commented 3 years ago

OK, never mind - I didn't realize that the internal format was Markdown and that two newlines are therefore needed to signify a paragraph break. Sorry for the noise.

TheJackiMonster commented 3 years ago

@Johnny-English-007 Does it work for you as well if you use two carriage returns instead of one? Then the issue could be closed, because I don't think we plan on ditching Markdown as the format.

Johnny-English-007 commented 3 years ago

It's not about my needs but rather about the future of Manuskript. If you cannot guarantee interoperability with any word processor, left and right, across the digital world, Manuskript is useless. Let me explain from the point of view of an author. Manuskript is ingenious for developing your story line, since you can easily get an overview of a complex story and can sort chapters and titles. But you will never hand a *.msk file to your publisher or proofreader; you will hand over a compatible file of your work for Windows, Mac, Linux, etc. If you have to delete thousands of empty lines by hand so your work is finally printable, you will never use Manuskript again. After I compiled my book into an .odt document, I copy/pasted my 100 chapters from Manuskript into my word processor, which works perfectly with Markdown. Manuskript will never be as helpful as MS Word or LibreOffice Writer with all the tools they provide for an author. But by now the Manuskript file is no longer up to date and I will never use it again for this book. This is my reality as an author.

raptor commented 3 years ago

Any export I do from Manuskript correctly converts the double newline to a single paragraph break in whatever other editor I use.

Adding a global search and replace with regex would be helpful in Manuskript: \n to \n\n. As it stands, I had to unzip the .msk and run the regex in another text editor, taking care to skip the header data, or Manuskript would crash when reloading the file after recompression.
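A minimal Python sketch of the \n to \n\n replacement described above, for anyone doing it outside Manuskript. It assumes Unix line endings and that each text file starts with a block of "key: value" metadata lines ended by the first blank line. That header layout is an assumption for illustration, not a documented description of the .msk format, and the function names are hypothetical. Work on a copy of the unzipped project.

```python
import re

# A lone newline: one with a non-newline character on both sides.
# Existing blank lines and the trailing newline are left untouched.
SINGLE_NEWLINE = re.compile(r"(?<=[^\n])\n(?=[^\n])")

# Assumed header shape: "key: value" lines, optionally followed by
# indented continuation lines. Hypothetical, adjust to the real files.
HEADER_LINE = re.compile(r"^(?:[^\s:][^:]*:.*|\s+\S.*)$")


def split_header(text):
    """Return (header, body). The header is empty if none is detected."""
    head, sep, body = text.partition("\n\n")
    lines = head.split("\n")
    if sep and all(HEADER_LINE.match(line) for line in lines):
        return head + sep, body
    return "", text


def single_to_double_newlines(text):
    """Turn lone newlines in the body into paragraph breaks."""
    header, body = split_header(text)
    return header + SINGLE_NEWLINE.sub("\n\n", body)


if __name__ == "__main__":
    sample = "title: Scene\nID: 3\n\nFirst line.\nSecond line.\n\nThird.\n"
    print(single_to_double_newlines(sample))
```

Note that the header check is only a heuristic: a file with no metadata whose first paragraph happens to consist entirely of "word: text" lines would be mistaken for a header and left unchanged.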

TheJackiMonster commented 3 years ago

@raptor We got a new global search with regex support in 0.12.0. So maybe this can be adapted to also get replace functionality.

TheJackiMonster commented 3 years ago

@Johnny-English-007 Okay, I understand. But how exactly do you think this situation could be improved? For automating the removal of empty lines or similar, I assume a search & replace dialog would help. But if, for example, you just want these lines not to show up in the compiled format, then I assume changes in the actual exporter/compiler would be necessary.

From what I know, Manuskript allows using modular exporters, which seem to be fine options for many people, but I guess they don't solve individual problems effectively. This is a problem because, as I understand it, formatting text is quite a subjective thing for an author. So I assume that's something we should improve in the future.

Johnny-English-007 commented 3 years ago

@TheJackiMonster Thank you for your open question. If you allow me to outline a vision for Manuskript: I write my book with headings and scenes. I compile my text in any common word processing format and send it to my stakeholders. They make their adjustments and send a .doc, .docx, .epub, etc. file back to me. I check the file, reject what I don't agree with and accept what is helpful, then I import the file into Manuskript and have a fully congruent version in a .msk file. These tasks go back and forth until I am satisfied and ready to print. I think the design of Manuskript covers this vision more or less. But so far this interoperability has not worked for me at all. So the consequence is that I use Manuskript as the playground for my story, but as soon as playtime is over and things are getting serious I have to finish my book in a common word processor, and this can cover up to one third of the whole development process, which is a pity. I have no idea what this vision means for application development. My gut tells me that pandoc might not be the right solution and maybe there is a better architecture to follow. I assume that your database is actually a flat file, which makes it predestined for HTML, and with this we land directly in XML, which could be far more powerful for covering this interoperability. But as I said, I am just a greenhorn.

TheJackiMonster commented 3 years ago

Thanks for the input. I haven't checked all the features of Pandoc, but I assume the problem is more on our side of the interface. In theory, Pandoc is designed to convert from any document format to any other format quite well. One problem we probably have to figure out is that exporting one document from many scenes, defining a chapter and so on, is quite easy, but splitting one document into chapters and scenes is far more difficult.
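To make the splitting problem concrete, here is a rough Python sketch that cuts a Markdown document back into chapters and scenes. It assumes level-1 headings mark chapters and level-2 headings mark scenes, which is a guess about the document and not how Manuskript's importer works. The function name and the returned structure are hypothetical.

```python
import re

# Only "# Title" and "## Title" are treated as structure. Deeper
# headings stay in the scene text.
HEADING = re.compile(r"^(#{1,2})\s+(.*?)\s*$")


def split_into_chapters(markdown):
    """Return [{'title': str, 'scenes': [{'title': str, 'text': str}]}]."""
    chapters = []

    def current_scene():
        # Text found before any heading goes into untitled containers.
        if not chapters:
            chapters.append({"title": "", "scenes": []})
        if not chapters[-1]["scenes"]:
            chapters[-1]["scenes"].append({"title": "", "text": []})
        return chapters[-1]["scenes"][-1]

    for line in markdown.split("\n"):
        match = HEADING.match(line)
        if match and len(match.group(1)) == 1:
            chapters.append({"title": match.group(2), "scenes": []})
        elif match:
            if not chapters:
                chapters.append({"title": "", "scenes": []})
            chapters[-1]["scenes"].append({"title": match.group(2), "text": []})
        elif line.strip() or (chapters and chapters[-1]["scenes"]):
            # Skip blank lines that come before any scene exists.
            current_scene()["text"].append(line)

    for chapter in chapters:
        for scene in chapter["scenes"]:
            scene["text"] = "\n".join(scene["text"]).strip()
    return chapters
```

Even this small version has to guess: it invents untitled scenes for text that sits directly under a chapter heading, and it would misread a line starting with # inside a code block as a heading. A reviewer's word processor may also have changed the heading levels, which is where the mapping breaks down.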

So a revision process with reviewers and other people can break because of any issue during export and import. I guess what could help most is a more interactive design of the exporter and importer, because with the limited options we have, the mapping between formats can't be tweaked effectively. Also, OPML and Markdown don't provide the same set of features as ODT or DOCX, so I'm unsure how much the content gets butchered.

During my time refactoring, I found code that actually converted other formats used in scenes, HTML for example, to Markdown (I assume for simplicity). But in theory we could also allow formats other than Markdown to be used in the files. I have also read a bit about MultiMarkdown, which is actually what we use (or at least we use its design of providing metadata in the files alongside the actual text), and I think it would even allow including other formats such as HTML, LaTeX and others inside Markdown. There are also some older Python bindings here and maybe we could use them.