nextcloud / text

📑 Collaborative document editing using Markdown
GNU Affero General Public License v3.0

Ability to export to: Plain text, PDF, ODT, HTML, DOCX, … #102

Open jancborchardt opened 5 years ago

jancborchardt commented 5 years ago

Use-case: You write a collaborative document like the Nextcloud polishing issue. Then when you want to copy it into a Nextcloud issue, you notice that the Markdown characters like bullets, heading #, will all be stripped.

Since copying will copy the rich text, we can’t do "plain text copying" there.

We could have an "Export" entry in the toolbar where you can export it as "Plain text" or in the future even as HTML, or PDF if needed.

rkaraba commented 5 years ago

Export to PDF would be great. Sometimes people need to print their notes.

Jeewes commented 3 years ago

I guess "PDF export" could be achieved with just some thoughtful CSS targeting print media. Then the user could just "print to PDF".

Currently that's not a feasible option, since the document isn't well optimized for print media and printing seems to cut off content that doesn't fit on the first printed page.
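
A minimal sketch of what such a print stylesheet might contain, written as a small TypeScript helper that only builds the CSS text. The selectors passed in are hypothetical and not taken from the Text app. The assumption is that the cut-off after the first page comes from height or overflow limits on the editor containers, which the rules lift for print while hiding the editor chrome.

```typescript
// Hypothetical options: the real Text app markup and class names may differ.
interface PrintCssOptions {
  contentSelector: string; // element that holds the rendered document
  hiddenSelectors: string[]; // toolbars, sidebars etc. that should not be printed
}

// Builds the text of a print-only stylesheet. Nothing is injected into the page here.
function buildPrintCss(options: PrintCssOptions): string {
  const rules: string[] = [];
  // Lift viewport-bound limits so the content can flow onto further pages.
  rules.push("html, body { height: auto !important; overflow: visible !important; }");
  rules.push(
    `${options.contentSelector} { height: auto !important; max-height: none !important; overflow: visible !important; }`
  );
  // Hide editor chrome on paper, if any selectors were given.
  if (options.hiddenSelectors.length > 0) {
    rules.push(`${options.hiddenSelectors.join(", ")} { display: none !important; }`);
  }
  // Keep headings with the following text and avoid splitting blocks across pages.
  rules.push("h1, h2, h3 { break-after: avoid; }");
  rules.push("pre, table, img { break-inside: avoid; }");
  return `@media print {\n${rules.map((rule) => `  ${rule}`).join("\n")}\n}`;
}
```

The resulting string could be placed in a `<style>` element or shipped as a separate stylesheet. Whether these rules are enough depends on the actual layout of the editor.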

jjdsec commented 3 years ago

Hi, just wanted to add that I encountered this issue today when attempting to print my Markdown document. I could not find a working md-to-pdf or print-from-md feature within Nextcloud, and I was not able to get the PDF flow working either. Just wanted to thank you in advance for making this possible and keeping the beautiful CSS of Nextcloud while printing. Thanks!

mmuman commented 2 years ago

Another use case I have: the tool I wrote to help record my radio show currently uses the HTML export from Etherpad. It works fine, except that I later have to redo all the formatting in WordPress when publishing the transcript along with the podcast, because the export is not clean enough. Since we already have an NC instance installed where I already back up the Etherpad md+html exports, I'd like to rationalize the tools and move to NC for editing the Markdown directly as well. Seems I'll have to live with one more year of Etherpad… Maybe it could rather be done with a generic export_as app?

susnux commented 2 years ago

Not sure if it fits your use case @mmuman, but there is an "Import Markdown" plugin for WordPress. Using it, you can import your Markdown directly (or even use the public share link of the file to "embed" it).

mmuman commented 2 years ago

It would fix half of the problem, I guess I should add a markdown import to RadioTimer…

Andreas-Kainz commented 2 years ago

As Nextcloud Office (COOL) supports PDF export and style fine-tuning for print, I would suggest sending the .MD file to the office suite, where you can prepare the file for print. The user gets the best results and we can use the core stuff from Nextcloud.

The main technical part would be the selection of a template into which the .MD file gets imported.

I don't know if LibreOffice can import .MD files, but it could be a useful addition for LibO.

I can help and have a look at how good the .MD import is in LibO.

mmuman commented 2 years ago

At least for my HTML export needs, it must be directly available as a URL, so without going through a GUI for templating.

Yiannis128 commented 1 year ago

So this is more important now as there might be embedded images in the MD that need to be printed as well. So there should be an export button in the toolbar that allows you to print.

Alternatively, the easier and simpler solution is to get the browser print dialog to be functional (as documents above 1-page look absolutely broken).

Uatschitchun commented 1 year ago

@Yiannis128

> Alternatively, the easier and simpler solution is to get the browser print dialog to be functional (as documents above 1-page look absolutely broken).

So this seems to be a regression. I commented on the original issue where it was closed: https://github.com/nextcloud/text/issues/112#issuecomment-1630356989

jancborchardt commented 1 year ago

> when you want to copy it into a Nextcloud issue, you notice that the Markdown characters like bullets, heading #, will all be stripped.

@juliushaertl is it possible to prevent this and make sure that copying actually copies things, at least useful (and non-weird-looking) things like bullets (in form of "- ")?

juliusknorr commented 1 year ago

> is it possible to prevent this and make sure that copying actually copies things, at least useful (and non-weird-looking) things like bullets (in form of "- ")?

You can already copy the plain markdown using Ctrl+Shift+C since #3941

mmuman commented 1 year ago

I had another look and found some simple enough JS code to convert markdown from html that would fit my case, and the copy-paste from Text to the WordPress editor is much cleaner than that of Etherpad… The only problem now is the CORS policy preventing me from just downloading the file from the code. Guess I should open a ticket for that; there is no reason it should apply to download links.

Still would be nice to get proper HTML export someday.

juliusknorr commented 9 months ago

@blizzz Will unassign you from that one now as we haven't scheduled it. Just in case you have any notes worth sharing from your previous investigations, maybe you can dump them here for later reference.

blizzz commented 9 months ago

That was the last state of my research notes (folded, because longcat)

# Text export

## Goals

1. Markdown (rendered) to PDF
2. Pluggable – apps may register converters
3. Extensible – other formats can be converted from or to

### Characteristics

1. Flexible – no hard requirements on the implementation of the converter tool
2. via OCS-API
3. via Capabilities
   - converters
   - types
4. Registration per RegistrationContext
   - Perhaps split into Renderer and Converter
5. sync (StreamResponse) and async (Push via Notification/files_notify/RPC, Pull via temporary URL)

### Thoughts

- OOTB experience
  - native PHP PDF renderer shipped by default (with Text?) as fallback – with own parsing of Text and generation of the PHP document
  - native PHP PDF renderer via TipTap's `$editor.getHTML()` – bigger choice of libs. But, would only work in browser by providing the document content to convert. Not favoring this at all.
- Consider splitting into Renderer and Converter
  - rendering might be necessary to prepare a document (fetch previews etc.)
  - rendering most likely needs to be a step of a series. Perhaps apps need to register their mime types. Markdown flavors may play a role.
- Overall conversion steps (preliminary):
  1. get "raw" document
  2. fetch previews
  3. fetch other NC specific things (here be dragons)
  4. prepare temporary final document for conversion
  5. run converter
- Flow:
  - client requests conversion via OCS API, in async or sync fashion
  - In sync fashion, connection stays/is kept open until conversion is completed, resulting in the final download link. Optional, nice to have, but not the recommended or desired option.
  - In async fashion, connection returns a process identifier. Push will notify clients (does that work with web ui?), also a status endpoint will provide information, at least "in progress", "download url", "expired/not found"
  - RPC/Webhooks can be considered, too
- ⚠️ Keep UX with big files in mind! Avoid timeouts, inform about progress, handle error cases.
- 💡 Export whole Collective to a single PDF
  - actually already exists as feature via print preview
  - print to PDF could be an alternative via browser 🤔

### Development Phases

1. Framework
   - Registration
   - Capabilities
   - Events?
   - Documentation
2. Converter Reference Implementation
   - OOTB experience.
3. Client usage
   - Expose in Web UI

### Questions

- Order of multiple exporters for a requested type?
  - Hardcoded weight? Optional override for clients? Check with direct editing.
    - Direct editing takes the first best one
- Conversion options might need to be passed on. What's the best way to provide and transport them?
- How to render content that is not just a preview image? Even enriched images. How big is the issue?
  - In the document, only URLs are being stored, without more meta data
  - UX: In Text I do not see an option to add a speaking text to the link (even normally inserted)…
    - …but it works when it comes from proper markdown
  - Smart Pickers seem to have ReferenceProviders and references can (potentially) be matched against them. A Reference object is returned (or null). It contains meta data for non-vue rendering as well as an object dedicated to Vue.
  - Smart Pickers do only render standalone (including item of a list)! A picker embedded in text is not rendered in the web ui.
  - => Options
    - Leave links as they are
    - Turn link into markdown image (if present), use title as alt text if present. Maybe even have description (if present) following as another paragraph? Style… italic? quote?
    - Turn whole document into HTML and format using all possible fields
- Callouts are custom?! (Neither Okular nor Ghostwriter render them at least)
- Tables apparently not standard
- Tasks apparently not standard
- Where will converted files be stored? And how long will they live?
  - Ask design team, cf. OnlyOffice
  - keep in folder, but if write support (share)
- Loading images with relative paths, how does it work?
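
One possible answer to the ordering question, as a minimal sketch only: every registered converter carries a weight, the lowest weight wins, and a client may name a preferred converter that is used if it supports the requested conversion. The `ConverterInfo` shape, the converter ids and the "lower wins" convention are assumptions made for illustration, not an existing Nextcloud API.

```typescript
// Hypothetical registry entry; field names are illustrative only.
interface ConverterInfo {
  id: string;
  from: string; // source MIME type
  to: string; // target MIME type
  weight: number; // assumed convention: lower weight is preferred
}

// Picks a converter for the requested conversion.
// A client override wins if that converter supports the conversion;
// otherwise the candidate with the lowest weight is used.
function pickConverter(
  converters: ConverterInfo[],
  from: string,
  to: string,
  preferredId?: string
): ConverterInfo | undefined {
  // filter() returns a new array, so sorting does not touch the registry itself.
  // Array.prototype.sort is stable in current engines, so ties keep registration order.
  const candidates = converters
    .filter((c) => c.from === from && c.to === to)
    .sort((a, b) => a.weight - b.weight);
  if (preferredId !== undefined) {
    for (const candidate of candidates) {
      if (candidate.id === preferredId) {
        return candidate;
      }
    }
  }
  return candidates.length > 0 ? candidates[0] : undefined;
}
```

With equal weights this degrades to "first registered wins", which is close to the direct editing behaviour mentioned above.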
#### Side questions

- Which related components do we actually use on Text frontend side?
  - TipTap does not actually support markdown? How do we deal with it?
- How does an office document enriched by smart picker stuff look outside of Nextcloud, in LibreOffice?
  - Only links are inserted
  - Even without a speaking Link name 😞
  - Also profile picker does not show Avatar
  - No previews, no pictures

### Leads

- ueberdosis/pandoc
  - PHP wrapper for pandoc
  - requires pandoc locally
  - used by pandoc Nextcloud app
- Setasign/FPDF
  - native PHP PDF generator library
  - development process looks a bit shady
  - around a lot, recommended most often
- Tecnickcom/tcpdf
  - native PHP PDF generator library
  - new version in rewrite… this code is… aged
  - often recommended though
- gotenberg_php
  - wrapper for Gotenberg service
  - Gotenberg bundles Chromium and LibreOffice in a docker to offer document conversions via API
  - requires an external service. For docker, cannot be deployed with the app.
- ~~md2pdf~~
  - WIP
- ~~mdpf~~
  - converts HTML to pdf
  - could be an option when rendering from `$editor.toHtml` export…
    - …which cannot really be used on backend side. Big limitation.
- pandoc / pandoc-bin
  - …itself
  - project offers static binaries for 64bit, both AMD/Intel and ARM, but not 32 bit (how big of an issue is that?)
- pandoc app
  - currently not in our scope
- [use FFI to convert via a non-PHP library](https://ryangjchandler.co.uk/posts/blazingly-fast-markdown-parsing-in-php-using-ffi-and-rust)
  - could be a replacement for pandoc-bin and be shipped easily
- y.js
  - could that be useful here, or totally out of place?
- wasm-pandoc
  - seems unofficial without debug steps
  - frontend-only, too
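
Coming back to the async flow under Thoughts, the client side of the status endpoint could be sketched as follows. Only the decision step is shown: given one status answer, what should a polling client do next? The state names, the `ConversionStatus` shape and the back-off values are assumptions for illustration; they merely mirror the three answers listed in the notes ("in progress", "download url", "expired/not found").

```typescript
// Hypothetical response shape of the status endpoint.
type ConversionStatus =
  | { state: "in-progress" }
  | { state: "done"; downloadUrl: string }
  | { state: "expired" }; // also covers "not found"

type ClientAction =
  | { kind: "wait"; delayMs: number }
  | { kind: "download"; url: string }
  | { kind: "fail"; reason: string };

// Decides what a polling client does after its n-th status request (attempt starts at 1).
function nextAction(status: ConversionStatus, attempt: number, maxAttempts: number): ClientAction {
  if (status.state === "done") {
    return { kind: "download", url: status.downloadUrl };
  }
  if (status.state === "expired") {
    return { kind: "fail", reason: "conversion expired or not found" };
  }
  // Still in progress: give up at some point instead of polling forever.
  if (attempt >= maxAttempts) {
    return { kind: "fail", reason: "gave up waiting for conversion" };
  }
  // Exponential back-off (1 s, 2 s, 4 s, ...) capped at 30 s, to go easy on big files.
  const delayMs = Math.min(1000 * Math.pow(2, attempt - 1), 30000);
  return { kind: "wait", delayMs };
}
```

A real client would call this in a loop around the status request and stop on `download` or `fail`. Push notifications, where available, would make most of the waiting unnecessary.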