ONLYOFFICE / Docker-DocumentServer

ONLYOFFICE Document Server is an online office suite comprising viewers and editors for texts, spreadsheets and presentations, fully compatible with Office Open XML formats: .docx, .xlsx, .pptx and enabling collaborative editing in real time.
GNU Affero General Public License v3.0

[Feature Request]: Do some real PDF edition #718

Closed louiesun closed 5 months ago

louiesun commented 5 months ago

What is the current behavior? We can annotate PDFs now.

What is the expected behavior? To be able to edit the text in a PDF, and even add pictures or SVGs to it, as in Foxit Advanced PDF Editor or Acrobat.

Did this work in previous versions of DocumentServer? No; this would be a fairly new feature.

I am using Foxit PDF Editor (a well-known commercial PDF editor), and I found that it needs some time to save a change, and that the comments in the PDF disappeared after saving.

The original HelloWorld PDF file:


%PDF-1.4
1 0 obj
<< /Type /Catalog
/Outlines 2 0 R
/Pages 3 0 R
>>
endobj

2 0 obj
<< /Type /Outlines
/Count 0
>>
endobj

3 0 obj
<< /Type /Pages
/Kids [4 0 R]
/Count 1
>>
endobj

4 0 obj
<< /Type /Page
/Parent 3 0 R
/MediaBox [ 0 0 612 792]
/Contents 5 0 R
/Resources << /ProcSet 6 0 R
/Font << /F1 7 0 R>>
>>
>>
endobj

5 0 obj
<< /Length 73 >>
stream
BT
/F1 24 Tf
100 100 Td
(Hello World) Tj
ET
endstream
endobj

6 0 obj
[/PDF /Text]
endobj

7 0 obj
<< /Type /Font
/Subtype /Type1
/Name /F1
/BaseFont /Helvetica
/Encoding /MacRomanEncoding
>>
endobj

xref
0 8
0000000000 65535 f
0000000009 00000 n
0000000074 00000 n
0000000120 00000 n
0000000179 00000 n
0000000364 00000 n
0000000466 00000 n
0000000496 00000 n

trailer
<< /Size 7
/Root 1 0 R
>>

startxref
625
%%EOF


When I save it with `Foxit Advanced PDF editor`, it turns into the following (now containing some binary streams):
```text  
%PDF-1.4
%〕抛
1 0 obj
<</Pages 3 0 R /Type/Catalog>>
endobj
4 0 obj
<</Resources<</ProcSet 6 0 R /Font<</F1 7 0 R >>>>/MediaBox[ 0 0 612 792]/Contents 5 0 R /Parent 3 0 R /Type/Page>>
endobj
5 0 obj
<</Length 55/Filter/FlateDecode>>stream
x淪Pp
徨R }7C#厫4仔?孋R |
徳湝|咅?M厫,惃k q??
endstream
endobj
8 0 obj
<</Type /ObjStm /N 3/First 14/Length 129/Filter /FlateDecode>>stream
x?V0P0S0禤0W0盩氨??H-?HL掁?叛
&@%A
柄矽%
唙v漾.n??%盄
N壟﹏鵼%??e?櫳夲畒声)檡辁緣葾墆p佮窑?`[沥>X痏b.?4 :?C
endstream
endobj
9 0 obj
<</Type /XRef/W[1 4 2]/Index[0 10]/Size 10/Filter /FlateDecode/DecodeParms<</Columns 7/Predictor 12>>/Length 59/Root 1 0 R /ID[<C20C135AE07F0687CFC4B813E5B68C5B><C20C135AE07F0687CFC4B813E5B68C5B>]>>stream
x?时
€PB?暛隭Sk7?熸勪歯?讳?豵孆Glv莢ZV瘬痏鼈?C
endstream
endobj

startxref
555
%%EOF
```



So I suppose that even the most advanced PDF editors render the PDF file into some intermediate form such as `html`, `xml`, or `canvas`, and export that back into a PDF file when saving.
There is little reason for us to try to do better than that.
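
One visible sign of that rewrite is the `/Filter/FlateDecode` entry on the saved streams: FlateDecode is zlib/deflate compression, which is why the readable `BT ... ET` operators turn into unreadable bytes. The sketch below only illustrates that round trip using Python's standard `zlib` module. It compresses the content stream from the original file; it does not reproduce the exact bytes Foxit wrote.

```python
import zlib

# The uncompressed content stream from the original Hello World file.
content = b"BT /F1 24 Tf 100 100 Td (Hello World) Tj ET"

# /Filter /FlateDecode means the stream data is zlib-compressed.
compressed = zlib.compress(content)

# The first byte of a zlib stream with the usual 32 KiB window is 0x78,
# i.e. the ASCII letter "x". That matches the "x..." at the start of
# each stream in the saved file.
print(hex(compressed[0]))

# Decompressing restores the readable text operators.
restored = zlib.decompress(compressed)
print(restored.decode("ascii"))
```

So the saved file is harder to read in a text editor, but the page content itself can still be recovered by inflating each stream.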

We have `PDF.js` to render PDF pages onto a `canvas` element, and `Fabric.js` to create an editable canvas.

Does this mean we could get a PDF editor by translating the Canvas API calls that `PDF.js` makes into `Fabric.js` API calls, so that the rendered PDF page becomes editable?

> In addition, there are many useful image editors, and they are mature technology (`Fabric.js` is the foundation of some of them). Those editors are based on absolute positioning, and so is PDF.
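
To make the absolute-positioning point concrete, here is a minimal sketch that reads the content stream of the Hello World file above and extracts each piece of text with its font, size, and position. The names `TextRun`, `parse_text_runs`, and `to_canvas_y` are made up for this illustration. The parser handles only the `Tf`, `Td`, and `Tj` operators with simple literal strings, so it is nowhere near a full PDF content-stream parser.

```python
import re
from dataclasses import dataclass


@dataclass
class TextRun:
    """One positioned piece of text (hypothetical model for this sketch)."""
    font: str
    size: float
    x: float
    y: float
    text: str


def parse_text_runs(stream: str) -> list:
    """Extract text runs from a content stream using only Tf, Td and Tj."""
    # A token is either a literal string "(...)" or a run of non-space chars.
    tokens = re.findall(r"\((?:[^()\\]|\\.)*\)|[^\s()]+", stream)
    runs, operands = [], []
    font, size, x, y = "", 0.0, 0.0, 0.0
    for tok in tokens:
        if tok == "BT":            # begin text object: position resets
            x, y = 0.0, 0.0
        elif tok == "Tf":          # operands: /FontName size
            font, size = operands[0].lstrip("/"), float(operands[1])
        elif tok == "Td":          # operands: tx ty (offset from line start)
            x += float(operands[0])
            y += float(operands[1])
        elif tok == "Tj":          # operand: (string)
            runs.append(TextRun(font, size, x, y, operands[0][1:-1]))
        elif tok != "ET":
            operands.append(tok)   # not an operator: collect as operand
            continue
        operands.clear()
    return runs


def to_canvas_y(pdf_y: float, page_height: float) -> float:
    """PDF's origin is bottom-left; an HTML canvas's origin is top-left."""
    return page_height - pdf_y


runs = parse_text_runs("BT /F1 24 Tf 100 100 Td (Hello World) Tj ET")
print(runs)
print(to_canvas_y(runs[0].y, 792))
```

Each run already carries everything an absolutely positioned canvas object needs. The main conversion is flipping the y axis against the page height from `/MediaBox`; note that the PDF position refers to the text baseline.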

We may still need to deal with bookmarks and similar document-level features, though.

In brief, can we make a simple PDF editor, something that nobody seems to have released as open source so far, especially on the web? The proposed solution:
1. Render the PDF pages onto `Fabric.js` canvases (one canvas per page) with `PDF.js`.
2. Let the user edit them (this may require writing more GUI).
3. Export the `Fabric.js` canvases to PDF pages and handle the bookmarks, comments, and metadata.
4. Save the file.
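
As a rough illustration of steps 3 and 4, the sketch below writes a list of edited text objects back out as a single-page PDF shaped like the Hello World file above. It is only a sketch under several assumptions. The function name `build_pdf` and the `(font_size, x, y, text)` tuple format are invented here. It supports text in one built-in font only, and it ignores images, bookmarks, comments, and metadata entirely.

```python
def build_pdf(runs, width=612, height=792) -> bytes:
    """Serialize (font_size, x, y, text) tuples into a one-page PDF.

    Hypothetical helper: text only, Helvetica only, Latin-1 characters only.
    """
    ops = []
    for font_size, x, y, text in runs:
        # Backslashes and parentheses must be escaped inside a PDF string.
        escaped = (text.replace("\\", "\\\\")
                       .replace("(", "\\(")
                       .replace(")", "\\)"))
        ops.append(f"BT /F1 {font_size} Tf {x} {y} Td ({escaped}) Tj ET")
    content = "\n".join(ops).encode("latin-1")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {width} {height}] "
         f"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>"
         ).encode("ascii"),
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))          # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Classic cross-reference table: every entry is exactly 20 bytes.
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


# Example: the original page after the user changed the text.
pdf_bytes = build_pdf([(24, 100, 100, "Hello (edited) World")])
```

The resulting bytes can be written to disk with `open("out.pdf", "wb")`. A real exporter would also have to carry over everything the canvas does not represent, which is exactly the bookmarks, comments, and metadata mentioned in step 3.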