Stirling-Tools / Stirling-PDF

#1 Locally hosted web application that allows you to perform various operations on PDF files
https://stirlingpdf.com
MIT License

Uploading Multiple Files as ArrayBuffer #1027

Closed lawfulsoftware closed 1 month ago

lawfulsoftware commented 6 months ago

This is a feature request.

I am unable to upload multiple files using n8n. Because of the way it handles binary data, I cannot find a way to send multiple files as a binary array (e.g., using the merge-pdfs endpoint). This could also be an issue in a serverless environment.

Is it possible to modify or enhance the methods for passing files to Stirling-PDF?

Possible Strategies

URL Array

The user can send an array of URLs so Stirling-PDF can download the PDFs that need to be processed.

This approach allows for quicker transmission of the API call. If the URL is well-formed, Stirling-PDF can accept the payload. It avoids sending a single large payload (e.g., 75 MB) that can fail during transmission.

Stirling-PDF can download the files in parallel, whereas a single binary array is sent as one sequential upload. Stirling-PDF can also retry any failed download without losing the files it has already downloaded.
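To make the URL Array idea concrete, here is a minimal sketch of parallel downloads with per-file retries. It is not Stirling-PDF code: downloadAll, downloadWithRetry, fetchFn and maxAttempts are hypothetical names, and it assumes a fetch-compatible function (built into Node 18+) is available or injected.

```javascript
// Hypothetical sketch: these helpers do not exist in Stirling-PDF.

// Download one URL, retrying only this file if an attempt fails.
async function downloadWithRetry(url, fetchFn, maxAttempts) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const res = await fetchFn(url);
      if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
      return Buffer.from(await res.arrayBuffer());
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}

// Validate every URL up front, then fetch all of them concurrently.
async function downloadAll(urls, { fetchFn = globalThis.fetch, maxAttempts = 3 } = {}) {
  for (const u of urls) new URL(u); // throws TypeError on a malformed URL
  return Promise.all(urls.map((u) => downloadWithRetry(u, fetchFn, maxAttempts)));
}
```

Because each URL has its own retry loop, a flaky download is repeated without re-fetching the files that already succeeded. The result array keeps the order of the input URLs.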

Task Approach

An API endpoint can allow the user to declare a taskId (e.g., a uuid) and associate uploaded files with that taskId. The user can then reference the taskId in calls to the various API endpoints and the endpoint will process all files associated with that taskId.

Optionally, the user can declare the number of files that will be sent for this taskId such that Stirling-PDF can return an error if it has not yet received the required number of files along with an array of the filenames that it has already received.
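The Task Approach could be sketched as an in-memory registry like the one below. TaskStore, declare, addFile and resolve are hypothetical names for illustration only; a real implementation would also need persistence and cleanup.

```javascript
// Hypothetical sketch: TaskStore is illustrative, not part of Stirling-PDF.
class TaskStore {
  constructor() {
    // taskId -> { expectedCount, files: Map of fileName -> file data }
    this.tasks = new Map();
  }

  // Declare a task, optionally stating how many files will follow.
  declare(taskId, expectedCount = null) {
    if (this.tasks.has(taskId)) throw new Error(`taskId already declared: ${taskId}`);
    this.tasks.set(taskId, { expectedCount, files: new Map() });
  }

  // Associate one uploaded file with a previously declared task.
  addFile(taskId, fileName, data) {
    const task = this.tasks.get(taskId);
    if (!task) throw new Error(`unknown taskId: ${taskId}`);
    task.files.set(fileName, data);
  }

  // Called by a processing endpoint: return the files, or an error plus
  // the filenames received so far if the declared count is not yet reached.
  resolve(taskId) {
    const task = this.tasks.get(taskId);
    if (!task) return { ok: false, error: 'unknown taskId', received: [] };
    const received = [...task.files.keys()];
    if (task.expectedCount !== null && received.length < task.expectedCount) {
      const error = `expected ${task.expectedCount} files, received ${received.length}`;
      return { ok: false, error, received };
    }
    return { ok: true, files: [...task.files.values()] };
  }
}
```

With this shape, a client such as n8n could upload one file per request and only call the processing endpoint once resolve reports that all files have arrived.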

ZIP Approach

Receive a ZIP file containing all of the PDFs that need to be processed.

Ingest Endpoint

Rather than each endpoint having to handle receiving data, a single API endpoint could ingest data using one or more of the above approaches. This would consolidate the process, which may simplify the application logic and enhance maintainability. Optionally, the user could set a TTL to override any system defaults.

Files ingested by this single endpoint could be used by multiple API endpoints without Stirling-PDF having to receive the same data for each API call. To achieve this, the user could reference the taskId when calling the Pipeline API.
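A rough sketch of how ingested files might be kept for reuse with a per-task TTL follows. IngestStore, ingest, get, defaultTtlMs and the injectable now clock are all hypothetical names, and the one-hour default is an arbitrary assumption.

```javascript
// Hypothetical sketch: IngestStore is illustrative, not part of Stirling-PDF.
class IngestStore {
  constructor({ defaultTtlMs = 60 * 60 * 1000, now = () => Date.now() } = {}) {
    this.defaultTtlMs = defaultTtlMs; // system default, assumed to be one hour
    this.now = now;                   // injectable clock, useful for testing
    this.entries = new Map();         // taskId -> { files, expiresAt }
  }

  // Store files once; the caller may override the default TTL.
  ingest(taskId, files, ttlMs = this.defaultTtlMs) {
    this.entries.set(taskId, { files, expiresAt: this.now() + ttlMs });
  }

  // Any endpoint can fetch the same files by taskId until the TTL elapses.
  get(taskId) {
    const entry = this.entries.get(taskId);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(taskId); // expired: drop the data
      return null;
    }
    return entry.files;
  }
}
```

Several API calls can then read the same taskId without the data being uploaded again, and expired entries are dropped on the next lookup.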

n8n References:

n8n Data Structure, n8n Binary Data

user90210 commented 6 months ago

I'm able to send multiple files to Stirling-PDF using the Code node, but I'm unable to convert the response into a PDF.

lawfulsoftware commented 6 months ago

@user90210 Would you mind sharing the relevant code from your Code node?

As for your screenshot, the input is already binary so JSON to Binary wouldn't work.

user90210 commented 5 months ago

@lawfulsoftware

Environment variables:

NODE_FUNCTION_ALLOW_EXTERNAL=form-data,axios

Code:

{
  "meta": {
    "templateCredsSetupCompleted": true,
    "instanceId": "b8a6c5f125472722e6d0028d32185678e848973559dc7ce97c463718be4c99c7"
  },
  "nodes": [
    {
      "parameters": {
        "url": "https://wonderfulengineering.com/wp-content/uploads/2014/10/image-wallpaper-15.jpg",
        "options": { "response": { "response": { "responseFormat": "file" } } }
      },
      "id": "4784947c-07ee-427a-8b91-9b3db24d582d",
      "name": "HTTP Request4",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.1,
      "position": [ -160, 1440 ]
    },
    {
      "parameters": {
        "jsCode": "let FormData = require('form-data');\nlet formData = new FormData();\nlet binaryDataBufferItems = [];\n\nformData.append('fitOption','fitDocumentToImage');\nformData.append('colorType','color');\nformData.append('autoRotate','true');\n\nfor (let i = 0; i < items.length; i++) {\n\n formData.append('fileInput', await this.helpers.getBinaryDataBuffer(i, 'data'), 'image.jpg');\n}\n\n\nconst request_options = {\n url: 'http://10.10.200.46:8081/api/v1/convert/img/pdf',\n headers: {\n 'accept': '*/*',\n // 'Content-Type': 'multipart/form-data',\n // ...formData.getHeaders()// 'Content-Type': 'multipart/form-data',\n },\n method: 'POST',\n body : formData,\n // returnFullResponse: true,\n // encoding: 'blob'\n}\nconsole.log(request_options);\n\n\nconst response = await this.helpers.httpRequest(request_options);\nconsole.log(response);\n\nreturn {response}; \n\n//return {json: (response)}; \n \n\n"
      },
      "id": "e6ca4bce-beba-4084-95bc-43deaa4e4dd4",
      "name": "Stirling PDF image to PDF",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [ 200, 1300 ]
    },
    {
      "parameters": {
        "url": "https://wonderfulengineering.com/wp-content/uploads/2014/10/image-wallpaper-15.jpg",
        "options": { "response": { "response": { "responseFormat": "file" } } }
      },
      "id": "129c0e4f-3f16-40f8-bbc7-e83b223812e9",
      "name": "HTTP Request6",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.1,
      "position": [ -160, 1220 ]
    },
    {
      "parameters": {},
      "id": "6be7403e-1e2b-4335-96c6-b36cfb53e07a",
      "name": "Merge1",
      "type": "n8n-nodes-base.merge",
      "typeVersion": 2.1,
      "position": [ 20, 1300 ]
    }
  ],
  "connections": {
    "HTTP Request4": { "main": [ [ { "node": "Merge1", "type": "main", "index": 1 } ] ] },
    "HTTP Request6": { "main": [ [ { "node": "Merge1", "type": "main", "index": 0 } ] ] },
    "Merge1": { "main": [ [ { "node": "Stirling PDF image to PDF", "type": "main", "index": 0 } ] ] }
  },
  "pinData": {}
}

(You can paste this text directly into an n8n workflow.) I did a lot of trial and error. Maybe you can make sense of it.

user90210 commented 5 months ago

@lawfulsoftware

I found the issue with my code.

I just had to add "encoding: 'arraybuffer'" to the request options. After that I convert the response body to base64, and voilà... (let me know if you have suggestions or improvements)

const FormData = require('form-data');
const formData = new FormData();

formData.append('fitOption', 'fitDocumentToImage');
formData.append('colorType', 'color');
formData.append('autoRotate', 'true');

for (let i = 0; i < items.length; i++) {
  formData.append('fileInput', await this.helpers.getBinaryDataBuffer(i, 'data'), 'image.jpg');
}

const request_options = {
  url: 'http://192.168.144.68:8081/api/v1/convert/img/pdf',
  headers: {
    'accept': '*/*',
    'Content-Type': 'multipart/form-data',
  },
  method: 'POST',
  body: formData,
  returnFullResponse: true,
  encoding: 'arraybuffer',
};
console.log(request_options);

const response = await this.helpers.httpRequest(request_options);
//console.log(response);

return [{
  json: response.headers,
  binary: {
    data: {
      mimeType: response.headers['content-type'],
      fileExtension: 'pdf',
      fileName: response.headers['content-disposition'].split("\"")[1],
      data: response.body.toString('base64'),
    },
  },
}];
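One possible improvement: the return statement above throws if the content-disposition header is missing or has no quoted filename. The sketch below moves that step into a hypothetical helper, toBinaryItem, with fallbacks. The default names output.pdf and application/pdf are assumptions, and only the plain filename= form of the header is handled.

```javascript
// Hypothetical helper: builds the n8n item from a full HTTP response
// ({ headers, body }) without assuming every header is present.
function toBinaryItem(response, fallbackName = 'output.pdf') {
  const headers = response.headers || {};
  const disposition = headers['content-disposition'] || '';
  // Matches filename="x.pdf" or filename=x.pdf; the filename*= form is not handled.
  const match = /filename="?([^";]+)"?/i.exec(disposition);
  const fileName = match ? match[1] : fallbackName;

  return [{
    json: headers,
    binary: {
      data: {
        mimeType: headers['content-type'] || 'application/pdf',
        fileExtension: 'pdf',
        fileName,
        // Buffer.from accepts both a Buffer and an ArrayBuffer body.
        data: Buffer.from(response.body).toString('base64'),
      },
    },
  }];
}
```

In the Code node, the final statement would then become return toBinaryItem(response);.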

Frooodle commented 1 month ago

Is this still an issue? Closing the ticket for now.