pofider / phantom-html-to-pdf

Highly scalable html to pdf conversion using phantom workers
MIT License
159 stars 33 forks source link

Race conditions when generating PDF files #76

Closed VinceG closed 7 years ago

VinceG commented 7 years ago

Hey,

We had some issues using this package while converting multiple PDFs at the same time, we experienced some race conditions due to the fact that the temporary file generated uses the uuid v1 which is based off a timestamp.

Note

Although we are able to move this file anywhere we want once it's generated this still remains an issue. since the way it's currently setup it overwrites the temporary file if they are both being run at the exact same time.

I've created a PR to address this by applying the following fixes:

  1. For generating the temporary PDF file i've changed the uuid v1 to uuid v4 by default. This allows for a more random hash vs just a plain hash of the timestamp.
  2. I've added an option to specify the name of the temporary generated file, for us it'll be helpful to prefix the temporary file with our internal IDs for better debugging purposes, it'll also allow us to make the filename more random to avoid any other race conditions (it's optional and will use the uuid by default)

Example of race conditions currently caused by using the uuid v1

.../T/cb344660-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb346d70-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb346d71-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb346d72-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb349480-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb349481-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb349482-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb349483-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb349484-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb349485-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb349486-999f-11e7-b7cd-4feaa764a9ee.pdf .../T/cb341f50-999f-11e7-b052-f567c6d77af1.pdf .../T/cb344660-999f-11e7-b052-f567c6d77af1.pdf .../T/cb344661-999f-11e7-b052-f567c6d77af1.pdf .../T/cb344662-999f-11e7-b052-f567c6d77af1.pdf .../T/cb344663-999f-11e7-b052-f567c6d77af1.pdf .../T/cb346d70-999f-11e7-b052-f567c6d77af1.pdf .../T/cb346d71-999f-11e7-b052-f567c6d77af1.pdf .../T/cb346d72-999f-11e7-b052-f567c6d77af1.pdf

bjrmatos commented 7 years ago

We had some issues using this package while converting multiple PDFs at the same time, we experienced some race conditions due to the fact that the temporary file generated uses the uuid v1 which is based off a timestamp.

interesting, uuid v1 should be completely safe when just using a single process, since it is based in the timestamp of the engine. does your application is running in multiple processes? are you using something like the nodejs cluster module or something like that? so far the only issue that comes to my mind about the usage of uuid v1 is when you have an app with more than 1 process which are trying to generate and write files to the same temp folder. if you are in that situation the problem can be easily solved by specifying a different temp folder for each process in the app, with that you won't have any collisions.

however i agree that probably is better to use uuid v4 (in fact i'm using v4 in another module), probably it will be a good idea to do a benchmark and see if there is an impact in performance when switching to uuid v4 and generating a lot of pdf files.

VinceG commented 7 years ago

@bjrmatos We do use a cluster, however, this happens also when simply running tasks in parallel. If you create a file that generates 100 PDFs and run those in parallel it'll have this race condition.

Specifying a temp folder per process is harder than just making sure the temporary file generated is unique by using a more unique approach for temporary file generation (or allowing us to specify it at the very least).

We switched to v4 internally and saw no difference in CPU usage and memory consumption, although we didn't monitor or benchmarked the process time it seemed to be processing the conversions at the same rate as v1.

pofider commented 7 years ago

Merged in #75

pofider commented 7 years ago

Released in 0.5.5