opencats / OpenCATS

Applicant Tracking System (maintained code base)
http://www.opencats.org

Conversion of RTF/DOCX/ODT is not proper for some files #150

Open skrchnavy opened 8 years ago

skrchnavy commented 8 years ago

This is follow up of PR #149 / bug #105.

The conversion implementation works OK when the file is simple and 'standard'. For some nonstandard cases it can cause strange behavior. These 3 file types are processed differently from DOC or PDF.

Example: when a PDF file is renamed to rdf (extension changed) and then attached to OpenCATS as rdf, it can cause OpenCATS to hang.

RussH commented 8 years ago

Why is a PDF renamed to rdf? Surely then it wouldn't open in a PDF viewer? I'd have thought that filtering wouldn't touch anything that isn't pdf / doc / docx / odf, etc.

skrchnavy commented 8 years ago

I accidentally wrote rdf; it should be rtf. (Or it could also be an issue with the rdf extension, I am not sure now.)

This is the case where a user wants to break functionality: (s)he can upload a document with the wrong extension, which can cause the described result.

RussH commented 8 years ago

ahh, so it's trying to use unRTF on a PDF and hangs (?)


skrchnavy commented 8 years ago

Yes, something like that. I don't remember now how I changed the suffix of the PDF file. We only fixed warnings in the mentioned PR #149 and noticed this issue while doing so.

mlespiau commented 8 years ago

This problem can be fixed by detecting the file type based on its magic numbers. PHP's Fileinfo functions use libmagic to do this.

This seems to be the most accurate way to detect file types (MIME types or extensions are not reliable).
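
To illustrate the idea, here is a minimal sketch of content-based detection, written in Python rather than OpenCATS's PHP and not taken from the OpenCATS code base. The helper names `sniff_type` and `safe_to_convert` are hypothetical, and the signature list is hand-picked for the formats mentioned in this thread; in PHP the same check would more likely go through `finfo_open(FILEINFO_MIME_TYPE)` and `finfo_file()`, which delegate to libmagic.

```python
import io
import zipfile


def sniff_type(data: bytes) -> str:
    """Guess the document type from leading bytes (hypothetical helper)."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"{\\rtf"):
        return "rtf"
    if data.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        # OLE2 compound file: legacy Word, but also other old Office formats.
        return "doc"
    if data.startswith(b"PK\x03\x04"):
        # DOCX and ODT are both ZIP containers, so look inside the archive.
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                names = set(zf.namelist())
                if "word/document.xml" in names:
                    return "docx"
                if "mimetype" in names:
                    mime = zf.read("mimetype").strip()
                    if mime == b"application/vnd.oasis.opendocument.text":
                        return "odt"
        except zipfile.BadZipFile:
            pass
    return "unknown"


def safe_to_convert(filename: str, data: bytes) -> bool:
    """Only hand the file to a converter if content and extension agree."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return sniff_type(data) == ext
```

With a check like this, a PDF uploaded as `resume.rtf` would be rejected before it reaches the RTF converter: `safe_to_convert("resume.rtf", b"%PDF-1.4\n")` returns `False`.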