sidewinderlabs / xlwrap

Fork of the excellent XLWrap on Sourceforge
http://xlwrap.sourceforge.net

Detect file type from MIME type rather than extension #4

Open johngriffin opened 13 years ago

johngriffin commented 13 years ago

When a file has no extension, xlwrap currently assumes it is CSV. See:

https://github.com/markbirbeck/xlwrap/commit/ef3b0096d89bcd3472d5a2b1b65aefae6c08d0c6

It would be better to use the MIME type to work out the file type.

antiguru commented 13 years ago

Something like http://tika.apache.org/, or is that overkill? I fully agree that some change is required.

johngriffin commented 13 years ago

Looks like it would do the trick, but it might be a bit heavy for just MIME type detection. A lo-fi alternative might be to use javax.activation.MimetypesFileTypeMap. It seems the tradeoff would be speed of detection vs. weight of the code dependency. We'd also probably have to manually list some of the MIME types we're detecting with javax.activation, but that's possible and not such a big deal since we only need to support CSV, Excel and OpenOffice.
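
To illustrate the "manually list the MIME types" idea, here is a rough sketch of a lookup from an already-detected MIME type string to one of the three formats. It is not xlwrap code and uses neither Tika nor javax.activation. The class name `MimeTypeFormats`, the `Format` enum and the `fromMimeType` method are all made up for this example, and including the .xlsx type under Excel is an assumption.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not part of xlwrap): maps a MIME type string to a spreadsheet format. */
final class MimeTypeFormats {

    /** The three format families mentioned in this thread. */
    enum Format { CSV, EXCEL, OPENOFFICE }

    // Manually listed MIME types. Treating .xlsx as EXCEL is an assumption of this sketch.
    private static final Map<String, Format> BY_MIME_TYPE = Map.of(
        "text/csv", Format.CSV,
        "application/vnd.ms-excel", Format.EXCEL,
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", Format.EXCEL,
        "application/vnd.oasis.opendocument.spreadsheet", Format.OPENOFFICE);

    private MimeTypeFormats() {}

    /**
     * Returns the format for a MIME type, ignoring case and any parameters
     * such as "; charset=utf-8". Empty if the type is null or not listed.
     */
    static Optional<Format> fromMimeType(String mimeType) {
        if (mimeType == null) {
            return Optional.empty();
        }
        int semicolon = mimeType.indexOf(';');
        String base = (semicolon >= 0 ? mimeType.substring(0, semicolon) : mimeType)
            .trim()
            .toLowerCase(Locale.ROOT);
        return Optional.ofNullable(BY_MIME_TYPE.get(base));
    }
}
```

A caller could fall back to the current CSV guess only when the result is empty, e.g. `MimeTypeFormats.fromMimeType(type).orElse(MimeTypeFormats.Format.CSV)`. How the MIME type string itself is obtained (Tika, javax.activation, an HTTP header) is left open here.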

johngriffin commented 13 years ago

We've committed an implementation of MIME detection, using Tika, to my fork of xlwrap. See the commit here:

https://github.com/johngriffin/xlwrap/commit/dfa2bc962aab35a741cb9b7648b2ef35f8fa1d8a