mbeijen / File-MimeInfo

Perl module for determining file types using the freedesktop.org shared mime-info database
https://metacpan.org/module/File::MimeInfo

Not identifying file of the type text/html if file does not contain the "<html>" tag. #19

Closed (jg0000 closed this issue 7 years ago)

jg0000 commented 8 years ago

lsb_release -a
LSB Version: core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch:core-4.1-amd64:core-4.1-noarch:security-4.0-amd64:security-4.0-noarch:security-4.1-amd64:security-4.1-noarch
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty

aptitude show libfile-mimeinfo-perl
Package: libfile-mimeinfo-perl
State: installed
Automatically installed: no
Version: 0.22-1
Priority: optional
Section: perl
Maintainer: Ubuntu Developers ubuntu-devel-discuss@lists.ubuntu.com
Architecture: all
Uncompressed Size: 142 k
Depends: perl, libfile-basedir-perl, libfile-desktopentry-perl, shared-mime-info
Description: Perl module to determine file types
File::MimeInfo can be used to determine the mime type of a file. It tries to implement the freedesktop specification for a shared MIME database.

This package also contains two related utilities:

If an HTML file does not contain the <html> tag, /usr/bin/mimetype identifies it as text/plain. /usr/bin/file seems to be more intelligent, or at least more tolerant, in this regard.

mbeijen commented 8 years ago

Can you please provide an example of a case that mimetype does NOT handle, but /usr/bin/file does handle correctly?

jg0000 commented 8 years ago

Hi Michiel

Please see attached test file.

My tests:

/usr/bin/mimetype /tmp/mimetestfile

/tmp/mimetestfile: text/plain

file /tmp/mimetestfile

/tmp/mimetestfile: HTML document, ISO-8859 text, with very long lines

Regards,

Jie

mbeijen commented 7 years ago

Should a file be identified as HTML when it is named something rather than something.html and merely contains some tags that look like HTML, but not the required <html> tag? I'm not sure I'd consider that correct, and it might lead to other sorts of problems. File::MimeInfo does not need to be bug-compatible with /usr/bin/file. Closing.
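
The trade-off discussed in this thread can be illustrated with a minimal Python sketch. It is not the detection logic of File::MimeInfo, of the shared mime-info database, or of /usr/bin/file. The two patterns and the function names `sniff_strict` and `sniff_tolerant` are hypothetical, chosen only to show why a more tolerant rule accepts HTML fragments but can also misclassify plain text.

```python
import re

# Hypothetical strict rule: only content containing an <html> tag counts as HTML.
STRICT_HTML = re.compile(rb"<html[\s>]", re.IGNORECASE)

# Hypothetical tolerant rule: any of a few common HTML tags is enough.
TOLERANT_HTML = re.compile(
    rb"<(?:html|head|body|title|p|div|a|table)[\s>]", re.IGNORECASE
)


def sniff_strict(data: bytes) -> str:
    """Return text/html only when an <html> tag is present."""
    return "text/html" if STRICT_HTML.search(data) else "text/plain"


def sniff_tolerant(data: bytes) -> str:
    """Return text/html when any tag from the tolerant list is present."""
    return "text/html" if TOLERANT_HTML.search(data) else "text/plain"


# An HTML fragment without the <html> tag, similar in spirit to the reported case.
fragment = b"<p>Hello, <a href='https://example.org'>world</a></p>"

# Plain prose that merely mentions a tag.
prose = b"In Markdown you can still write <p> by hand if you need to."

print(sniff_strict(fragment))    # strict rule on the fragment
print(sniff_tolerant(fragment))  # tolerant rule on the fragment
print(sniff_tolerant(prose))     # tolerant rule on plain prose
```

Under these assumed rules, the strict check reports the fragment as text/plain, the tolerant check reports it as text/html, and the tolerant check also reports the plain prose as text/html. That last result is the kind of false positive the closing comment is concerned about.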