Closed by ekremucar 9 years ago
Same issue for me. Is there a way to make it work anyway?
Currently, the following matchers already exist with a higher precedence than the sgml matcher:
<match>
<mimetype>text/html</mimetype>
<extension>html</extension>
<description>HTML document text</description>
<test offset="0" type="string" comparator="=">&lt;!DOCTYPE HTML</test>
</match>
<match>
<mimetype>text/html</mimetype>
<extension>html</extension>
<description>HTML document text</description>
<test offset="0" type="string" comparator="=">&lt;!doctype html</test>
</match>
Does your document have something other than exactly the following at position 0 in the file? Note that the default matchers are exact matches and don't ignore whitespace, etc.
<!DOCTYPE HTML or <!doctype html
I found why the detection is not working: the match is case-sensitive, and my file starts with: <!DOCTYPE html
There is no entry for this case. There is also no entry for the case <!doctype HTML.
Is there a way to indicate that the match is not case-sensitive (maybe a comparator other than =)? If that doesn't exist, it could be a good feature to add.
For the moment, I added two entries in my custom magic.xml file (but I also have to copy the dtd...).
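Such entries could look like the following sketch, modeled on the built-in string matchers quoted above (the description text is simply reused from them, and the `<` is written as `&lt;` so the fragment stays well-formed XML):

```xml
<!-- Sketch only: covers the two mixed-case spellings that have no built-in entry -->
<match>
    <mimetype>text/html</mimetype>
    <extension>html</extension>
    <description>HTML document text</description>
    <test offset="0" type="string" comparator="=">&lt;!DOCTYPE html</test>
</match>
<match>
    <mimetype>text/html</mimetype>
    <extension>html</extension>
    <description>HTML document text</description>
    <test offset="0" type="string" comparator="=">&lt;!doctype HTML</test>
</match>
```

This still misses other spellings such as `<!Doctype Html`, since each string matcher compares one exact spelling.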
You can, just not with the string matcher; you'll need to use the regex matcher type. See magic.xml for a couple of examples. Sorry for the bad paste above. The existing matchers for this are actually regex matchers already; they just aren't using the /i flag.
<match>
<mimetype>text/html</mimetype>
<extension>html</extension>
<description>HTML Document</description>
<test offset="0" type="regex" comparator="=">/^\s*&lt;!DOCTYPE HTML PUBLIC/</test>
</match>
<match>
<mimetype>text/html</mimetype>
<extension>html</extension>
<description>HTML Document</description>
<test offset="0" type="regex" comparator="=">/^\s*&lt;html>/</test>
</match>
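A case-insensitive variant might look like the sketch below. It assumes the regex matcher honors a Perl-style trailing `i` flag, as the `/i` remark above suggests, and that the entry sits ahead of the sgml matcher in a custom magic.xml; neither has been verified here. The pattern itself is only an illustration.

```xml
<!-- Sketch only: assumes the regex test type accepts the /i (case-insensitive) flag -->
<match>
    <mimetype>text/html</mimetype>
    <extension>html</extension>
    <description>HTML Document</description>
    <!-- Matches <!DOCTYPE html, <!doctype HTML, <!DOCTYPE HTML PUBLIC ..., with optional leading whitespace -->
    <test offset="0" type="regex" comparator="=">/^\s*&lt;!DOCTYPE\s+html/i</test>
</match>
```

A single entry like this would replace the separate per-spelling string matchers, and the leading `\s*` also tolerates whitespace before the doctype, which the exact string matchers do not.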
I tried to detect the MIME type of an HTML file, but it was detected as SGML. Both start with 'doctype', but the HTML file continues with 'html'. Maybe the matchers need to be ordered?