Poking around in the file extension data I've gathered from various sources, I noticed that the XHTML PRONOM records (fmt/103, fmt/102) do not list xhtml as a possible file extension. The IANA Media Type registration for XHTML says:
File extension(s) : "xhtml" and "xht" are sometimes used.
Amusingly, it also says:
Magic number(s) : No sequence of bytes can uniquely identify an XHTML
document. More information on detecting XHTML documents is available in
the MIME Sniffing specification.
While extension-based matching is less than ideal, perhaps it's still worth adding the above file extensions as fallbacks?
In case it helps, here's a comparison with other format info sources: *.xht, *.xhtml
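To make the fallback idea concrete, here is a minimal sketch of how an identification tool might consult extensions only when signature matching finds nothing. The function name `identify`, the `signature_puids` parameter and the two lookup tables are hypothetical and not taken from any existing tool; only the PUIDs and the two extensions come from the text above. Since an extension alone cannot tell the two records apart, the sketch returns both as candidates.

```python
from pathlib import PurePath

# Extensions quoted from the IANA registration; PUIDs from the PRONOM
# records mentioned in this post. Both tables are illustrative only.
XHTML_EXTENSIONS = {"xhtml", "xht"}
XHTML_PUIDS = ["fmt/102", "fmt/103"]


def identify(filename, signature_puids):
    """Return (candidate PUIDs, basis of the match).

    `signature_puids` stands in for whatever a byte-signature matcher
    returned; this helper is a hypothetical wrapper, not a real API.
    """
    # Prefer signature matches whenever there are any.
    if signature_puids:
        return list(signature_puids), "signature"

    # Fall back to the file extension, compared case-insensitively.
    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext in XHTML_EXTENSIONS:
        return list(XHTML_PUIDS), "extension"

    return [], "none"
```

Reporting the basis of the match alongside the candidates lets a caller treat extension-only hits as weaker evidence than signature hits.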