neilharvey / FileSignatures

A small library for detecting the type of a file based on header signature (also known as magic number).
MIT License
250 stars 41 forks

DetermineFileFormat returns null in case of "svg" or "ico" file. #69

Closed marwa-elfakir-logi closed 1 month ago

marwa-elfakir-logi commented 1 month ago

The function DetermineFileFormat correctly determines the format and extension when I test with a "png" or "jpg" image, but it returns null when I test with an "svg" or "ico" file.

neilharvey commented 1 month ago

Hey, neither of these formats is currently included in the default formats, which is why they are not recognised. ICO appears to have a signature of 00 00 01 00, so it should be fairly straightforward to implement.
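
To illustrate how a fixed header signature like this can be checked, here is a minimal sketch in Python. It is not the FileSignatures API (the library is .NET); the names `ICO_SIGNATURE` and `has_ico_signature` are hypothetical and exist only for this example.

```python
# Hypothetical illustration only; not part of FileSignatures.
ICO_SIGNATURE = bytes([0x00, 0x00, 0x01, 0x00])


def has_ico_signature(header: bytes) -> bool:
    """Return True if the first four bytes equal the ICO signature."""
    return header[:4] == ICO_SIGNATURE


# The first bytes of an ICO file versus the first bytes of a PNG file.
ico_header = b"\x00\x00\x01\x00\x01\x00"
png_header = b"\x89PNG\r\n\x1a\n"

print(has_ico_signature(ico_header))
print(has_ico_signature(png_header))
```

Because the signature sits at offset 0 and has a fixed length, only the first four bytes of the file ever need to be read.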

SVG, however, is an XML file, so it cannot reliably be detected from the file header alone.

As an example,

<svg version="1.1" width="300" height="65" xmlns="http://www.w3.org/2000/svg">
  <text x="150" y="50" font-size="50" text-anchor="middle" fill="black">Hello, World</text>
</svg>

is a valid SVG file, but so is:

<!--
 Renders "Hello, World"
-->
<svg version="1.1" width="300" height="65" xmlns="http://www.w3.org/2000/svg">
  <text x="150" y="50" font-size="50" text-anchor="middle" fill="black">Hello, World</text>
</svg>

Technically it would be possible to write a format which scans for an <svg fragment, but since the fragment might not be at the start of the file, you would need to read an arbitrary number of bytes to determine whether the file was a valid SVG, which would impact the performance of the library.
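
The trade-off can be sketched in Python as follows. This is an illustration under assumptions, not code from the library: `scan_for_svg` and its `max_bytes` parameter are hypothetical names, and the 1024-byte default is an arbitrary choice for the example.

```python
import io

SVG_MARKER = b"<svg"


def scan_for_svg(stream, max_bytes=1024):
    """Naive check: does b"<svg" occur within the first max_bytes bytes?"""
    head = stream.read(max_bytes)
    return SVG_MARKER in head


plain = b'<svg version="1.1" xmlns="http://www.w3.org/2000/svg"></svg>'
# The same SVG preceded by a comment of roughly 2000 bytes.
padded = b"<!--" + b" " * 2000 + b"-->\n" + plain

print(scan_for_svg(io.BytesIO(plain)))                    # marker is at offset 0
print(scan_for_svg(io.BytesIO(padded)))                   # marker lies beyond 1024 bytes
print(scan_for_svg(io.BytesIO(padded), max_bytes=4096))   # larger read budget finds it
```

Whatever limit is chosen, a longer leading comment can push the marker past it, so the only way to be sure is to keep reading.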

This approach might also incorrectly identify an HTML file as an SVG, since this is a valid HTML file:

<html lang="en">
  <title>Hello, World!</title>
  <body>
    <svg version="1.1" width="300" height="65" xmlns="http://www.w3.org/2000/svg">
      <text x="150" y="50" font-size="50" text-anchor="middle" fill="black">Hello, World</text>
    </svg>
  </body>
</html>
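
A short Python sketch makes the false positive concrete and shows one possible alternative: checking the root element instead of searching for a fragment. This is only an assumption about how such a check could look, using the standard `xml.etree.ElementTree` module; the function names are hypothetical, and it assumes the SVG declares the standard SVG namespace.

```python
import io
import xml.etree.ElementTree as ET

SVG_ROOT = "{http://www.w3.org/2000/svg}svg"

HTML_PAGE = b"""<html lang="en">
  <title>Hello, World!</title>
  <body>
    <svg version="1.1" width="300" height="65" xmlns="http://www.w3.org/2000/svg">
      <text x="150" y="50" font-size="50" text-anchor="middle" fill="black">Hello, World</text>
    </svg>
  </body>
</html>"""

COMMENTED_SVG = b"""<!--
 Renders "Hello, World"
-->
<svg version="1.1" width="300" height="65" xmlns="http://www.w3.org/2000/svg">
  <text x="150" y="50" font-size="50" text-anchor="middle" fill="black">Hello, World</text>
</svg>"""


def contains_svg_marker(data: bytes) -> bool:
    """Fragment search: true for any input that embeds an <svg element."""
    return b"<svg" in data


def root_is_svg(data: bytes) -> bool:
    """Parse until the first element starts and test whether it is an SVG root."""
    try:
        for _event, element in ET.iterparse(io.BytesIO(data), events=("start",)):
            return element.tag == SVG_ROOT  # the first "start" event is the root
    except ET.ParseError:
        return False  # not well-formed XML, so not treated as SVG
    return False


print(contains_svg_marker(HTML_PAGE))  # the fragment search matches the HTML page
print(root_is_svg(HTML_PAGE))          # the root element here is <html>
print(root_is_svg(COMMENTED_SVG))      # the leading comment is skipped by the parser
```

Even the root-element check has to parse past any leading comments before it reaches the first element, so it still cannot work from a fixed-size header.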

I'll create an ICO format that we can include in the main project, but I think that recognising text-based formats such as SVG is beyond the scope of this library. Apologies.

marwa-elfakir-logi commented 1 month ago

I understand. Thank you.