mmalecot / file-format

Apache License 2.0

file-format


Crate for determining the file format of a given file or stream.

It provides a variety of functions for identifying a wide range of file formats, including ZIP, Compound File Binary (CFB), Extensible Markup Language (XML) and more.

It checks the signature of the file to determine its format and, where a specific reader is available, uses it to refine the identification. If the signature is not recognized, the crate falls back to the default file format, which is Arbitrary Binary Data (BIN).
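
The signature check and fallback described above can be pictured with a small standard-library-only sketch. It is an illustration of the general technique, not the crate's actual implementation: the `SIGNATURES` table, the `detect` function, and the three entries chosen are all hypothetical.

```rust
/// Hypothetical signature table: (magic bytes expected at offset 0, short name).
/// A real table would be far larger and would also match at other offsets.
const SIGNATURES: &[(&[u8], &str)] = &[
    (b"%PDF-", "PDF"),
    (&[0xFF, 0xD8, 0xFF], "JPEG"),
    (b"PK\x03\x04", "ZIP"),
];

/// Returns the short name of the first signature that prefixes `bytes`,
/// or "BIN" (arbitrary binary data) when no signature matches.
fn detect(bytes: &[u8]) -> &'static str {
    SIGNATURES
        .iter()
        .find(|(magic, _)| bytes.starts_with(magic))
        .map(|(_, name)| *name)
        .unwrap_or("BIN")
}
```

With this sketch, `detect(b"%PDF-1.7")` matches the first entry, while an unrecognized or empty input falls through to the `"BIN"` default, mirroring the fallback behaviour the paragraph describes.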

Examples

Determine the format of a file:

use file_format::{FileFormat, Kind};

fn main() -> std::io::Result<()> {
    // Reading the file can fail, so the error is propagated with `?`.
    let fmt = FileFormat::from_file("fixtures/document/sample.pdf")?;
    assert_eq!(fmt, FileFormat::PortableDocumentFormat);
    assert_eq!(fmt.name(), "Portable Document Format");
    assert_eq!(fmt.short_name(), Some("PDF"));
    assert_eq!(fmt.media_type(), "application/pdf");
    assert_eq!(fmt.extension(), "pdf");
    assert_eq!(fmt.kind(), Kind::Document);
    Ok(())
}

Determine the format from bytes:

use file_format::{FileFormat, Kind};

fn main() {
    // The first three bytes of a JPEG stream are enough to match its signature.
    let fmt = FileFormat::from_bytes(&[0xFF, 0xD8, 0xFF]);
    assert_eq!(fmt, FileFormat::JointPhotographicExpertsGroup);
    assert_eq!(fmt.name(), "Joint Photographic Experts Group");
    assert_eq!(fmt.short_name(), Some("JPEG"));
    assert_eq!(fmt.media_type(), "image/jpeg");
    assert_eq!(fmt.extension(), "jpg");
    assert_eq!(fmt.kind(), Kind::Image);
}

Usage

Add this to your Cargo.toml:

[dependencies]
file-format = "0.25"

Crate features

All features below are disabled by default.

Reader features

These features enable the detection of file formats that require a specific reader for identification.
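
To see why some formats need a reader at all, consider container formats built on ZIP: they share the same leading signature, so the signature alone cannot tell them apart. The sketch below is a hypothetical, standard-library-only illustration of one possible refinement step, which peeks at the name of the first entry in the archive. The function names and the heuristics are assumptions made for this example and do not describe how the crate's readers work internally.

```rust
/// Returns the file name stored in the first ZIP local file header, if any.
/// Layout assumed: 4-byte signature, 22 bytes of fixed fields, a 2-byte
/// little-endian name length at offset 26, a 2-byte extra-field length at
/// offset 28, then the name itself starting at offset 30.
fn first_zip_entry_name(bytes: &[u8]) -> Option<&str> {
    const LOCAL_HEADER_SIG: &[u8] = b"PK\x03\x04";
    const HEADER_LEN: usize = 30;
    if bytes.len() < HEADER_LEN || !bytes.starts_with(LOCAL_HEADER_SIG) {
        return None;
    }
    let name_len = u16::from_le_bytes([bytes[26], bytes[27]]) as usize;
    // `get` returns None instead of panicking if the input is truncated.
    let name = bytes.get(HEADER_LEN..HEADER_LEN + name_len)?;
    std::str::from_utf8(name).ok()
}

/// Heuristic guess only: the first entry name hints at the container type.
fn refine_zip(bytes: &[u8]) -> &'static str {
    match first_zip_entry_name(bytes) {
        Some("[Content_Types].xml") => "probably Office Open XML",
        Some("mimetype") => "probably EPUB or OpenDocument",
        Some(_) => "plain ZIP",
        None => "not a ZIP local file header",
    }
}
```

A real reader would inspect more than the first entry and would handle truncated or unusual archives more carefully; the point is only that this kind of identification requires parsing beyond the leading signature.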

Supported file formats

Archive

Audio

Compressed

Database

Diagram

Disk

Document

Ebook

Executable

Font

Formula

Geospatial

Image

Metadata

Model

Other

Package

Playlist

Presentation

ROM

Spreadsheet

Subtitle

Video

Fixtures

The fixtures are samples of file formats used for testing purposes, located in the fixtures directory and organized by kind in subdirectories. These samples are often intentionally truncated to reduce size, which can sometimes prevent them from being fully decoded by compatible software.

License

This project is licensed under either of: