Serial-ATA / lofty-rs

Audio metadata library
Apache License 2.0

Parse ID3v2 timestamps #233

Closed Serial-ATA closed 4 months ago

Serial-ATA commented 1 year ago

Summary

We could parse timestamps in the following frames: TDEN, TDOR, TDRC, TDRL, TDTG.

Instead of FrameValue::Text, they would now use FrameValue::Timestamp.

The structure is described in the ID3v2.4 frame overview:

The timestamp fields are based on a subset of ISO 8601. When being as precise as possible the format of a time string is yyyy-MM-ddTHH:mm:ss (year, “-”, month, “-”, day, “T”, hour (out of 24), “:”, minutes, “:”, seconds), but the precision may be reduced by removing as many time indicators as wanted. Hence valid timestamps are yyyy, yyyy-MM, yyyy-MM-dd, yyyy-MM-ddTHH, yyyy-MM-ddTHH:mm and yyyy-MM-ddTHH:mm:ss. All time stamps are UTC.
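To make the quoted grammar concrete, here is a minimal sketch of a strict parser for it. The names `Timestamp`, `take_digits`, and `parse_timestamp` are hypothetical and not part of lofty; the range limits (for example day 1 to 31 without checking the month, seconds up to 59) are simplifying assumptions.

```rust
/// Hypothetical stand-in for the proposed timestamp type; not lofty's actual API.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Timestamp {
    pub year: u16,
    pub month: Option<u8>,
    pub day: Option<u8>,
    pub hour: Option<u8>,
    pub minute: Option<u8>,
    pub second: Option<u8>,
}

/// Takes exactly `width` ASCII digits from the front of `s` and
/// returns their numeric value together with the remaining input.
fn take_digits(s: &str, width: usize) -> Option<(u16, &str)> {
    if s.len() < width || !s.as_bytes()[..width].iter().all(u8::is_ascii_digit) {
        return None;
    }
    let (head, tail) = s.split_at(width);
    Some((head.parse().ok()?, tail))
}

/// Strictly parses `yyyy[-MM[-dd[THH[:mm[:ss]]]]]`.
pub fn parse_timestamp(input: &str) -> Option<Timestamp> {
    // (separator, minimum, maximum) for month, day, hour, minute, second.
    const FIELDS: [(char, u8, u8); 5] =
        [('-', 1, 12), ('-', 1, 31), ('T', 0, 23), (':', 0, 59), (':', 0, 59)];

    let (year, mut rest) = take_digits(input, 4)?;
    let mut parsed = [None; 5];

    for (slot, (sep, min, max)) in parsed.iter_mut().zip(FIELDS) {
        // Precision may only be reduced from the end, so stop at end of input.
        if rest.is_empty() {
            break;
        }
        let after_sep = rest.strip_prefix(sep)?;
        let (value, tail) = take_digits(after_sep, 2)?;
        let value = value as u8; // at most 99, so the cast cannot truncate
        if value < min || value > max {
            return None;
        }
        *slot = Some(value);
        rest = tail;
    }

    // Anything left after the seconds field is invalid.
    if !rest.is_empty() {
        return None;
    }
    let [month, day, hour, minute, second] = parsed;
    Some(Timestamp { year, month, day, hour, minute, second })
}
```

With this sketch, `parse_timestamp("2023-07")` yields a value with only `year` and `month` set, while inputs such as `2023-7` or `2023-07T12` are rejected because they do not match any of the six valid forms.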

API design

pub struct TimestampFrame {
    // Year is required, and ID3v2 does not allow negative years
    pub year: u16,
    pub month: Option<u8>,
    pub day: Option<u8>,
    pub hour: Option<u8>,
    pub minute: Option<u8>,
    pub second: Option<u8>,
}

pub enum FrameValue {
    Timestamp(TimestampFrame),
    // ...
}

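As an illustration of how such a struct could be written back into a tag, the following sketch adds a `Display` implementation to the struct proposed above. The implementation is an assumption made for this example, not something the proposal specifies; it stops at the first missing field, since precision can only be dropped from the end.

```rust
use std::fmt;

/// The struct as proposed in this issue.
pub struct TimestampFrame {
    pub year: u16,
    pub month: Option<u8>,
    pub day: Option<u8>,
    pub hour: Option<u8>,
    pub minute: Option<u8>,
    pub second: Option<u8>,
}

/// Hypothetical serialization back to the ID3v2.4 text form.
impl fmt::Display for TimestampFrame {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:04}", self.year)?;

        let fields = [
            ('-', self.month),
            ('-', self.day),
            ('T', self.hour),
            (':', self.minute),
            (':', self.second),
        ];
        for (sep, field) in fields {
            match field {
                // Each optional field is zero-padded to two digits.
                Some(value) => write!(f, "{}{:02}", sep, value)?,
                // A gap ends the output; later fields are ignored.
                None => break,
            }
        }
        Ok(())
    }
}
```

One consequence of this choice is that a value with `month: Some(7)`, `day: None`, and `hour: Some(5)` is written as `2023-07`; whether such a value should instead be rejected is a design decision the proposal leaves open.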
uklotzde commented 11 months ago

Leading zeros in month/day fields are often omitted or replaced by space(s). The parser should tolerate these deviations from the standard, because they are very common.

If these recoverable errors can be detected easily, then ParsingMode::Strict might reject them. If that would require two different implementations, then it's probably not worth the effort.
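A rough sketch of how one code path might serve both behaviours is shown below. `Mode`, `parse_field`, and `parse_date` are hypothetical names used only for illustration (a local enum stands in for lofty's `ParsingMode`), only the date part is handled, and range checks are omitted for brevity.

```rust
/// Hypothetical stand-in for a parsing mode; not lofty's `ParsingMode`.
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Strict,
    Lenient,
}

/// Parses a single month or day field.
/// Lenient mode accepts a missing leading zero ("7") or a space in its place (" 7").
pub fn parse_field(raw: &str, mode: Mode) -> Option<u8> {
    let digits = raw.trim_start_matches(' ');

    // Well-formed means exactly two characters with nothing trimmed away.
    let well_formed = digits.len() == 2 && digits.len() == raw.len();
    if mode == Mode::Strict && !well_formed {
        return None;
    }

    // Shared validation: one or two ASCII digits, no signs or other characters.
    if digits.is_empty() || digits.len() > 2 || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok()
}

/// Parses `yyyy[-MM[-dd]]` and returns `(year, month, day)`.
pub fn parse_date(input: &str, mode: Mode) -> Option<(u16, Option<u8>, Option<u8>)> {
    let mut parts = input.split('-');

    let year_str = parts.next()?;
    if year_str.len() != 4 || !year_str.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let year = year_str.parse().ok()?;

    let month = match parts.next() {
        Some(raw) => Some(parse_field(raw, mode)?),
        None => None,
    };
    let day = match parts.next() {
        Some(raw) => Some(parse_field(raw, mode)?),
        None => None,
    };

    // More than three dash-separated parts is never valid.
    if parts.next().is_some() {
        return None;
    }
    Some((year, month, day))
}
```

In this sketch the strict check is a single early return in `parse_field`, so both modes share the rest of the parser. Under these assumptions `2023- 7-4` parses in lenient mode and is rejected in strict mode, while `2023-07-04` parses in both.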