TianyiShi2001 / audiotags

Unified IO for different types of audio metadata
https://tianyishi2001.github.io/audiotags
MIT License
41 stars 29 forks source link

Add `Tag::date_raw` to get date string before `Timestamp` conversion #45

Open mnpqraven opened 4 months ago

mnpqraven commented 4 months ago

Right now date() will always attempt a conversion to Timestamp, and if the date format is not correct then not all fields will be returned. I think this can be a good escape hatch for getting the original date and parse to one's own format if needed.

Serial-ATA commented 3 months ago

@pinkforest This looks fine.

@mnpqraven out of curiosity, what format are your dates in?

mnpqraven commented 3 months ago

@pinkforest This looks fine.

@mnpqraven out of curiosity, what format are your dates in?

@Serial-ATA they are mostly in 'YYYY.MM.DD' so most of the parsing would fail or only the year is recognized and the days and months are 'None'

Serial-ATA commented 3 months ago

Is that by choice, or is there some software out there writing dates like that? Both ID3v2 and Vorbis Comments are specified to have the yyyy-MM-ddTHH:mm:ss format.

pinkforest commented 3 months ago

Thanks you both - Could we by any chance put these behind something like raw opt-in non-default feature ?

I suspect exposing raw by-default would be hazardous as people may expect that it's well formed but it may have been injected with hazards where they must do their own sanitisation and payload checks for non-standard dates.

Should the date parsing be opportunistic conditional e.g. to handle YYYY.MM.DD ? And where some fields are empty etc.

Then also the problem with that is US/Rest date divide YYYY.DD.MM vs YYYY.MM.DD etc. big headache handling those yeah

Serial-ATA commented 3 months ago

I think just having a doc comment explaining that the output could very well not be spec-compliant would be enough. In 99% of cases Tag::date() will give you the correct output, since the format of the dates is specified to follow ISO 8601.

pinkforest commented 3 months ago

Ok if that is the case and we worry about ergonomic overs spec complianc e-

Then why not change so it gets properly used and handled via enum-data:

pub enum TimestampTag {
   /// [`id3::Timestamp`]
   Id3(id3::Timestamp)
   /// Unknown
   Unknown(String)
}
    fn set_date(&mut self, date: TimestampTag) {
    }

That ensures people are aware of it and handle it to some level.

Serial-ATA commented 3 months ago

That might be nicer, @mnpqraven would you be interested in implementing it that way?

mnpqraven commented 3 months ago

Then why not change so it gets properly used and handled via enum-data:

pub enum TimestampTag {
   /// [`id3::Timestamp`]
   Id3(id3::Timestamp)
   /// Unknown
   Unknown(String)
}

I've added the raw-date feature that includes the function and the Unknown variant of the enum, so now default users will only see Id3.

    fn set_date(&mut self, date: TimestampTag) {
    }

That ensures people are aware of it and handle it to some level.

Would changing from Timestamp to TimestampTag be a breaking change or major change? Right now I've update set_date to be using the above but i'm wondering if there's a way we can avoid changing the api

pinkforest commented 3 months ago

Would changing from Timestamp to TimestampTag be a breaking change

Yes but it's okay as we can bump and it will be improve the API. Alternatively we can mark old functions #[deprecated] and add another that returns the enum but I don't think it's worth the hassle given we want people to handle special cases.