Serial-ATA / lofty-rs

Audio metadata library
Apache License 2.0
179 stars 34 forks source link

Stream picture tags #345

Open probablykasper opened 5 months ago

probablykasper commented 5 months ago

Summary

The ability to read the picture tags as a stream for improve performance

API design

let probe = lofty::Probe::open(path);

let tags_reader = probe.into_reader();

// Consumes tags_reader and returns an iterator that skips non-picture tags
let pictures_iterator = tags_reader.pictures();

let first_picture_reader = pictures_iterator.next()?;

let bytes = picture_reader.data; // std::io::Bytes<> iterator
let mime_type = picture_reader.mime_type; // Access other info about the picture
Serial-ATA commented 5 months ago

Thanks for the suggestion. Could you explain a little more what you're looking for, or your use case for this? I'm not sure I entirely understand what's being suggested. I interpret it as turn Lofty into an event-based parser like xml-rs.

probablykasper commented 5 months ago

Yeah I guess it is like xml-rs.

Use case

For my music player, I'm passing song artworks into a webview to show them. I'd like to make that faster by only parsing the first picture (skipping the other items), and to stream the picture to the webview while it's being parsed.

Screen recording https://github.com/Serial-ATA/lofty-rs/assets/11315492/68d56822-ed3a-4c33-add8-8a5cd02bbb53

Another thing I didn't think of is the ability to cancel mid-parsing, that would probably be the most significant improvement.

Serial-ATA commented 5 months ago

I have thought about an event-based parser before. Lofty currently doesn't offer much caller control over the parsing process, which I don't like.

I'd like Lofty to be a low level crate that people can build abstractions over. TaggedFile and Tag could still exist as abstractions that simply retain all information from every event.

This would, however, require a decent sized rewrite of the crate. I'm not opposed to doing that, but there'd need to be a more fleshed out design first.

Some initial thoughts:

let tags_reader = probe.into_reader();

Since Probe, TaggedFile, and Tag need to be backed by concrete formats, a generic event stream could exist where new items first have their key converted to an ItemKey. These events would be separate from keys that cannot be represented as an ItemKey. That could finally remove ItemKey::Unknown. That'd be great.

Sketch of what I mean:

enum EventKind {
    Item(ItemKey),
    UnknownItem(String),
}

while let Some(e) = events.next() {
    match e {
        Ok(EventKind::Item(_key)) => e.skip(),
        Ok(EventKind::UnknownItem(key)) => println!("Encountered unknown item: {key}!")
    }
}

let bytes = picture_reader.data; // std::io::Bytes<> iterator let mime_type = picture_reader.mime_type; // Access other info about the picture

In APE, for example, we do not have access to the MIME type information ahead of time. We actually read some of the picture data to determine that. Other formats don't have that issue, though.

probablykasper commented 5 months ago

Yeah that sounds great, though I understand it would be a lot of work

Serial-ATA commented 5 months ago

Given this will be a rewrite and its something I've wanted to do anyway, I probably won't work on anything new till this is done. I'm going to take a break from Lofty and work on this when I come back. Hopefully I'm able to make this a reality.

probablykasper commented 5 months ago

Sounds good! Take all the time you need, appreciate your work :DD

uklotzde commented 5 months ago

Sounds good! Take all the time you need, appreciate your work :DD

Want to chime in. It's a pleasure to work with you, @Serial-ATA!