mono / taglib-sharp

Library for reading and writing metadata in media files
GNU Lesser General Public License v2.1

RFC: New Tag property #69

Open Starwer opened 7 years ago

Starwer commented 7 years ago

[edit: I've fixed a few definitions and added better examples] TagLibSharp inherits heavily from TagLib, which initially targeted MP3 ID3 tags only. This legacy is visible in the names and selection of properties available in the base Tag class of TagLibSharp. With the addition of more tag formats, and especially video support, I have the impression that we should clarify and extend the set of available Tag properties to handle videos better. Of course, these additions should also benefit the audio side.

Examples: [Back to the Future 2 / Game of Thrones Season 1, episode 9]

Any thoughts on that?

Starwer commented 7 years ago

I've started the development of this feature.

rboy1 commented 6 years ago

Moving it over:

If we are adding new tags to support video (which I really like), here are some of the tags most commonly used by video players and DVR systems across the world today:

  1. Title (present)
  2. Subtitle
  3. Genre (multiple/array or semicolon-delimited) (present?)
  4. Network or Channel (present?)
  5. Description (present)
  6. Credits (multiple/array or semicolon-delimited) (present?)
  7. Season (number)
  8. Episode (number)
  9. String to indicate type (movie, tv series, sports, news, documentary etc)
  10. Recorded Date and Time (UTC format?)
  11. Broadcast Date and Time (UTC format?)
  12. Premiere Date and Time (UTC format?)
  13. Rating (Parental)

Each platform (SageTV, nPVR, TiVO, HDHR, IceTV, Media Portal, WMC, Kodi, etc.) has its own specific tags, but this is the list of the most common parameters I could gather across the platforms.

@Starwer what do you think?

Starwer commented 6 years ago

I think the current implementation (after integration of #71) covers points 1, 2, 3, 5, 7 and 8. What remains:

  1. Network or Channel

    Honestly, I don't see the point of this one... I couldn't care less about who first broadcast a movie/series... but why not...

  2. Credits (multiple/array or semicolon-delimited)

    Performers/Roles is now present. I would suggest using a Dictionary<string,string[]> to store the remaining credits, mapping each department (writer, director, music composer...) to a list of persons. To standardize the departments, I propose using the entities defined by Matroska at https://www.matroska.org/technical/specs/tagging/index.html. This is, to my knowledge, the most complete standard.

  3. String to indicate type (movie, tv series, sports, news, documentary etc)

    I guess this could be a string field called ContentType?

  4. Recorded Date and Time (UTC format?)

  5. Broadcast Date and Time (UTC format?)

  6. Premiere Date and Time (UTC format?)

    For dates, I'd rather use DateTime?, as done in the (new) DateTagged field.

  7. Rating (Parental)

    What does it mean? The minimum age required to watch a movie? Or could you give the possible values for this one?

  8. Rating

    I'd also add this one, likewise taken from Matroska: "A numeric value defining how much a person likes the song/movie. The number is between 0 and 5 with decimal values possible (e.g. 2.7), 5(.0) being the highest possible rating."

  9. Trailer/Clip URL

    It could be nice to have access to trailers, or to the video clip for a piece of music?

Except for the Matroska format, most of these fields are non-standard.
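A minimal sketch of how the remaining properties could look, assuming the types suggested above (Dictionary<string,string[]> for credits, DateTime? for dates, a 0 to 5 score). The class name VideoTagSketch and every property name except ContentType are hypothetical and are not existing TagLib# API; the department keys follow Matroska tag names, and the values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for the properties discussed above.
// This is NOT the TagLib# Tag class, only an illustration of the proposed types.
public class VideoTagSketch
{
    private double? score;

    // Department (e.g. the Matroska names "DIRECTOR", "WRITTEN_BY") mapped to persons.
    // Keys are compared case-insensitively.
    public Dictionary<string, string[]> Credits { get; } =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase);

    // Free-form type: "movie", "tv series", "sports", "news", "documentary"...
    public string ContentType { get; set; }

    // Nullable dates: null means "not tagged".
    public DateTime? RecordedDate { get; set; }
    public DateTime? BroadcastDate { get; set; }
    public DateTime? PremiereDate { get; set; }

    // Personal rating between 0 and 5, decimals allowed; null means "not rated".
    public double? Score
    {
        get { return score; }
        set
        {
            if (value.HasValue &&
                (double.IsNaN(value.Value) || value.Value < 0.0 || value.Value > 5.0))
            {
                throw new ArgumentOutOfRangeException(nameof(value), "Score must be between 0 and 5.");
            }
            score = value;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var tag = new VideoTagSketch
        {
            ContentType = "tv series",
            RecordedDate = new DateTime(2017, 4, 14, 10, 13, 0, DateTimeKind.Utc),
            Score = 4.5
        };
        tag.Credits["DIRECTOR"] = new[] { "Jane Doe" };                 // illustrative value
        tag.Credits["WRITTEN_BY"] = new[] { "John Roe", "Ann Smith" };  // illustrative values

        Console.WriteLine(string.Join(", ", tag.Credits["written_by"]));
        Console.WriteLine(tag.PremiereDate.HasValue ? "premiere known" : "premiere not tagged");
    }
}
```

Keeping Score as double? and the dates as DateTime? lets a consumer tell "not tagged" apart from a real value without any string parsing.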

elfalem commented 6 years ago

Instead of specifying individual tag names, it might be best to expose a general interface that allows reading and writing all types of tags. This way, the choice of which tags to use and what to name them is left to the consumer of this library. I think the design and architecture of the Python mutagen project is quite informative for this project.

Starwer commented 6 years ago

@elfalem : You are absolutely right, we must keep the focus on the user interface and derive the internals from that. But don't get me wrong, that's what we are doing here. I only mentioned Matroska as an inspiration for which tags could be useful to add for videos (and also as a reference for the arbitrary names that can be used for the other, less well-defined standards). Mutagen is yet another well-designed audio tag editor. There are surely a handful of other projects that do this well. The fact is that audio tagging has been well supported for 12-15 years. But, strangely, video never got that kind of attention. It is still rare nowadays to see a properly tagged video, whereas audio files are well tagged most of the time. That's the challenge I'd like to address in this issue: how can we make video tag support as good as audio tag support?

elfalem commented 6 years ago

@Starwer I agree tag support for videos is pretty much nonexistent compared to audio. The task of creating a standard set of properties is also challenging since music is a specific type of audio while video could be movies, shows, music videos, sporting events, etc... I think this challenge could be avoided by taking the same approach as mutagen.

Instead of defining properties such as file.Tag.Episode, a getter can be used for example as file.Tag.get("Episode"). This way, TagLib# doesn't have to maintain a list of properties. It will just read all the tags a file has and return the tag that the user wants if it exists.
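A minimal sketch of what such a name-based accessor could look like, assuming an in-memory store. GenericTagSketch and its Get/Set methods are hypothetical names used for illustration only; they are not part of TagLib# or mutagen.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name-based tag store: the library keeps no list of known
// properties and simply returns whatever the file contains under a name.
public class GenericTagSketch
{
    private readonly Dictionary<string, string[]> fields =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase);

    // Stores the values under the given name; passing no values removes the tag.
    public void Set(string name, params string[] values)
    {
        if (values == null || values.Length == 0)
            fields.Remove(name);
        else
            fields[name] = values;
    }

    // Returns the stored values, or an empty array when the tag is absent.
    public string[] Get(string name)
    {
        string[] values;
        return fields.TryGetValue(name, out values) ? values : new string[0];
    }
}

public static class Program
{
    public static void Main()
    {
        var tag = new GenericTagSketch();
        tag.Set("Episode", "9");

        Console.WriteLine(tag.Get("episode").Length); // tag present, looked up case-insensitively
        Console.WriteLine(tag.Get("Season").Length);  // tag absent, empty array
    }
}
```

Everything comes back as strings here, so the caller decides how to interpret "9"; that is the trade-off between this approach and typed properties such as file.Tag.Episode.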

Starwer commented 6 years ago

@elfalem : I see what you mean now... this is the usual custom-tag VS standard tags debate. Let me elaborate on that.

Let's accept our (or is it just mine?) ambitions for TagLib#. If it becomes the first free library that makes it possible to properly tag videos of all types, which is now only a couple of man-days away, more and more software should start using it, instead of spending months of effort on something not nearly as good. Let's assume you then get an ecosystem where you have a tagger (scraping data from the Internet), a media manager and a player for your videos, which all use TagLib#. Two cases:

  1. Let's assume TagLib# uses custom tags. People have complete freedom over which tags they use, what these are called and the format they want them to have. They can unleash their creativity in this unlimited world. Now the downside:

    • The tagger thinks it is wise to write tags this way: Serie="Game of Thrones", Date="14th of April, 2011", Rating="8/10".
    • The manager will expect: Collection="Game of Thrones", RecordedDate="2017-04-14+10:13", Score="4".
    • The video player has decided that the correct way of tagging is: Group="Game of Thrones", DateRecord="04/14/2017 at 10:13 CET", Rating="16+".

    In other words, even though they use the same library to handle tags, they don't understand each other. The media manager cannot sort by date, because it doesn't find the date field it expects, nor does it understand the date format used. The rating is completely misinterpreted by the video player, as it expects a parental rating whereas the tagger uses Rating for the score that people give to the movie. In the end, your tagging is rendered useless because the software cannot interpret it correctly. And this is still an optimistic case. Now imagine you get videos from different sources that have used different taggers, which of course all have their own way to tag and format, because there is no standard!
  2. TagLib# uses well-defined properties, and tries its best to map these to the standard fields of each tagging format (Mkv, ID3V2, Riff...). It still decides arbitrarily which non-standard field to create when a property doesn't match any standard field of the tag specification. The tagger, manager and player will all write/expect:

    • Album="Game of Thrones", RecordedDate="2017-04-14 10:13", Score=4.0. They don't even have to parse the date and rating, as these are provided to them as the .NET types DateTime and double. They understand each other perfectly, without any effort from their respective developers. Your tagging may still be incomplete compared to the extensive list of information you wanted to put in with custom tags. But you fully leverage what is actually tagged. The specified fields are even compatible with other software that does not use TagLib#, except for software that has chosen the custom-tag path, but that software does not understand anybody but itself anyway... As for the non-standard (unspecified) tags created by TagLib#, they become a de facto standard in the industry as TagLib# gets used more and more.

So in one sentence: no, I'm not in favor of custom-tags. TagLib# took the right path.
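The interoperability problem from case 1 can be reproduced in a few lines. This is only a sketch: the three date strings are the ones from the example above, while DateInterop, ManagerFormat and ManagerCanRead are hypothetical names standing in for the media manager's parsing logic.

```csharp
using System;
using System.Globalization;

public static class DateInterop
{
    // Format the hypothetical media manager expects, e.g. "2017-04-14+10:13".
    public const string ManagerFormat = "yyyy-MM-dd'+'HH:mm";

    // True only when the raw custom-tag string matches the manager's own format.
    public static bool ManagerCanRead(string raw, out DateTime parsed)
    {
        return DateTime.TryParseExact(raw, ManagerFormat, CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out parsed);
    }

    public static void Main()
    {
        // Case 1: each application wrote its date as a free-form string.
        string[] customDates =
        {
            "14th of April, 2011",      // tagger
            "2017-04-14+10:13",         // manager
            "04/14/2017 at 10:13 CET"   // video player
        };
        foreach (string raw in customDates)
        {
            DateTime parsed;
            bool ok = ManagerCanRead(raw, out parsed);
            Console.WriteLine("{0,-26} readable by manager: {1}", raw, ok);
        }

        // Case 2: the library hands out DateTime values, so sorting needs no parsing.
        DateTime[] recorded =
        {
            new DateTime(2017, 4, 14, 10, 13, 0),
            new DateTime(2011, 4, 14)
        };
        Array.Sort(recorded);
        Console.WriteLine("Oldest recording: {0:yyyy-MM-dd}", recorded[0]);
    }
}
```

With string tags, only the manager's own format parses; with a typed property, every consumer gets the same DateTime and the formatting question never arises.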

elfalem commented 6 years ago

@Starwer Thank you for listing detailed reasons. I also do hope that TagLib# becomes the go-to library for tagging audio and video files for .NET projects. However, I maintain that attempting to dictate a standard for tagging is not the best approach. If/when tagging video files becomes popular, a consensus will emerge on tagging conventions. Regardless, I do hope the best for this project.

Starwer commented 6 years ago

@elfalem : BTW, TagLib# already supports custom tags for most (if not all) formats, but it is per-format. The only thing missing is an abstraction over the custom tags. This could be done, although I still think it has lower priority than the support of standard tags (but if there are volunteers, please go ahead!).

In my experience, when we let everyone do things as they want and then try to standardize them, this leads to total confusion and the standard never fully catches on. In the end, to do a proper compatibility job, software then has to support not only the standard but also every specific thing that the main players made up, including their specific bugs. And you must support legacy constructs as well.

Think of HTML and the W3C. Back in the 90's, there was a plethora of browsers, each with its own implementation of what HTML should be. Websites were developed with a mention such as "Optimized for Netscape" or "Optimized for Internet Explorer"... this was, from the beginning, a total mess for the end user, but also a total headache for the developer who was trying to please most browsers. Then the W3C came and tried to standardize all that misery. But still, even today, this standard is not a total success. A lot of websites still use the old constructs, compliance is still poor, and cross-browser compatibility is brittle. A lot of pain could have been avoided by doing things in the right order.

Custom constructs should only be a last-resort fallback for when the standard doesn't cover some aspect.

elfalem commented 6 years ago

I searched for examples of how a custom tag is set but was unable to find examples. Can you provide a sample of how this could be done currently (for example for a flac vorbis tag)?

I'm interested in contributing to the implementation of abstracted custom-tag support, although I understand it would have low priority. There are other areas, such as the port to dotnet core and writing documentation, that I would like to contribute to as well.

Starwer commented 6 years ago

@elfalem : That's true, there are some hidden gems in TagLib# that are not properly documented...

This snippet should make your day:

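    // Assumes "using TagLib;" at the top of the file.
    var rgFile = File.Create("my_file.flac");
    // Get the format-specific tag (Xiph comment for FLAC/Ogg) to reach custom fields.
    var custom = (TagLib.Ogg.XiphComment) rgFile.GetTag(TagTypes.Xiph);
    custom.SetField("MY_TAG", new string[] { "value1", "value2" });
    string[] myfields = custom.GetField("MY_TAG");
    custom.RemoveField("OTHER_FIELD");
    rgFile.Save();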

The trick is to find the proper GetTag(<type>) to call. Each Tag type then has its own interface to access the fields. That's where an abstraction would make it a lot simpler... If you want to have a go at developing it, please go ahead! That's the good thing about a free software project: you don't have to care about road-maps and priorities... if you feel like adding a feature, just do it!

elfalem commented 6 years ago

@Starwer That did make my day :) Thanks so much for your help! I will take a stab at this once #60 is merged.

Starwer commented 6 years ago

@descriptor: Could this issue be labelled as "RFC"?