quodlibet / mutagen

Python module for handling audio metadata
https://mutagen.readthedocs.io
GNU General Public License v2.0

Support RIFF INFO chunk metadata (AVI, WAV, XMA, xWMA, RMI, DLS) #207

Open lazka opened 9 years ago

lazka commented 9 years ago

Originally reported by: Freso Fenderson (Bitbucket: Freso, GitHub: Freso)


https://en.wikipedia.org/wiki/Resource_Interchange_File_Format#Use_of_the_INFO_chunk
https://en.wikipedia.org/wiki/Resource_Interchange_File_Format#INFO_chunk_placement_problems
https://en.wikipedia.org/wiki/Resource_Interchange_File_Format#RIFF_Info_Tags

See also discussion downstream in https://github.com/sampsyo/beets/issues/1160 and http://tickets.musicbrainz.org/browse/PICARD-653


lazka commented 8 years ago

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


(Note that you should use mutagen and not mutagenx; the latter was a Python 3 fork that is no longer in development.)

lazka commented 8 years ago

Original comment by Amias Channer (Bitbucket: amias_channer, GitHub: Unknown):


I could really use this functionality and would be happy to help code and test it. I'm new to mutagenx internals, so I might need some pointers.

I am currently using mutagenx in some code to test an audio playing system and have a library of sample files with which to test it.

lazka commented 8 years ago

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


Some thoughts on the API:

Add a RIFFFileType and a RIFFInfoTags for exposing the info block. RIFFFileType takes either ID3, EasyID3 or RIFFInfoTags as tagging format.

Make RIFFFileType subclasses for WAV, AVI, RMI etc.

One problem regarding RIFFInfoTags is the text encoding of the info block. Windows uses latin-1 and VLC uses utf-8 for example.

Possible solutions are to expose the values as bytes or to allow passing in a preferred encoding (defaulting to latin-1).
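
A minimal sketch of both options, assuming the body of the LIST chunk is already in memory. The function name `parse_info_list` and its `encoding` parameter are hypothetical and not part of mutagen; passing `encoding=None` returns the raw bytes instead of decoded text.

```python
import struct


def parse_info_list(payload: bytes, encoding="latin-1") -> dict:
    """Parse the body of a LIST chunk whose list type is INFO.

    `payload` is everything after the LIST chunk's 8-byte header, so it
    starts with the 4-byte list type b"INFO". (Hypothetical helper.)
    """
    if payload[:4] != b"INFO":
        raise ValueError("not an INFO list")
    tags = {}
    offset = 4
    while offset + 8 <= len(payload):
        chunk_id, size = struct.unpack_from("<4sI", payload, offset)
        offset += 8
        raw = payload[offset:offset + size]
        # Values are NUL-terminated strings; drop the terminator.
        raw = raw.split(b"\x00", 1)[0]
        if encoding is None:
            value = raw  # option 1: expose the bytes untouched
        else:
            value = raw.decode(encoding, errors="replace")  # option 2
        tags[chunk_id.decode("ascii", errors="replace")] = value
        # Sub-chunks are word-aligned: an odd size is followed by a pad byte.
        offset += size + (size & 1)
    return tags
```

With the latin-1 default, a file tagged by a UTF-8 writer would decode without error but show mojibake for non-ASCII characters, which is why a caller-supplied encoding (or raw bytes) seems necessary.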

andimarafioti commented 8 years ago

I would like to contact Amias Channer in order to work on the wav support. Does anyone here know where I could find him?

arigit commented 8 years ago

+1 on mutagen support for WAV files. Trying to use python's wave module to find bitsPerSample and sampleRate for WAVs created by ffmpeg from 24-bit flacs, and it fails miserably.

ffmpeg -i CDImage.flac -acodec pcm_s24le output.wav ... wave.Error: unknown format: 65534

Already using mutagen to extract the same info from FLACs so being able to use it for WAV would be ideal
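
For reference, format tag 65534 (0xFFFE) is WAVE_FORMAT_EXTENSIBLE. A minimal standard-library sketch of reading the sample rate and bit depth straight from the `fmt ` chunk follows; `read_wav_format` is a hypothetical helper, not a mutagen or `wave` API, and it assumes the file object is positioned at the start of the file.

```python
import io
import struct

WAVE_FORMAT_EXTENSIBLE = 0xFFFE  # 65534, the value in the error above


def read_wav_format(fileobj):
    """Return (format_tag, channels, sample_rate, bits_per_sample)."""
    riff, _size, wave_id = struct.unpack("<4sI4s", fileobj.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        header = fileobj.read(8)
        if len(header) < 8:
            raise ValueError("no fmt chunk found")
        chunk_id, size = struct.unpack("<4sI", header)
        if chunk_id == b"fmt ":
            fmt = fileobj.read(size)
            # The first 16 bytes are the same for PCM and extensible formats.
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", fmt
            )
            return tag, channels, rate, bits
        # Skip other chunks, including the pad byte after odd sizes.
        fileobj.seek(size + (size & 1), io.SEEK_CUR)
```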

jcea commented 7 years ago

I am also interested in this. I have WAV and WMA files I would like to be able to parse. Thanks.

Borewit commented 7 years ago

Started initial RIFF/WAVE implementation. I start with being able to read basic stream information such as:

Although this is not an official standard (and there is no much better alternative either), I will store metadata using ID3v2.

I will use MusicBrainz Picard and my own library music-metadata to cross check functionality.

Constructive contributions are more than welcome; I have zero experience with Python.

Borewit commented 7 years ago

@lazka, can you help me out with some coding?

        self._RiffFile__fileobj.seek(self.__next_offset)
        self._RiffFile__fileobj.write(pack('<4si', id_.ljust(4).encode('ascii'), 0))
        self._RiffFile__fileobj.seek(self.__next_offset)
        chunk = RiffChunkHeader(self._RiffFile__fileobj)
        self[u'RIFF']._update_size(self[u'RIFF'].data_size + chunk.size)

(source: /mutagen/wave.py:190)

I want to access __fileobj declared in the parent (super) class RiffFile. I only get it to work using self._RiffFile__fileobj, and not using self.__fileobj.

lazka commented 7 years ago

See https://docs.python.org/3.6/tutorial/classes.html#private-variables
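
A small illustration of the name mangling described there. `RiffFile` matches the class name in the snippet above; `WaveFile` and the method names are made up for this sketch.

```python
class RiffFile:
    def __init__(self, fileobj):
        # Double leading underscore: stored on the instance as
        # _RiffFile__fileobj because of name mangling.
        self.__fileobj = fileobj
        # Single leading underscore: a convention only, no mangling.
        self._fileobj = fileobj


class WaveFile(RiffFile):  # hypothetical subclass for illustration
    def mangled(self):
        # Inside WaveFile this is rewritten to self._WaveFile__fileobj,
        # which was never assigned, so it raises AttributeError.
        return self.__fileobj

    def explicit(self):
        # Works, but spells out the parent's mangled name by hand.
        return self._RiffFile__fileobj

    def conventional(self):
        # The usual choice when subclasses are meant to access the attribute.
        return self._fileobj
```

So an attribute that subclasses need is normally given a single leading underscore in the parent class.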

Borewit commented 7 years ago

@lazka: Thanks!

How to store metadata in RIFF/WAV files is not a straightforward thing:

Looks like there are two options (which can be combined):

  1. Store metadata in the RIFF-LIST-INFO chunk
  2. Store metadata in non-standard, but better supported: RIFF/ID3v2.3

There is also some inconsistency in storing the ID3v2.3 chunk: some applications use 'id3 ', others use 'ID3 '. This is how some applications handle the RIFF/WAV metadata:

Mp3Tag v2.8 Windows 10 Explorer Foobar 1.3.14 Windows 10 Explorer Media Player
Reads RIFF/LIST-INFO
Writes RIFF/LIST-INFO does not write does not write
Writes RIFF/ID3v2 does not write does not write
Reads RIFF/'id3 '/ID3v2
Reads RIFF/'ID3 '/ID3v2
Writes RIFF/ID3v2 chunk-id 'ID3 ' does not write 'id3 ' does not write

Notice that I did not find a single application that was able to read from the RIFF/INFO tag, although it is apparently written in addition to the ID3v2.3 chunk.
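
To check which of these containers a given file actually carries, a top-level chunk walk along these lines could be used. This is a standard-library sketch under stated assumptions: `find_metadata_chunks` is a hypothetical helper, the file object must be positioned at offset 0, and both chunk-id spellings are accepted by comparing case-insensitively.

```python
import struct


def find_metadata_chunks(fileobj):
    """Return (kind, chunk_id, offset, size) for each metadata chunk found."""
    riff, riff_size, _form = struct.unpack("<4sI4s", fileobj.read(12))
    if riff != b"RIFF":
        raise ValueError("not a RIFF file")
    end = 8 + riff_size  # the RIFF size field excludes its own 8-byte header
    found = []
    while fileobj.tell() + 8 <= end:
        offset = fileobj.tell()
        header = fileobj.read(8)
        if len(header) < 8:
            break  # truncated file
        chunk_id, size = struct.unpack("<4sI", header)
        if chunk_id.upper() == b"ID3 ":  # matches both 'id3 ' and 'ID3 '
            found.append(("ID3v2", chunk_id, offset, size))
        elif chunk_id == b"LIST" and fileobj.read(4) == b"INFO":
            found.append(("LIST/INFO", chunk_id, offset, size))
        # Jump to the next chunk; odd sizes are followed by a pad byte.
        fileobj.seek(offset + 8 + size + (size & 1))
    return found
```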

Useful stuff:

lazka commented 7 years ago

Looks like there are two options (which can be combined):

[edit: oops misread.. ignore previous]

This is how some applications handle the RIFF/WAV metadata:

Cool, thanks for testing all those.

Last time I checked for AVI, I think I remember that VLC and Windows Explorer did read the INFO block there. But I might be misremembering, or it only does so for AVI and not for WAVE.

Ignoring the INFO chunk seems fine for now..

Borewit commented 7 years ago

Since metadata in combination with RIFF/WAVE is known to be tricky, this was just to get a general understanding of how other applications deal with it. The ID3v2 chunk in particular is arguable from a standardization point of view, but it looks like this is what most implementations use.

akirayamamoto commented 6 years ago

Are there any other alternatives for writing ID3 and RIFF/WAVE metadata? I tried using bwfmetaedit before or after Mutagen with no success. If I run bwfmetaedit last, it hangs; if I run Mutagen last, I get no metadata at all. Thanks.

Borewit commented 6 years ago

Did you try Mp3Tag already?

akirayamamoto commented 6 years ago

I need a library or a command-line tool that runs in bash on Linux. I tried kid3-cli, but I can't make it run without the X server. Even this command-line version needs an X server to run (weird). As an alternative, I am forking this project to write ID3 tags in WAV files: https://github.com/jhorology/gulp-maschine-id3

My fork (nothing committed yet): https://github.com/akirayamamoto/gulp-wav-id3

wsngamerz commented 6 years ago

Sorry for asking but is there any ETA for #321 ?

Borewit commented 6 years ago

Sorry for asking but is there any ETA for #321 ?

The short answer is no. In principle it is implemented and I would love to finish it, but I am stuck on a failing unit test. I don't understand what exactly the test is about, or how to isolate its execution so I can debug it.

Ref: https://github.com/quodlibet/mutagen/pull/321#issuecomment-369696142

postlund commented 3 years ago

I guess this is related to #538?

phw commented 1 year ago

This was also discussed in #559.

Overall, regarding the issue here, the proposed patch by @postlund looks mostly good for read support. Encoding support could maybe be improved.

Also this could be extended with some ideas from https://github.com/metabrainz/picard/blob/master/picard/formats/wav.py#L39-L219 for more supported tags and write support.

StealthyExpertX commented 1 year ago

Is .wav support a thing? If so, what is the correct import for using mutagen as a module?

I wanted to be able to do something like "from mutagen.mp3 import MP3" but the wav equivalent.

phw commented 1 year ago

@StealthyExpertX WAVE in general is supported. The easiest way to load a file into mutagen independent of format is:

from mutagen import File

file_path = "./test.wav"
f = File(file_path)

For WAVE, f will be of type mutagen.wave.WAVE. You can of course also instantiate this class directly. See also the documentation at https://mutagen.readthedocs.io/en/latest/api/wave.html .

Regarding tags, mutagen supports ID3 tags embedded as a RIFF chunk. This is also supported by some media players, e.g. foobar2000, Media Monkey and MP3Tag can read and write those tags. See also the table further up in this ticket.

What is not supported right now is reading or writing tags inside an INFO chunk, as supported e.g. by Windows. Adding support for this is what this issue is about.

It's a rather limited format (rather restricted set of tags, no support for different character encodings), but often the only tagging supported for WAVE files.