dbry / WavPack

WavPack encode/decode library, command-line programs, and several plugins
BSD 3-Clause "New" or "Revised" License
355 stars 65 forks source link

'--import-id3' does not import all (potentially) compatible tags #69

Closed remenor closed 5 months ago

remenor commented 5 years ago

Not all "equivalent" tags are imported by compressing dsf (id3v2.3) to wavpack (ape). It is always possible to make a script with wvtag but this causes the function "--import-id3" to be wasted. Example (there are 9 tags and only 7 are imported)

`metadsf -t 01* TIT2=Partita No. 1: I. Praeludium TPE1=Glenn Gould TALB=Bach: Partitas (Complete) (SACD1) TRCK=1 TCOM=Johann Sebastian Bach TPUB=Sony Music TCON=Barroco COMM=COMMENT TDRC=1959

wavpack --import-id3 01*

WAVPACK Hybrid Lossless Audio Compressor Linux Version 5.1.0 Copyright (c) 1998 - 2017 David Bryant. All Rights Reserved.

successfully imported 7 items from ID3v2 tag

created 01 - Partita No. 1_ I. Praeludium.wv in 6.60 secs (lossless, 46.61%)

wvtag -l *.wv

WVTAG WavPack Metadata Tagging Utility Linux Version 5.1.0 Copyright (c) 2017 David Bryant. All Rights Reserved.

APEv2 tag items: 7 (270 bytes used) Title: Partita No. 1: I. Praeludium Artist: Glenn Gould Album: Bach: Partitas (Complete) (SACD1) Track: 1 Composer: Johann Sebastian Bach Genre: Barroco Year: 1959`

In this case the tags are missing: COMM > Comment TPUB > Publisher

These are the ones I usually use. It may be that there are others that have been lost Thanks!

dbry commented 5 years ago

Hi, and thanks for reporting this!

I think that I didn't include TPUB because it only listed for ID3v2.4, but it's certainly harmless to put in (and I just did).

The COMM thing is a little more complicated because it has a different syntax than all the others, but I'll take a look (should not be too bad).

One thing weird is that your example shows TDRC which is also a ID3v2.4 thing (with a rigidly defined format, I think), but I have never handled that one, so I'm not sure how it got read. Perhaps metadsf showed it when it was really TYER?

Anyway, thanks again!

ArminiusTux commented 4 years ago

Greetings,

adding support for ID3 v2.4 tags from DSF files would be welcomed by users of this tool. @EuFlo has revamped the ID3 tagging for sacd_extract's DSF output into:

  • (id3tag=1) in ID3V2.3 version with full metadata. String encodings is UTF-16 with BOM (UCS-2);
  • (id3tag=2) in ID3V2.3 version with minimal medatadata*. String encodings is UTF-16 with BOM (UCS-2);
  • (id3tag=3) in ID3V2.3 version with full metadata. String encodings is ISO-8859-1 (ASCII);
  • (id3tag=4) in ID3V2.4 version with full metadata. String encodings is UTF-8;
  • (id3tag=5) in ID3V2.4 version with minimal metadata*. String encodings is UTF-8;
  • (id3tag=0) ID3tag will not be created (for testing and experimental purpose).

ID3-tags created by mode no. 1, 2 or 3 are successfully imported by wavpack

DSF files tagged by mode 4 (=default) & 5 cause an import failure, with a different severity by version:

conversion fails - no file created: wavpack -f -v -m --import-id3 track.dsf

WAVPACK Hybrid Lossless Audio Compressor Linux Version 5.1.0 Copyright (c) 1998 - 2017 David Bryant. All Rights Reserved.

ID3v2 import: not valid ID3v2.3

conversion OK, ID3tag is ignored and not imported to APEv2 in wv output:

./wavpack -f -v -m --import-id3 track.dsf 
 WARNING: WAVPACK using libwavpack version 5.3.2, expected 5.2.0 (see README)
original md5 signature: bf9f26894b6cb683719f0381bf513dba                                
created (and verified) track.wv in 19.51 secs (lossless, 36.22%) `
dbry commented 4 years ago

@ArminiusTux Thanks for letting me know about this. I would suggest that these users simply use id3tag=1 if they plan on compressing with WavPack (and in general, support for ID3v2.3 is more universal than ID3v2.4). I noticed that the default format was changed to id3tag=4, but I'm not sure what the reason for that was unless it's the shorter tags to avoid glitches in defective players. There really isn't any other advantage for ID3v2.4 as far as I can tell.

When I allowed ID3v2 tags to be read from all files (instead of just DSF) I changed the error behavior for finding a missing or bad tag (which is what ID3v2.4 is, at least for now).

I have worked on adding ID3v2.2 support based on some old files I have, and I may look into adding ID3v2.4 support, but I'm not sure when that will be. The ID3 spec is not that clear and the formats sometimes seem arbitrarily different. In fact, the one patch I made to sacd_extract was to fix a bug related to a misinterpretation of the ID3v2 spec, and I still have a hack in my ID3v2.3 parser to accept those bad tags!

BTW, I noticed that you are getting the warning about the libwavpack library version not matching the application. I run into this often too when testing out various versions, and have found that the best way to avoid it is making static builds (./configure --enable-static --disable-shared). When you get that error the programs will probably still work, but nothing is guaranteed.

EuFlo commented 4 years ago

Dear David, The reasons I changed to ID3v2.4 format as a default format are:

dbry commented 4 years ago

Hi @EuFlo , thanks for responding, and thanks for your work on sacd_extract!

Interesting about the Oppo. I have verified that my BDP-103 also does not correctly handle ISO-8859-1 encoding; the codes above 0x80 show as Asian characters. That's a bug.

However, I don't see any difference between ID3v2.3 + UTF-16 and ID3v2.4 + UTF-8 on my player, and that makes sense because the supported characters are the same in either encoding and it's a trivial function to convert back and forth (no special library required).

As for your other reasons I don't feel any of them really apply here. The syncsafe feature is only to prevent problems with MPEG parsers; it shouldn't make any difference with DSF because the placement of the entire tag is known and I can't imagine a DSF player looking for MPEG headers. Western text is smaller with UTF-8 but Asian text is smaller with UTF-16, but in either case the size of a few text strings compared to the size of a typical DSD file is nothing. I understand that UTF-8 is used in Linux and is first mentioned in the ID3v2.4 spec, but like I said above UTF-8 and UTF-16 are really pretty easy to interchange.

My other hesitation about ID3v2.4 is summarized nicely in this post on stack overflow, and even the Wikipedia page discusses the problems with ID3v2.4 (such as the broken Windows support). But as long as you continue to allow users to write ID3v2.3 with UTF-16, WavPack will still work correctly.

Thanks again!

evan314159 commented 2 years ago

A tangible advantage to ID3 2.4 is that it standardises storage of multiple values in a tag in an unambiguous way, albeit one that can have sharp edges for other reasons. In version 2.4, the separator is globally defined as null, whereas in version 2.3 the separator (where defined) is "/". Consider the impact for e.g. artist = AC/DC.

evan314159 commented 2 years ago

It might be worth using a standard ID3 processing library so you don't need to implement all the wonders of ID3 yourself. I noted when I tried to convert my DSF library to WV that the Artist tags were not processed correctly, partly because it looks like MP3Tag uses null separation even on ID3 2.3 files and wavpack doesn't seem to find any Artist as a result, but even if that worked wavpack --import-id3 does not support multi-value Artists.

evan314159 commented 2 years ago

I spoke with the MP3Tag author and use of null in ID3 2.3 tags is intentional to avoid the AC/DC problem, and apparently well-supported by applications generally, which just reinforces the mess that is ID3 but also that ID3 2.4 support is worthwhile.

dbry commented 2 years ago

Thanks for the comments @evan314159. In the time since this thread was opened I have decided to go ahead and implement ID3v2.4 in the next release, along with AIFF file support. It shouldn't be too big of a deal and I'd much rather not create another library dependency, especially after all the work I put into v2.3. I just need to make sure and do a lot of fuzz testing...

dbry commented 1 year ago

https://github.com/dbry/WavPack/commit/1bca05f36863a13c8d3f1d5579b2fc25959b15c9

dbry commented 1 year ago

@remenor @EuFlo @ArminiusTux

I know it took a long time, but I finally added ID3v2.4 tag import support to WavPack and have created a set of evaluation Windows binaries and posted them on HA (of course on Linux it's easiest to just build from source). Any testing or comments are greatly appreciated.

Hydrogen Audio thread with binaries