mkb79 / audible-cli

A command-line interface for the audible package. With the CLI you can download your Audible books, covers, and chapter files.
GNU Affero General Public License v3.0

Fix badly encoded characters in metadata #177

Closed vwkd closed 3 months ago

vwkd commented 6 months ago

This is an attempt to fix badly encoded characters in the AAX/C metadata. I'm not sure what encoding the AAX/C format uses for metadata and what badly encoded characters Audible has throughout its library. Hence this is currently a limited "find and replace" for those characters I've encountered. Please feel free to add more if you find them.

Specifically, this currently fixes:

EDIT: This doesn't seem to work yet. Not sure why. The updated metadata should be written to the temporary metadata file. I suspected that ffmpeg, the .m4b format, or the file metadata doesn't support Unicode, but one of my audiobooks already has the correct copyright character, which suggests that is not the issue.
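For readers following along, the "find and replace" idea described above could look roughly like the sketch below. The replacement table is hypothetical: the two entries are common UTF-8-read-as-Windows-1252 artifacts chosen for illustration, not the list from this PR, and the function names `fix_metadata_text` and `fix_metadata_file` are made up here. It also assumes the temporary metadata file is a UTF-8 text file.

```python
from pathlib import Path

# Hypothetical table, for illustration only: the characters this PR
# actually handles are not reproduced here.
REPLACEMENTS = {
    "\u00c2\u00a9": "\u00a9",        # "Â©"  -> "©"
    "\u00e2\u20ac\u2122": "\u2019",  # "â€™" -> "’"
}


def fix_metadata_text(text, replacements=REPLACEMENTS):
    """Return text with each known bad sequence replaced by its intended character."""
    for bad, good in replacements.items():
        text = text.replace(bad, good)
    return text


def fix_metadata_file(path):
    """Rewrite a UTF-8 metadata file in place; return True if anything changed."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    fixed = fix_metadata_text(original)
    if fixed != original:
        path.write_text(fixed, encoding="utf-8")
    return fixed != original
```

A table like this only repairs the sequences it lists, so it has to grow with every new bad character that turns up.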

mkb79 commented 6 months ago

The metadata is written back as UTF-8. Maybe that is the wrong encoding. I'll check this and report back.

FYI: The metadata extracted using ffmpeg does not contain the full metadata. If you compare the output from ffmpeg -i {AAXC-FILE} -f ffmetadata meta-ffmetadata.txt and mediainfo {AAXC-FILE} > meta-mediainfo.txt you can see the difference.
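The comparison above can be scripted. The sketch below is one possible way to do it, assuming `ffmpeg` and `mediainfo` are both on the PATH. The helper names (`ffmetadata_command`, `mediainfo_command`, `dump_both`) are hypothetical and not part of audible-cli. The commands and output file names are the ones quoted in the comment.

```python
import subprocess
from pathlib import Path


def ffmetadata_command(aaxc_file, out_file):
    """Build the ffmpeg command that dumps metadata in ffmetadata format."""
    return ["ffmpeg", "-i", str(aaxc_file), "-f", "ffmetadata", str(out_file)]


def mediainfo_command(aaxc_file):
    """Build the mediainfo command; its report is printed to stdout."""
    return ["mediainfo", str(aaxc_file)]


def dump_both(aaxc_file, out_dir="."):
    """Write both metadata dumps next to each other and return their paths."""
    out_dir = Path(out_dir)
    ff_out = out_dir / "meta-ffmetadata.txt"
    mi_out = out_dir / "meta-mediainfo.txt"

    # ffmpeg writes the output file itself.
    subprocess.run(ffmetadata_command(aaxc_file, ff_out), check=True)

    # mediainfo prints to stdout, so capture it and save it as UTF-8
    # (this replaces the shell redirection "> meta-mediainfo.txt").
    result = subprocess.run(
        mediainfo_command(aaxc_file),
        check=True,
        capture_output=True,
        encoding="utf-8",
        errors="replace",
    )
    mi_out.write_text(result.stdout, encoding="utf-8")
    return ff_out, mi_out
```

Calling `dump_both("book.aaxc")` (the file name is a placeholder) leaves two text files that can then be compared side by side.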

vwkd commented 3 months ago

I haven’t figured out how to make it work yet.

Also, I’m thinking it may not be a good idea to start fixing Audible’s mistakes, as that will lead to ever-increasing complexity without fixing the root cause. Instead, the right thing is to report the mistakes to Audible until they implement a fix at the source. Alternatively, one might choose to accept these imperfections in Audible’s metadata.