advplyr / audiobookshelf

Self-hosted audiobook and podcast server
https://audiobookshelf.org
GNU General Public License v3.0
6.69k stars, 472 forks

[Bug] Embed audio file not embedding ASIN, series and series sequence #1794

Closed jeff47 closed 1 year ago

jeff47 commented 1 year ago

Describe the feature/enhancement

I have a number of files where the asin tag is present, but when I go into Match, it defaults to the Title instead. It seems more logical to use the ASIN first, if it exists, and then use the title as a fallback.
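The fallback order requested above can be sketched as follows. This is only an illustration of the idea, not Audiobookshelf code; `pick_match_query` and the returned dictionary shape are hypothetical names chosen for the example.

```python
from typing import Optional


def pick_match_query(title: str, asin: Optional[str] = None) -> dict:
    """Prefer the ASIN when one is present; otherwise fall back to the title."""
    cleaned = (asin or "").strip()
    if cleaned:
        # An ASIN identifies one specific item, so it is tried first.
        return {"field": "asin", "value": cleaned}
    # No usable ASIN: search by title instead.
    return {"field": "title", "value": title.strip()}
```

A blank or missing ASIN is treated the same way, so a file with an empty tag still falls back to a title search.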

advplyr commented 1 year ago

This should be how it already works. What is your provider set to?

advplyr commented 1 year ago

I just tested again and this is working, but let me know how I can reproduce it.

jeff47 commented 1 year ago

Audible.

I've attached an example. Tone finds the ASIN in the m4b file, but the ASIN doesn't seem to get parsed by Audiobookshelf.

[Two screenshots attached]
tone dump "Winfrey, Oprah - The Path Made Clear.m4b" 
── /media/audiobooks/Winfrey, Oprah/The Path Made Clear/Winfrey, Oprah - The Path Made Clear.m4b ─────────────────────────────────

                   properties                    
┌───────────────────┬───────────────────────────┐
│            format │ MPEG-4 Part 14: audio/mp4 │
│           bitrate │ 76                        │
│       sample-rate │ 44100                     │
│          duration │ 02:55:09.974              │
│               vbr │ True                      │
│          channels │ 2 (Stereo (2/0.0))        │
│ embedded pictures │ 1                         │
│     1 meta format │ Native / MPEG-4           │
└───────────────────┴───────────────────────────┘
                            embedded pictures                             
┌───────────────────┬─────────┬─────────────┬────────────┬───────────────┐
│          position │ type    │ description │ mimetype   │ size          │
├───────────────────┼─────────┼─────────────┼────────────┼───────────────┤
│                 1 │ Generic │             │ image/jpeg │ 1259796 bytes │
└───────────────────┴─────────┴─────────────┴────────────┴───────────────┘
                                                        metadata                                                        
┌───────────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────┐
│             genre │ Biographies & Memoirs/Politics & Social Sciences/Relationships, Parenting & Personal Development │
│            artist │ Oprah Winfrey                                                                                    │
│      album-artist │ Oprah Winfrey                                                                                    │
│          narrator │ Oprah Winfrey, full cast                                                                         │
│          composer │ Oprah Winfrey, full cast                                                                         │
│         publisher │ Macmillan Audio                                                                                  │
│             album │ The Path Made Clear                                                                              │
│             title │ The Path Made Clear                                                                              │
│          subtitle │ Discovering Your Life's Direction and Purpose                                                    │
│      track-number │ 1                                                                                                │
│       track-total │ 1                                                                                                │
│   publishing-date │ 01/01/2019                                                                                       │
│     encoding-tool │ Lavf58.45.100                                                                                    │
│ itunes-media-type │ 2 (Audiobook)                                                                                    │
│   itunes-play-gap │ 1 (NoGap)                                                                                        │
└───────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────┘
    additional metadata fields    
┌───────────────────┬────────────┐
│              asin │ 1250317029 │
└───────────────────┴────────────┘
advplyr commented 1 year ago

ABS uses ffprobe to get the meta tags from audio files. Can you try ffprobe?

jeff47 commented 1 year ago

Ah, I assumed since tone was used to embed the metadata, it was used to read it as well! My mistake.

ffprobe doesn't reveal any ASIN tag set.

However, I don't see an ASIN in other files using ffprobe where ABS does have them set. Do I need a different parameter to extract them?

ffprobe "Winfrey, Oprah - The Path Made Clear.m4b" 
ffprobe version 5.1.2-3ubuntu1 Copyright (c) 2007-2022 the FFmpeg developers
  built with gcc 12 (Ubuntu 12.2.0-14ubuntu2)
  configuration: --prefix=/usr --extra-version=3ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55aba35ab9c0] stream 0, timescale not set
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Winfrey, Oprah - The Path Made Clear.m4b':
  Metadata:
    major_brand     : M4A 
    minor_version   : 512
    compatible_brands: M4A isomiso2
    title           : The Path Made Clear
    artist          : Oprah Winfrey
    album           : The Path Made Clear
    composer        : Oprah Winfrey, full cast
    comment         : "Fans of Oprah Winfrey's TV show, or more recently her two podcasts, will have this title on their playlist for months." — AudioFile Magazine, Earphones Award winner  Everyone has a purpose. And, according to Oprah Winfrey, “Your real job in life is 
    genre           : Biographies & Memoirs/Politics & Social Sciences/Relationships, Parenting & Personal Development
    description     : "Fans of Oprah Winfrey's TV show, or more recently her two podcasts, will have this title on their playlist for months." — AudioFile Magazine, Earphones Award winner  Everyone has a purpose. And, according to Oprah Winfrey, “Your real job in life is 
    album_artist    : Oprah Winfrey
    track           : 1/1
    encoder         : Lavf58.45.100
    media_type      : 2
    SUBTITLE        : Discovering Your Life's Direction and Purpose
    gapless_playback: 1
  Duration: 02:55:09.97, start: 0.000000, bitrate: 79 kb/s
  Chapters:
    Chapter #0:0: start 0.000000, end 51.744000
      Metadata:
        title           : Opening Credits
    Chapter #0:1: start 51.744000, end 294.010000
      Metadata:
        title           : Introduction
    Chapter #0:2: start 294.010000, end 1076.940000
      Metadata:
        title           : Chapter 1: The Seeds
    Chapter #0:3: start 1076.940000, end 2140.228000
      Metadata:
        title           : Chapter 2: The Roots
    Chapter #0:4: start 2140.228000, end 3186.612000
      Metadata:
        title           : Chapter 3: The Whispers
    Chapter #0:5: start 3186.612000, end 4367.996000
      Metadata:
        title           : Chapter 4: The Clouds
    Chapter #0:6: start 4367.996000, end 5684.474000
      Metadata:
        title           : Chapter 5: The Map
    Chapter #0:7: start 5684.474000, end 6309.276000
      Metadata:
        title           : Chapter 6: The Road
    Chapter #0:8: start 6309.276000, end 7369.731000
      Metadata:
        title           : Chapter 7: The Climb
    Chapter #0:9: start 7369.731000, end 8298.947000
      Metadata:
        title           : Chapter 8: The Give
    Chapter #0:10: start 8298.947000, end 9382.065000
      Metadata:
        title           : Chapter 9: The Reward
    Chapter #0:11: start 9382.065000, end 10335.197000
      Metadata:
        title           : Chapter 10: Home
    Chapter #0:12: start 10335.197000, end 10461.000000
      Metadata:
        title           : Epilogue
    Chapter #0:13: start 10461.000000, end 10509.000000
      Metadata:
        title           : End Credits
  Stream #0:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 75 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](und): Data: bin_data (text / 0x74786574) (default)
    Metadata:
      creation_time   : 2023-05-23T21:00:46.000000Z
      handler_name    : Chapter titles
  Stream #0:2[0x0]: Video: mjpeg (Baseline), yuvj420p(pc, bt470bg/unknown/unknown), 2400x2400 [SAR 1:1 DAR 1:1], 90k tbr, 90k tbn (attached pic)
Unsupported codec with id 98314 for input stream 1
advplyr commented 1 year ago

I'm not sure how tone is setting the asin tag, maybe @sandreas could help out. In this comment he had AUDIBLE_ASIN picked up by ffprobe. https://github.com/advplyr/audiobookshelf/issues/787#issuecomment-1532282307

jeff47 commented 1 year ago

Hmm. Using that ffprobe command line is helpful. Even after matching with an ASIN, and seeing it in ABS on the book info page and in the tone dump, I do not see it in ffprobe. I do not believe ffprobe supports those tags.

ffprobe -v quiet -print_format json -show_format "Winfrey, Oprah - The Path Made Clear.m4b" 
{
    "format": {
        "filename": "Winfrey, Oprah - The Path Made Clear.m4b",
        "nb_streams": 3,
        "nb_programs": 0,
        "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
        "format_long_name": "QuickTime / MOV",
        "start_time": "0.000000",
        "duration": "10509.974000",
        "size": "104260634",
        "bit_rate": "79361",
        "probe_score": 100,
        "tags": {
            "major_brand": "M4A ",
            "minor_version": "512",
            "compatible_brands": "M4A isomiso2",
            "title": "The Path Made Clear",
            "artist": "Oprah Winfrey",
            "album": "The Path Made Clear",
            "composer": "Oprah Winfrey, full cast",
            "comment": "\"Fans of Oprah Winfrey's TV show, or more recently her two podcasts, will have this title on their playlist for months.\" — AudioFile Magazine, Earphones Award winner  Everyone has a purpose. And, according to Oprah Winfrey, “Your real job in life is to figure out as soon as possible what that is, who you are meant to be, and begin to honor your calling in the best way possible.” That journey starts right here. In her latest audiobook, The Path Made Clear, Oprah shares what she sees as a guide for activating your deepest vision of yourself, offering the framework for creating not just a life of success, but one of significance. The audiobook’s 10 chapters are organized to help you recognize the important milestones along the road to self-discovery, laying out what you really need in order to achieve personal contentment and what life’s detours are there to teach us. Oprah opens each chapter by sharing her own key lessons and the personal stories that helped set the course for her best life. She then brings together wisdom and insights from luminaries in a wide array of fields, inspiring listeners to consider what they’re meant to do in the world and how to pursue it with passion and focus. These renowned figures share the greatest lessons from their own journeys toward a life filled with purpose. The Path Made Clear provides listeners with a valuable resource for achieving a life lived in service of your calling - whatever it may be. This program is read by Adyashanti, Alanis Morrissette, Amy Purdy, Barbara Brown Taylor, Bishop T. D. Jakes, Brene Brown, Brian Grazer, Brother David Steindl-Rast, Bryan Stevenson, Carole Bayer Sager, Caroline Myss, Charles Eisenstein, Cheryl Strayed, Cicely Tyson, Cindy Crawford, Dani Shapiro, Daniel Pink, David Brooks, Debbie Ford, Deepak Chopra, Dr. Shefali Tsabary, Eckhart Tolle, Elizabeth Gilbert, Elizabeth Lesser, Ellen Degeneres, Fr. 
Richard Rohr, Gabrielle Bernstein, Gary Zukav, Glennon Doyle, Goldie Hawn, India.Arie, Iyanla Vanzant, Jack Canfield, Jane Fonda, Janet Mock, Jay-Z, Jean Houston, Jeff Weiner, Vice President Joe Biden, Joel Osteen, US Congressman John Lewis, Jon Bon Jovi, Jon Kabat-Zinn, Jordan Peele, Kerry Washington, Lin-Manuel Miranda, Lynne Twist, Marianne Williamson, Mark Nepo, Michael Bernard Beckwith, Michael Singer, Mindy Kaling, Mitch Albom, Nate Berkus, Pastor A. R. Bernard, Pema Chodron, President Jimmy Carter, Rev. Ed Bacon, Rob Bell, Robin Roberts, RuPaul Charles, Sarah Ban Breathnach, Shauna Niequist, Shawn Achor, Shonda Rhimes, Sidney Poitier, Sister Joan Chittister, Stephen Colbert, Sue Monk Kidd, T. D. Jakes, Thich Nhat Hanh, Thomas Moore, Tim Storey, Tracey Jackson, Tracy McMillan, Tracy Morgan, Trevor Noah, Wes Moore, William Paul Young, and Wintley Phipps. ",
            "genre": "Biographies & Memoirs/Politics & Social Sciences/Relationships, Parenting & Personal Development",
            "description": "\"Fans of Oprah Winfrey's TV show, or more recently her two podcasts, will have this title on their playlist for months.\" — AudioFile Magazine, Earphones Award winner  Everyone has a purpose. And, according to Oprah Winfrey, “Your real job in life is to figure out as soon as possible what that is, who you are meant to be, and begin to honor your calling in the best way possible.” That journey starts right here. In her latest audiobook, The Path Made Clear, Oprah shares what she sees as a guide for activating your deepest vision of yourself, offering the framework for creating not just a life of success, but one of significance. The audiobook’s 10 chapters are organized to help you recognize the important milestones along the road to self-discovery, laying out what you really need in order to achieve personal contentment and what life’s detours are there to teach us. Oprah opens each chapter by sharing her own key lessons and the personal stories that helped set the course for her best life. She then brings together wisdom and insights from luminaries in a wide array of fields, inspiring listeners to consider what they’re meant to do in the world and how to pursue it with passion and focus. These renowned figures share the greatest lessons from their own journeys toward a life filled with purpose. The Path Made Clear provides listeners with a valuable resource for achieving a life lived in service of your calling - whatever it may be. This program is read by Adyashanti, Alanis Morrissette, Amy Purdy, Barbara Brown Taylor, Bishop T. D. Jakes, Brene Brown, Brian Grazer, Brother David Steindl-Rast, Bryan Stevenson, Carole Bayer Sager, Caroline Myss, Charles Eisenstein, Cheryl Strayed, Cicely Tyson, Cindy Crawford, Dani Shapiro, Daniel Pink, David Brooks, Debbie Ford, Deepak Chopra, Dr. Shefali Tsabary, Eckhart Tolle, Elizabeth Gilbert, Elizabeth Lesser, Ellen Degeneres, Fr. 
Richard Rohr, Gabrielle Bernstein, Gary Zukav, Glennon Doyle, Goldie Hawn, India.Arie, Iyanla Vanzant, Jack Canfield, Jane Fonda, Janet Mock, Jay-Z, Jean Houston, Jeff Weiner, Vice President Joe Biden, Joel Osteen, US Congressman John Lewis, Jon Bon Jovi, Jon Kabat-Zinn, Jordan Peele, Kerry Washington, Lin-Manuel Miranda, Lynne Twist, Marianne Williamson, Mark Nepo, Michael Bernard Beckwith, Michael Singer, Mindy Kaling, Mitch Albom, Nate Berkus, Pastor A. R. Bernard, Pema Chodron, President Jimmy Carter, Rev. Ed Bacon, Rob Bell, Robin Roberts, RuPaul Charles, Sarah Ban Breathnach, Shauna Niequist, Shawn Achor, Shonda Rhimes, Sidney Poitier, Sister Joan Chittister, Stephen Colbert, Sue Monk Kidd, T. D. Jakes, Thich Nhat Hanh, Thomas Moore, Tim Storey, Tracey Jackson, Tracy McMillan, Tracy Morgan, Trevor Noah, Wes Moore, William Paul Young, and Wintley Phipps. ",
            "album_artist": "Oprah Winfrey",
            "track": "1/1",
            "encoder": "Lavf58.45.100",
            "media_type": "2",
            "SUBTITLE": "Discovering Your Life's Direction and Purpose",
            "gapless_playback": "1"
        }
    }
}
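For anyone repeating this check on many files, a minimal sketch of scanning that JSON for an ASIN-like tag is shown here. It assumes the input is the text produced by `ffprobe -v quiet -print_format json -show_format`; the function name `find_asin` and the list of tag names it looks for are assumptions made for the example, not something ABS is confirmed to use.

```python
import json
from typing import Optional

# Tag names to look for (assumed for this sketch), compared case-insensitively.
ASIN_TAG_NAMES = ("asin", "audible_asin")


def find_asin(ffprobe_json: str) -> Optional[str]:
    """Return the first ASIN-like tag value in ffprobe's JSON output, or None."""
    data = json.loads(ffprobe_json)
    tags = data.get("format", {}).get("tags", {})
    # ffprobe may report keys in mixed case (e.g. SUBTITLE), so normalize them.
    lowered = {key.lower(): value for key, value in tags.items()}
    for name in ASIN_TAG_NAMES:
        value = lowered.get(name, "").strip()
        if value:
            return value
    return None
```

For the output above, this would return `None`, since the `tags` object contains no ASIN-like key.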
sandreas commented 1 year ago

> I'm not sure how tone is setting the asin tag, maybe @sandreas could help out. In this comment he had AUDIBLE_ASIN picked up by ffprobe. https://github.com/advplyr/audiobookshelf/issues/787#issuecomment-1532282307

@advplyr @jeff47

tone does not set the field ASIN automatically... Depending on which format you use (e.g. mp3 or m4a), you have to use AdditionalFields for this, see here: https://github.com/sandreas/tone/blob/c3ab8c49019afd6f057a55fa5b6ac0ecb22ee789/tone/Metadata/Taggers/M4BFillUpTagger.cs#L37

You can do this with custom JavaScript taggers as well.

I personally use metadata.AdditionalFields["----:com.pilabor.tone:AUDIBLE_ASIN"] for my own purposes, which ffprobe detects as AUDIBLE_ASIN (without the ----:com.pilabor.tone: prefix). I don't use ASIN, because it is a term for Amazon items. While Audible uses the same format for its IDs, the ASINs on Audible and Amazon are not identical for the same product (I think this is to prevent scraping the websites). Therefore I use AUDIBLE_ASIN so as not to pollute the ASIN tag.
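To make the naming concrete, here is a small sketch that splits such a key into its parts, which is how the long key relates to the short name ffprobe prints. The helper name `split_freeform_key` is hypothetical; only the `----:com.pilabor.tone:AUDIBLE_ASIN` key itself comes from this thread.

```python
from typing import Optional, Tuple

FREEFORM_MARKER = "----"


def split_freeform_key(key: str) -> Optional[Tuple[str, str]]:
    """Split '----:<namespace>:<name>' into (namespace, name).

    Returns None when the key is not in that three-part form.
    """
    parts = key.split(":", 2)
    if len(parts) != 3 or parts[0] != FREEFORM_MARKER:
        return None
    _, namespace, name = parts
    return namespace, name
```

With the key above, the second element of the result is `AUDIBLE_ASIN`, the name that shows up in ffprobe's tag list.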

jeff47 commented 1 year ago

> I personally use metadata.AdditionalFields["----:com.pilabor.tone:AUDIBLE_ASIN"] for my own purposes, which ffprobe detects as AUDIBLE_ASIN (without the ----:com.pilabor.tone: prefix). I don't use ASIN, because it is a term for Amazon items. While Audible uses the same format for its IDs, the ASINs on Audible and Amazon are not identical for the same product (I think this is to prevent scraping the websites). Therefore I use AUDIBLE_ASIN so as not to pollute the ASIN tag.

I don't see AUDIBLE_ASIN in the ffprobe output for files where I've used ABS to embed metadata, and if I remove a properly tagged file from ABS and reimport it, ABS does not populate the ASIN field even when tone displays it on the CLI.

That's the basis for my question -- is this the expected behavior? I would have assumed that if an ASIN tag was found in the metadata for a file, it would be used to improve the match and in my experience, this is not happening.

sandreas commented 1 year ago

> I don't see AUDIBLE_ASIN in the ffprobe output for files where I've used ABS to embed metadata, and if I remove a properly tagged file from ABS and reimport it, ABS does not populate the ASIN field even when tone displays it on the CLI.

@jeff47 That may happen if ABS tags files via tone AdditionalFields but ffprobe is not able to show them.

tone can still show the tags because they are in the file, but ffprobe does NOT show them. After reimporting, the ABS cache may be gone, and the tag cannot be determined because ffprobe is used to read it.

@advplyr So this MAY be a problem in ABS, but you have to investigate this... If it is something in tone feel free to open an issue.

advplyr commented 1 year ago

I had to make some updates to get the embed tool to work for ASIN, series, and series sequence.

For mpeg4 audio files it will use:
----:com.pilabor.tone:SERIES for series name
----:com.pilabor.tone:PART for series sequence
----:com.pilabor.tone:AUDIBLE_ASIN for asin

I tested this on an m4a and was able to properly embed those and the scanner read them.

I also added a check for whether the series sequence is an integer. If it is not an integer, MP3 files will add PART in additionalFields. I tested all of these on MP3 as well.
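As a rough illustration of the rules described in this comment (not the actual ABS code), the field selection might look like the sketch below. The function names and the dictionary return shape are hypothetical. The thread does not say how MP3 files store the series name, the ASIN, or an integer sequence, so the sketch deliberately leaves those cases out.

```python
from typing import Dict, Optional

PREFIX = "----:com.pilabor.tone:"


def is_integer_sequence(sequence: str) -> bool:
    """True for values like '2'; False for '1.5', '2a', or an empty string."""
    text = sequence.strip()
    return text.isascii() and text.isdigit()


def additional_fields(
    is_mpeg4: bool,
    series: Optional[str],
    sequence: Optional[str],
    asin: Optional[str],
) -> Dict[str, str]:
    """Build the extra tag fields described above for a single audio file."""
    fields: Dict[str, str] = {}
    if is_mpeg4:
        # mpeg4 audio: all three values go into prefixed fields.
        if series:
            fields[PREFIX + "SERIES"] = series
        if sequence:
            fields[PREFIX + "PART"] = sequence
        if asin:
            fields[PREFIX + "AUDIBLE_ASIN"] = asin
    elif sequence and not is_integer_sequence(sequence):
        # MP3: only a non-integer sequence is written to PART here.
        fields["PART"] = sequence
    return fields
```

For example, an MP3 with sequence `1.5` gets a `PART` entry, while an MP3 with sequence `2` gets no additional field from this sketch.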

advplyr commented 1 year ago

Fixed in v2.2.21

jeff47 commented 1 month ago

It's been a while since I checked on this, but I don't think the audible_asin tag is being embedded. I don't see it showing up when I check with "tone dump" after doing the quick embed function.

Are all the metadata fields to be embedded listed on this page? audible_asin doesn't show up there, although it is present on the main book page.

[Screenshot attached]