advplyr / audiobookshelf

Self-hosted audiobook and podcast server
https://audiobookshelf.org
GNU General Public License v3.0
6.6k stars 466 forks

[Enhancement]: Better Ebook handling and conversion #2467

Open FlyinPancake opened 10 months ago

FlyinPancake commented 10 months ago

Describe the feature/enhancement

Hi!

I really like this project, and would like to ditch calibre-web from my stack.

abs works splendidly for audiobooks, and it could be wonderful for ebooks as well!

For my use case, custom metadata providers are a must. My friend and I are developing one that would be compatible with abs. We would like to implement it as a REST API that exposes the metadata abs requires, and to collaborate on an OpenAPI spec for that API, which could then be used as a custom provider. I will link the spec here as we come up with it. This should be really simple, since it would be a generalized provider that the user can specify.

I love abs's send-to-ereader ability, but as Kindle dropped support for mobi, it is tedious to manually convert and upload each ebook as epub. To solve this, ebook-converter could be integrated similarly to tone. The second issue is file size for email attachments: right now abs tries to send even 40 MB PDFs as attachments, which inevitably fail. Kindle supports .zip files for this reason, and this functionality could be incorporated as well.

I know that this is a community-run project, and I am ready to be a part of development efforts (despite my laughable JS skills). advplyr made a great tool for us data hoarders to admire and manage our piles of data, and I am grateful for its existence <3

This feature request is a big one and serves more as a wishlist than as a set of actual requests. Thanks for coming to my TED talk.

advplyr commented 9 months ago

Hey, I would be interested in hearing more about the custom metadata provider you are working on. Are you aggregating metadata from different sources or putting together your own metadata? Are you hosting this publicly so it could be integrated as a metadata provider for everyone in Abs?

An ebook converter is outside the scope of Abs right now, but maybe sometime in the future or as a plugin. Just as an aside, Tone is going to be removed in an upcoming version, so the only external dependencies we will have to deal with are ffmpeg/ffprobe.

We can definitely support zipping large ebooks before sending to Kindle. We would also need to check if the zipped file still exceeds 50MB.
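The zip-then-recheck flow described above could look roughly like the sketch below. It is written in Python for illustration only, not taken from Abs's code; `prepare_attachment` and `DEFAULT_MAX_ATTACHMENT_BYTES` are hypothetical names, and the default limit simply mirrors the 50MB figure mentioned here.

```python
import os
import zipfile

# Assumed default, taken from the 50MB figure in this thread; a real
# implementation would likely read it from a setting.
DEFAULT_MAX_ATTACHMENT_BYTES = 50 * 1024 * 1024


def prepare_attachment(ebook_path, max_bytes=DEFAULT_MAX_ATTACHMENT_BYTES):
    """Return the path to attach, zipping the ebook only if it is too large.

    Raises ValueError if even the zipped file exceeds max_bytes.
    """
    if os.path.getsize(ebook_path) <= max_bytes:
        return ebook_path  # small enough, send as-is

    zip_path = ebook_path + ".zip"
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name inside the archive, not the full path.
        zf.write(ebook_path, arcname=os.path.basename(ebook_path))

    # Zipping may not help much (PDFs and EPUBs are often already compressed),
    # so the size has to be checked again.
    if os.path.getsize(zip_path) > max_bytes:
        os.remove(zip_path)
        raise ValueError("zipped ebook still exceeds the attachment limit")
    return zip_path
```

Passing `max_bytes` explicitly keeps the limit adjustable per mail provider instead of hard-coding a single value.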

FlyinPancake commented 9 months ago

Hi!

We are currently working on our API, and then we will start tackling the integration with ABS. The provider is currently hosted in a private repo, as the source very much forbids scraping. We developed a scraper + cache setup to avoid getting IP banned. I plan on open-sourcing the cache and API part, but keeping the scraper private so that it does not get taken down. That way, anyone who wants a custom metadata source can implement a single Rust trait and have the rest taken care of.
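To illustrate the split between a pluggable source and a shared cache, here is a minimal sketch. It is in Python rather than Rust, and it is not the actual implementation: `CachedProvider`, its `ttl_seconds` parameter, and the idea of keying the cache on a normalized query are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachedProvider:
    """Wrap a search function so repeated queries do not hit the source again."""

    def __init__(
        self,
        search: Callable[[str], List[dict]],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._search = search      # the source-specific part (e.g. a scraper)
        self._ttl = ttl_seconds    # how long a cached result stays valid
        self._clock = clock        # injectable so expiry can be tested
        self._cache: Dict[str, Tuple[float, List[dict]]] = {}

    def search(self, query: str) -> List[dict]:
        key = query.strip().lower()  # hypothetical normalization of the query
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # fresh cache entry, skip the source
        matches = self._search(query)
        self._cache[key] = (now, matches)
        return matches
```

Only the function passed as `search` depends on the source, which is the role the single trait would play in the Rust version.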

We are still iterating on the API spec. As of now it is closest to Audible's, but we'd love to get your input on what would be best for this. Our current solution is a single endpoint that returns the matches for a query. Currently a match object looks like this:

{
    "title": "Six of Crows",
    "subtitle": null,
    "author": "Leigh Bardugo",
    "narrator": null,
    "publisher": "Henry Holt",
    "published_year": "2015",
    "description": "Ketterdam: a bustling hub of international trade where anything can be had for the right price--and no one knows that better than criminal prodigy Kaz Brekker. Kaz is offered a chance at a deadly heist that could make him rich beyond his wildest dreams. But he can't pull it off alone…\nA convict with a thirst for revenge.\nA sharpshooter who can't walk away from a wager.\nA runaway with a privileged past.\nA spy known as the Wraith.\nA Heartrender using her magic to survive the slums. \nA thief with a gift for unlikely escapes. \nSix dangerous outcasts. One impossible heist. Kaz's crew is the only thing that might stand between the world and destruction—if they don't kill each other first.",
    "cover": "PROVIDER_BASEURL/system/covers/big/covers_337930.jpg?1423850927",
    "isbn": null,
    "asin": "B00UG9LC4I",
    "genres": null,
    "tags": [],
    "language": null,
    "duration": null
}

Of course, returning a list of narrators and authors would be nice if it is easy to implement in ABS. We will add series to the object too; it's just not implemented yet.
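As a rough sketch of how a consumer might read the match object above, the Python below maps it onto a typed record and treats `null` fields as empty. The names `BookMatch` and `parse_match` are hypothetical, only a subset of the fields is shown, and splitting a single `author` string on commas is purely an assumption, since the current object carries one string per field.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BookMatch:
    title: str
    authors: List[str] = field(default_factory=list)
    narrators: List[str] = field(default_factory=list)
    published_year: Optional[str] = None
    asin: Optional[str] = None
    genres: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


def _split_names(value):
    """Turn null, "A, B" or ["A", "B"] into a list of names (assumed format)."""
    if value is None:
        return []
    if isinstance(value, str):
        return [part.strip() for part in value.split(",") if part.strip()]
    return list(value)


def parse_match(raw: dict) -> BookMatch:
    """Build a BookMatch from one match object; only title is required."""
    return BookMatch(
        title=raw["title"],
        authors=_split_names(raw.get("author")),
        narrators=_split_names(raw.get("narrator")),
        published_year=raw.get("published_year"),
        asin=raw.get("asin"),
        genres=raw.get("genres") or [],  # null and missing both become []
        tags=raw.get("tags") or [],
    )
```

Accepting either a string or a list in `_split_names` would let the same parser keep working if the API later returns lists of authors and narrators.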

To sum it up

As for public hosting, I am not comfortable hosting it, since it clearly violates the EULA for the provider. However, the eventual OSS base could be hosted as a sidecar for ABS, hence our goal of a backend-agnostic custom metadata provider + API.


The book converter function would be really handy for me. If there is a plugin system, I'd be happy to help, but as I said before, I am not the go-to guy for frontend and JS.


Just a note that the file size limit should be configurable, as many email providers limit the attachment size.

FlyinPancake commented 9 months ago

Here is the current OpenAPI schema:

openapi: 3.0.0
servers: 
  - url: https://example.com
    description: Local server 
info:
  license: 
    name: MIT
    url: https://opensource.org/licenses/MIT

  title: Custom Metadata Provider
  version: 0.1.0
security:
  - api_key: []

paths:
  /search:
    get:
      description: Search for books
      operationId: search
      summary: Search for books
      security: 
        -  api_key: []
      parameters:
        - name: query
          in: query
          required: true
          schema:
            type: string
        - name: author
          in: query
          required: false
          schema:
            type: string
      responses:
        "200":
          description: OK
          content:
            application/json:
              schema:
                type: object
                properties:
                  matches:
                    type: array
                    items:
                      $ref: "#/components/schemas/BookMetadata"
        "400":
          description: Bad Request
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: string
        "401":
          description: Unauthorized
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: string
        "500":
          description: Internal Server Error
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: string
components:
  schemas:
    BookMetadata:
      type: object
      properties:
        title:
          type: string
        subtitle:
          type: string
        author:
          type: string
        narrator:
          type: string
        publisher:
          type: string
        published_year:
          type: string
        description:
          type: string
        cover:
          type: string
          description: URL to the cover image
        isbn:
          type: string
          format: isbn
        asin:
          type: string
          format: asin
        genres:
          type: array
          items:
            type: string
        tags:
          type: array
          items:
            type: string
        language:
          type: string
        duration:
          type: integer
          format: int64
          description: Duration in seconds
      required: 
        -  title
  securitySchemes:
    api_key:
      type: apiKey
      name: AUTHORIZATION
      in: header
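
A client for the schema above might be sketched as follows, in Python using only the standard library. This is an illustration under assumptions, not ABS code: `build_search_request` and `parse_search_response` are hypothetical helpers, and the base URL and key are placeholders.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(base_url, api_key, query, author=None):
    """Build a GET /search request with the API key in the AUTHORIZATION header."""
    params = {"query": query}
    if author:
        params["author"] = author  # optional parameter in the schema
    url = base_url.rstrip("/") + "/search?" + urllib.parse.urlencode(params)
    # urllib normalizes the header name's casing; HTTP header names are
    # case-insensitive, so the server sees the same header either way.
    return urllib.request.Request(url, headers={"AUTHORIZATION": api_key}, method="GET")


def parse_search_response(status, body):
    """Return the list of matches, or raise with the server's error message."""
    payload = json.loads(body)
    if status != 200:
        # The 400/401/500 responses all carry an {"error": "..."} object.
        message = payload.get("error", "unknown error")
        raise RuntimeError(f"search failed ({status}): {message}")
    return payload.get("matches", [])
```

Sending the request is left out on purpose; something like `urllib.request.urlopen(request)` would do it, with the status code and body then passed to `parse_search_response`.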

FlyinPancake commented 9 months ago

We began work on adding custom metadata providers (CMP) to the UI over in my fork.

Dewyer commented 9 months ago

I opened a pull request with some code. It implements a basic version of the UI for managing the providers, and a very basic implementation.