joalla / discogs_client

Continuation of the "Official Python Client for the Discogs API"
https://python3-discogs-client.readthedocs.io

Documentation Clarification Request: Master vs Main Release vs Release vs Track #125

Open dsm-72 opened 1 year ago

dsm-72 commented 1 year ago

Hey! I am new to discogs_client and the API. I am trying to learn how to use it to wrangle together discography information. Fundamentally, I would like to get all songs from a given artist, but working with deezer, spotipy, lyricsgenius, ytmusicapi and BeautifulSoup has quickly taught me that this data is messy and poorly standardized no matter where I look, e.g. (Trackname about Thing vs Trackname About Thing vs Trackname (About Thing) vs Trackname - About Thing). While this can be mitigated by thoroughly scrubbing and normalizing names, it gets hard because not everyone separates what is a remix, cover, alternate version, extended version, etc. All of this makes mapping songs to albums, and even finding the base version of a song, more difficult, especially when combining APIs.
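As a rough illustration of the name scrubbing mentioned above, here is a minimal sketch that maps the four spellings from the example onto one comparison key. The function name `normalize_title` is hypothetical, and the rules (case folding, dropping brackets, dropping spaced dashes) are assumptions that cover only these variants.

```python
import re


def normalize_title(title: str) -> str:
    """Collapse common spelling variants of a track title into one comparison key."""
    key = title.casefold()
    # Treat round and square brackets as plain spaces.
    key = re.sub(r"[()\[\]]", " ", key)
    # Treat a dash surrounded by whitespace as a separator, not as part of a word.
    key = re.sub(r"\s+-\s+", " ", key)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(key.split())


variants = [
    "Trackname about Thing",
    "Trackname About Thing",
    "Trackname (About Thing)",
    "Trackname - About Thing",
]
keys = {normalize_title(v) for v in variants}
```

A key like this deliberately does not separate remixes, covers, or extended versions; that distinction still has to come from somewhere else.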

I think Discogs has a nice approach. To my understanding (from the official API documentation):

Release: The Release resource represents a particular physical or digital object released by one or more Artists.

Master Release: The Master resource represents a set of similar Releases. Masters (also known as "master releases") have a "main release" which is often the chronologically earliest.

Master Release Versions: Retrieves a list of all Releases that are versions of this master. Accepts Pagination parameters.

Artist Releases: Returns a list of Releases and Masters associated with the Artist.

So regardless of whether it is an EP, single, or album, when a generic entity is released for the first time, a master gets created to represent that unique entity, and a "main release" is immediately created and tethered to it, representing the first release (when the master later gets new versions, they appear in its versions list). Accordingly, a single (or a standalone track), even if it is a master, has a tracklist that is a list of one element containing track data (which is mostly empty). Notably, track objects (in your API) do not have an id property.

So my question: if I want to get the full discography from Discogs, should the process look something like this?

  1. Get all artist alias objects. Artist aliases have a different id and return a different set of releases!
  2. Get all releases (master or otherwise) for all artist aliases (main included)
  3. Recursively follow all release links (master --> main_release, master --> versions) to make sure we aren't missing any (and that Discogs doesn't accidentally have missed entries in the DB), and keep track of the link tree
  4. Create a messy graph mapping "master" tracks to all their versions (including main release) each of which has 1 or more album/releases?

e.g.

| Track | Version | Album | Artists |
|-------|---------|-------|---------|
| T1    | T1v1    | EP    | A1      |
| T1    | T1v1    | EP    | A2      |
| T1    | T1v1    | Alb1  | A1      |
| T1    | T1v1    | Alb1  | A2      |
| T1    | T1ext   | Alb1  | A1      |
| T1    | T1ext   | Alb1  | A2      |

So in this example there is a "core" track T1 with two versions, T1v1 and T1ext; T1 has two artists and appears on two different albums.
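The four steps and the table above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the library's recommended method: the helper names `iter_release_versions` and `discography_rows` are hypothetical, and the code only relies on attribute names that discogs_client objects expose (`releases`, `main_release`, `versions`, `id`, `title`, `artists`, `tracklist`, `name`). It assumes that only Master objects have a `versions` attribute. The stand-in objects at the bottom exist only so the sketch runs offline.

```python
from types import SimpleNamespace as NS


def iter_release_versions(item):
    """Yield every concrete release behind one entry of artist.releases."""
    if hasattr(item, "versions"):  # assumption: only masters expose .versions
        yield item.main_release
        yield from item.versions
    else:
        yield item


def discography_rows(artists):
    """Build (track title, release title, release id, artist name) rows.

    `artists` is the main artist plus all of its aliases. A release that is
    reachable through several artists or masters is visited only once.
    """
    seen = set()
    rows = []
    for artist in artists:
        for item in artist.releases:
            for release in iter_release_versions(item):
                if release.id in seen:
                    continue
                seen.add(release.id)
                for track in release.tracklist:
                    # One row per credited release artist, as in the table.
                    for credited in release.artists:
                        rows.append(
                            (track.title, release.title, release.id, credited.name)
                        )
    return rows


# Offline stand-ins shaped like the objects described above (not real API data).
a1, a2 = NS(name="A1"), NS(name="A2")
ep = NS(id=10, title="EP", artists=[a1, a2], tracklist=[NS(title="T1")])
alb1 = NS(id=11, title="Alb1", artists=[a1, a2],
          tracklist=[NS(title="T1"), NS(title="T1 ext")])
master = NS(main_release=ep, versions=[ep, alb1])
artist = NS(name="A1", releases=[master, alb1])

rows = discography_rows([artist])
```

With the real client, the input would presumably be something like `[artist, *artist.aliases]` for an `artist` obtained from `Client.artist(...)`. Reading `tracklist` on each version is likely to trigger one request per release, so rate limits matter for large discographies. Grouping rows into "core" tracks and versions (T1 vs T1v1 vs T1ext) is not handled here, since track objects carry no id.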

Question: How do I tell if a release is a single vs an EP? I have a master track that has a trackless of just one. Do I just compare titles? Or do I look at release.formats

[{'name': 'CDr', 'qty': '1', 'descriptions': ['Single', 'Promo']}]

and check if 'Single' is in descriptions?
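A minimal sketch of that check, using the formats value shown above. The helper names are hypothetical, and the candidate labels "Single", "EP" and "Album" are an assumption about which description strings Discogs uses; releases with missing or empty format data simply yield `None`.

```python
def format_descriptions(formats):
    """Collect the description strings from all format entries of a release."""
    found = set()
    for fmt in formats or []:
        # Some entries may lack the 'descriptions' key entirely.
        found.update(fmt.get("descriptions") or [])
    return found


def release_kind(formats):
    """Return 'Single', 'EP' or 'Album' if the formats say so, else None."""
    descriptions = format_descriptions(formats)
    for kind in ("Single", "EP", "Album"):
        if kind in descriptions:
            return kind
    return None


formats = [{'name': 'CDr', 'qty': '1', 'descriptions': ['Single', 'Promo']}]
kind = release_kind(formats)
```

When `release_kind` returns `None`, falling back to the tracklist length or a title comparison is still an option.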

Not every release object (master / main / version) has a "type" (<-- if master) or "type_" (<--- if track in a trackless).

JOJ0 commented 1 year ago

Hi, I agree that Discogs is probably the only place on digital earth to find a complete discography. There are loads of other useful resources that have strengths of their own, but completeness usually is not one of them ;-)

I'm not sure how I could currently help you here. It might help if you spin up a REPL session that I could "copypastingly" follow along with, to learn which information you are looking for and which you can't find or were expecting in certain places. We constantly try to improve our docs and it's very much appreciated that you try to help us with that :+1:

Fun fact from my end: I thought there were master releases and "regular" releases derived from them. I was not aware of the term "main release". Whoops! Thanks for teaching me that!

JOJ0 commented 1 year ago

trackless is tracklist right?

.formats sounds right for finding out what type of release it is.

doesn't every release have the .formats property? does it happen that it is missing/empty? is the reason maybe that data is incomplete on discogs?

not sure about the type object, would need to see code and test. some examples in repl might help. cheers!

dsm-72 commented 1 year ago

@JOJ0 Sure. I am working on a Python package which combines a few APIs (primarily your implementation of Discogs) with some utility functions to try and wrangle all of this together. I think my use case is a bit niche though. I just want:

Works (higher level / meta version of a track). Example: "Example Song"
Tracks (implementations of the works). Example: "Example Song", "Example Song Extended", "Example Song Remix"
Albums (singles/EPs inclusive, i.e. how the track was released). Example: "Example Song EP", "Some Album with Example Song"
Artists (who wrote/produced/sang on a track / album)

This is similar to the Master / Main Release vs Releases Discogs has, but not quite as complex.
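One possible way to write down that four-level model is with plain dataclasses. This is only a sketch of the structure described above; the class and field names are hypothetical and do not correspond to discogs_client classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Artist:
    name: str


@dataclass(frozen=True)
class Album:
    title: str
    kind: str  # how it was released, e.g. "Single", "EP" or "Album"


@dataclass
class Track:
    """One implementation of a work, e.g. the original, extended or remix."""
    title: str
    albums: List[Album] = field(default_factory=list)
    artists: List[Artist] = field(default_factory=list)


@dataclass
class Work:
    """The higher-level song that all of its tracks implement."""
    title: str
    tracks: List[Track] = field(default_factory=list)


artist = Artist("Example Artist")
ep = Album("Example Song EP", kind="EP")
album = Album("Some Album with Example Song", kind="Album")
work = Work("Example Song", tracks=[
    Track("Example Song", albums=[ep, album], artists=[artist]),
    Track("Example Song Extended", albums=[album], artists=[artist]),
    Track("Example Song Remix", albums=[ep], artists=[artist]),
])

# Every album on which some version of the work appears.
albums_for_work = sorted({a.title for t in work.tracks for a in t.albums})
```

In Discogs terms, a Work is loosely comparable to a master and an Album to a release, though the mapping is not one-to-one because masters group releases rather than individual tracks.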

JOJ0 commented 1 year ago

Hi and sorry for letting you hang there for a couple of weeks. I'm not sure how I can help and you've probably moved on with your interesting project anyway.

What definitely would help the project is if you'd suggest where in our docs we could add information/details you provided here.

I think, as the title and your first post suggest, you would like to improve the documentation around the terms master release, master release versions, artist releases and main release. Where in the docs would you see this information? If we find an appropriate place, we could link to the official Discogs API docs for a more detailed description, as well as link to the classes we/you found that provide that information.

Submitting a draft PR as a start would certainly be most helpful, but a pointer to the exact places in the docs where you would like to add it would be tremendously helpful too! I'll do the additions to the docs then.

Please help us out! Thanks, and I hope to hear from you soon! :-)