amugofjava / podcast_search

A simple library providing programmatic access to the iTunes search API for podcasts.
MIT License

Duplicate podcasts in response #19

Closed jennie-metacast closed 5 months ago

jennie-metacast commented 6 months ago

Description of bug

Hi, thank you for providing this package! It is very handy. I have noticed that, in some cases, the response from podcast_search contains duplicate podcasts that are not present in the iTunes API response.

Example with podcast_search

Request:

Search().search("Startup Therapy")

Response: [image: Screenshot 2024-03-07 at 2 57 28 PM]

Duplicates (with different collectionId and trackId): [image: Screenshot 2024-03-07 at 2 59 57 PM]

Example with curl & iTunes API

Request:

curl -X GET \
  "https://itunes.apple.com/search?entity=podcast&term=Startup%20Therapy"

Response:

{
  "resultCount": 3,
  "results": [
    {
      "wrapperType": "track",
      "kind": "podcast",
      "collectionId": 1450325643,
      "trackId": 1450325643,
      "artistName": "Startups.com",
      "collectionName": "Startup Therapy",
      "trackName": "Startup Therapy",
      "collectionCensoredName": "Startup Therapy",
      "trackCensoredName": "Startup Therapy",
      "collectionViewUrl": "https://podcasts.apple.com/us/podcast/startup-therapy/id1450325643?uo=4",
      "feedUrl": "https://feeds.transistor.fm/startup-therapy",
      "trackViewUrl": "https://podcasts.apple.com/us/podcast/startup-therapy/id1450325643?uo=4",
      "artworkUrl30": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts115/v4/f5/ea/ec/f5eaec73-76fb-f051-2b9e-6ac3f32ad2df/mza_12740194947879630385.jpg/30x30bb.jpg",
      "artworkUrl60": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts115/v4/f5/ea/ec/f5eaec73-76fb-f051-2b9e-6ac3f32ad2df/mza_12740194947879630385.jpg/60x60bb.jpg",
      "artworkUrl100": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts115/v4/f5/ea/ec/f5eaec73-76fb-f051-2b9e-6ac3f32ad2df/mza_12740194947879630385.jpg/100x100bb.jpg",
      "collectionPrice": 0.0,
      "trackPrice": 0.0,
      "collectionHdPrice": 0,
      "releaseDate": "2024-03-04T10:00:00Z",
      "collectionExplicitness": "notExplicit",
      "trackExplicitness": "cleaned",
      "trackCount": 248,
      "trackTimeMillis": 1912,
      "country": "USA",
      "currency": "USD",
      "primaryGenreName": "Business",
      "contentAdvisoryRating": "Clean",
      "artworkUrl600": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts115/v4/f5/ea/ec/f5eaec73-76fb-f051-2b9e-6ac3f32ad2df/mza_12740194947879630385.jpg/600x600bb.jpg",
      "genreIds": ["1321", "26"],
      "genres": ["Business", "Podcasts"]
    },
    {
      "wrapperType": "track",
      "kind": "podcast",
      "collectionId": 1049614170,
      "trackId": 1049614170,
      "artistName": "Kyle Meades",
      "collectionName": "Speech Therapy Private Practice Startup Podcast",
      "trackName": "Speech Therapy Private Practice Startup Podcast",
      "collectionCensoredName": "Speech Therapy Private Practice Startup Podcast",
      "trackCensoredName": "Speech Therapy Private Practice Startup Podcast",
      "collectionViewUrl": "https://podcasts.apple.com/us/podcast/speech-therapy-private-practice-startup-podcast/id1049614170?uo=4",
      "feedUrl": "https://www.privateslp.com/feed/podcast/",
      "trackViewUrl": "https://podcasts.apple.com/us/podcast/speech-therapy-private-practice-startup-podcast/id1049614170?uo=4",
      "artworkUrl30": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts115/v4/3b/68/6e/3b686ede-640e-b4c5-414d-1a518743f1c6/mza_11185660219265413148.jpg/30x30bb.jpg",
      "artworkUrl60": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts115/v4/3b/68/6e/3b686ede-640e-b4c5-414d-1a518743f1c6/mza_11185660219265413148.jpg/60x60bb.jpg",
      "artworkUrl100": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts115/v4/3b/68/6e/3b686ede-640e-b4c5-414d-1a518743f1c6/mza_11185660219265413148.jpg/100x100bb.jpg",
      "collectionPrice": 0.0,
      "trackPrice": 0.0,
      "collectionHdPrice": 0,
      "releaseDate": "2020-06-14T17:47:00Z",
      "collectionExplicitness": "notExplicit",
      "trackExplicitness": "cleaned",
      "trackCount": 50,
      "trackTimeMillis": 2292,
      "country": "USA",
      "currency": "USD",
      "primaryGenreName": "Business",
      "contentAdvisoryRating": "Clean",
      "artworkUrl600": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts115/v4/3b/68/6e/3b686ede-640e-b4c5-414d-1a518743f1c6/mza_11185660219265413148.jpg/600x600bb.jpg",
      "genreIds": ["1321", "26", "1304", "1500"],
      "genres": ["Business", "Podcasts", "Education", "Self-Improvement"]
    },
    {
      "wrapperType": "track",
      "kind": "podcast",
      "collectionId": 1491615628,
      "trackId": 1491615628,
      "artistName": "Startups.com",
      "collectionName": "Startup Therapy",
      "trackName": "Startup Therapy",
      "collectionCensoredName": "Startup Therapy",
      "trackCensoredName": "Startup Therapy",
      "collectionViewUrl": "https://podcasts.apple.com/us/podcast/startup-therapy/id1491615628?uo=4",
      "feedUrl": "https://feeds.transistor.fm/startup-therapy",
      "trackViewUrl": "https://podcasts.apple.com/us/podcast/startup-therapy/id1491615628?uo=4",
      "artworkUrl30": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts113/v4/86/7b/2e/867b2eb1-be68-6db2-b77b-4b16e8971297/mza_17708346996430747403.jpg/30x30bb.jpg",
      "artworkUrl60": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts113/v4/86/7b/2e/867b2eb1-be68-6db2-b77b-4b16e8971297/mza_17708346996430747403.jpg/60x60bb.jpg",
      "artworkUrl100": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts113/v4/86/7b/2e/867b2eb1-be68-6db2-b77b-4b16e8971297/mza_17708346996430747403.jpg/100x100bb.jpg",
      "collectionPrice": 0.0,
      "trackPrice": 0.0,
      "collectionHdPrice": 0,
      "releaseDate": "2024-03-04T10:00:00Z",
      "collectionExplicitness": "notExplicit",
      "trackExplicitness": "cleaned",
      "trackCount": 248,
      "trackTimeMillis": 1912,
      "country": "USA",
      "currency": "USD",
      "primaryGenreName": "Business",
      "contentAdvisoryRating": "Clean",
      "artworkUrl600": "https://is1-ssl.mzstatic.com/image/thumb/Podcasts113/v4/86/7b/2e/867b2eb1-be68-6db2-b77b-4b16e8971297/mza_17708346996430747403.jpg/600x600bb.jpg",
      "genreIds": ["1321", "26"],
      "genres": ["Business", "Podcasts"]
    }
  ]
}

amugofjava commented 6 months ago

Hi @jennie-metacast,

Thanks for letting me know about this. This one is a bit odd.

The URL podcast_search generates is:

https://itunes.apple.com/search?term=Startup%20Therapy&explicit=No&media=podcast&entity=podcast

Running curl with the above URL does return 4 results, with Startup Therapy appearing twice with different collection IDs; however, if you search for Startup Therapy using Apple Podcasts it does indeed appear twice in the results:

[image: screenshot of the Apple Podcasts search results]

Removing media=podcast from the query (thereby defaulting to all) removes the duplicate from the results, but I'm not sure why, as both collections are "kind": "podcast".

I'll reach out to this podcast and ask if they know why they appear twice on Apple Podcasts.
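
For anyone who wants to compare the two query variants discussed above without hand-editing URLs, here is a minimal Python sketch. The helper name `build_search_url` is hypothetical and not part of podcast_search; the parameters are the ones visible in the URL quoted above. It only builds the URLs, and fetching them (for example with curl) is left to the reader.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://itunes.apple.com/search"


def build_search_url(term, include_media=True):
    """Build an iTunes search URL (hypothetical helper, for illustration only)."""
    params = {"term": term, "explicit": "No"}
    if include_media:
        # This is the parameter whose removal made the duplicate disappear.
        params["media"] = "podcast"
    params["entity"] = "podcast"
    # quote_via=quote encodes spaces as %20 rather than '+', matching the URL above.
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


with_media = build_search_url("Startup Therapy")
without_media = build_search_url("Startup Therapy", include_media=False)
print(with_media)
print(without_media)
```

Requesting both URLs and comparing the `collectionId` values in each response should show whether the duplicate still depends on `media=podcast`.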

amugofjava commented 5 months ago

I did reach out to the hosts of this podcast, but I have not heard back. As they have been on the Metacast podcast, I wondered whether you might have a better route to contact them. I would be interested to know what they say.

As this is a problem with duplicates within Apple Connect, rather than podcast_search, I will close this issue.
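
Until the duplicate listing is resolved on Apple's side, a caller could filter results client-side. In the JSON response posted earlier in this thread, both Startup Therapy entries share the same `feedUrl`, so one possible workaround is to keep only the first result per feed. The following is a minimal Python sketch, assuming `feedUrl` is an acceptable identity key for your use case; the function name `dedupe_by_feed_url` is hypothetical, and the sample data is trimmed from the response above.

```python
def dedupe_by_feed_url(results):
    """Keep the first result for each feedUrl (hypothetical helper)."""
    seen = set()
    unique = []
    for item in results:
        key = item.get("feedUrl")
        # Entries without a feedUrl are always kept: there is nothing to compare.
        if key is None or key not in seen:
            unique.append(item)
            if key is not None:
                seen.add(key)
    return unique


# Trimmed copies of the three results from the curl response above.
results = [
    {"collectionId": 1450325643, "collectionName": "Startup Therapy",
     "feedUrl": "https://feeds.transistor.fm/startup-therapy"},
    {"collectionId": 1049614170,
     "collectionName": "Speech Therapy Private Practice Startup Podcast",
     "feedUrl": "https://www.privateslp.com/feed/podcast/"},
    {"collectionId": 1491615628, "collectionName": "Startup Therapy",
     "feedUrl": "https://feeds.transistor.fm/startup-therapy"},
]

unique = dedupe_by_feed_url(results)
print([item["collectionId"] for item in unique])  # [1450325643, 1049614170]
```

Keying on `feedUrl` rather than `collectionId` is the design choice here: the two listings have different collection IDs, so only the shared feed identifies them as the same show.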