markjfine / nrsc5-dui

An enhanced, user-friendly version of nrsc5-gui that is not heavily dependent upon Python processing for audio generation.
GNU General Public License v3.0

Testing Notes 2 #6

Closed andrewfer000 closed 3 years ago

andrewfer000 commented 3 years ago

This is a new issue to start fresh from Testing Notes.

andrewfer000 commented 3 years ago

To me it's just a little wide.

markjfine commented 3 years ago

Re-thinking my idea for the Album Art redesign, because some of it doesn't really make sense.

You really can't put the callsign, slogan, or motto with the station logo/cover art because they're static across all of the streams. For example, it would look real silly having all that info for WISX surrounding the logo for Smooth Jazz JJZ (HD2).

What might make more sense is simply to add a line segment containing the stream name and a real small version of the logo above the cover art, but I'm not sure it's even worth the real estate. As long as you can load your own logos, you don't have all the dead space, which is fine.

andrewfer000 commented 3 years ago

                # If no match use the station logo if there is one
                if (not imgSaved):
                    #print("No image found, using logo")
                    self.coverImage = os.path.join(aasDir, self.stationLogos[self.stationStr][self.streamNum])
                    self.streamInfo['Album']=""

I think this should be changed to set the station-sent album artwork if MB fails instead of just using the station logo. I think the best way to do this is to get the newest downloaded artwork from the station and set that.

Also I made some modifications to the MusicBrainz stuff. If I notice better results I'll put them here since I have no idea how to make a diff or a pull request xD

markjfine commented 3 years ago

The problem with that is you have no guarantee that cover art exists at that point, or is sent by the station. Sometimes it's sent before the song starts, sometimes it isn't. If it did, it would override this anyway if the new option is selected. This was meant as a stopgap to make sure something was showing that wasn't the previous cover art.

markjfine commented 3 years ago

What you might want to do after you make your changes is to test it out on various stations that provide track info but may or may not have logos or cover art, and that differ in the timing of the XHDR messages which switch the covers and logos in and out. You'll see that what's there tries to create a happy medium in all those cases, because not every station operates the same way.

You'll also quickly find out that the results from MB are not very consistent and very dependent upon how things are spelled. So what may work in the last hour may not work the same way in the next hour.

markjfine commented 3 years ago

Haven't posted it yet, but have a working prototype metadata database for the cover images. When it saves a cover, it now saves a record that includes the title, artist, album, and genre (for when that starts getting populated). If the cover already exists, it pulls the album and genre from the database (it already has the title and artist). It saves the database on exit, and reads it in on startup.

Seems to work, but will wait a bit until I post it. One snag is the possibility of having previously downloaded images that aren't in the database yet. If that happens, click stop, delete the file, then click start so it downloads a new one and saves the metadata.
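A minimal sketch of the kind of cover metadata store described above, assuming a JSON file keyed by the cover's base file name. The file layout, the record order (title, artist, album, genre), and all function names here are illustrative assumptions, not the code that was eventually posted.

```python
import json
import os


def load_cover_metas(path):
    """Read the metadata database on startup; an absent file means an empty database."""
    if not os.path.isfile(path):
        return {}
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_cover_metas(path, metas):
    """Write the metadata database on exit."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(metas, fh, indent=2, ensure_ascii=False)


def remember_cover(metas, base_name, title, artist, album="", genre=""):
    """Store one record when a cover image is saved."""
    metas[base_name] = [title, artist, album, genre]


def recall_album_genre(metas, base_name):
    """Return (album, genre) for a cached cover, or None if it has no record."""
    record = metas.get(base_name)
    if record is None:
        return None
    return record[2], record[3]
```

A `None` result from `recall_album_genre` corresponds to the snag mentioned above: the image exists on disk but was downloaded before the database did, so there is nothing to pull the album and genre from.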

markjfine commented 3 years ago

Afterthought: You can easily edit the genre manually in the json if you had to.

markjfine commented 3 years ago

Found a good example where spelling is important: Kali Uchis - Telepatia won't match in MusicBrainz because they have an accent on the i in Telepatía, but the station doesn't.

StupidThingsILearnedDoingDataFusionInThe90s

Thinking aloud: Maybe there's a way you rectify that by downloading and setting the metadata manually (if it's that important). They do have a search facility. Maybe I set a button to spawn a web page with the manual query, you pick the one you want and import it kinda like the way I'm doing station logos, but saving the additional metadata as well.
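For the accent mismatch mentioned above, one possible mitigation on the comparison side (not something the thread confirms was adopted) is to fold accents and case on both strings before checking for a match. This sketch uses only the standard library; `fold` and `loose_contains` are hypothetical helper names.

```python
import unicodedata


def fold(text):
    """Strip combining accents and normalize case, so 'Telepatía' becomes 'telepatia'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def loose_contains(needle, haystack):
    """Accent- and case-insensitive substring test."""
    return fold(needle) in fold(haystack)
```

This only affects how a returned title is compared with the station's text. Whether the search itself finds the recording still depends on how the query is matched on the MusicBrainz side.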

markjfine commented 3 years ago

Leaving this for reference here. The search via website takes the form: https://musicbrainz.org/taglookup/index?tag-lookup.artist=<artist>&tag-lookup.track=<title> replace spaces with +
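As a small illustration of the URL form above, the query string can be built with the standard library, which also takes care of replacing spaces with `+`. The function name is an assumption for this sketch.

```python
from urllib.parse import urlencode

TAGLOOKUP_URL = "https://musicbrainz.org/taglookup/index"


def taglookup_url(artist, title):
    """Build the manual MusicBrainz tag lookup URL for an artist and track title."""
    # urlencode uses quote_plus by default, so spaces become '+'
    query = urlencode({"tag-lookup.artist": artist, "tag-lookup.track": title})
    return "{}?{}".format(TAGLOOKUP_URL, query)
```

A button handler could pass the result to `webbrowser.open()` to spawn the page for the manual query.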

andrewfer000 commented 3 years ago

While debugging, I noticed that sometimes I get this output in console where it finds a bunch of releases for a song but it does not apply the Album name or artwork for the song. This happens on both my testing code and the official code. I can't seem to think of a cause for this. Also, the album name only appears when the artwork is downloaded for the first time, but based on what I read here you are already working on that. I'm going to see what I can do about this issue here.

recording search succeeded
got recording search result with 25 recordings
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 2 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 2 releases
got release list with 1 releases
got release list with 4 releases
got release list with 1 releases
got release list with 2 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 3 releases
got release list with 2 releases
got release list with 1 releases

or

searching for Semisonic - Closing Time
recording search succeeded
got recording search result with 25 recordings
got release list with 1 releases
got release list with 1 releases
got release list with 3 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 3 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases

For reference, here is an example of a successful artwork download -

searching for Foo Fighters - Everlong
recording search succeeded
got recording search result with 18 recordings
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 2 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 1 releases
got release list with 4 releases
got release list with 1 releases
got release list with 2 releases
got release list with 39 releases
MusicBrainz image list retrieval error for id 560e3237-9cc5-488f-913c-a5f57050becb
Found Official: Track: Foo Fighters - Everlong, Album: Foo Fighters - The Colour and the Shape, c247e26f-7303-4046-9824-2bc6fda17ac2 99% 

Also, some things clearly exist in the database, for example

searching for Michigander - Better
recording search succeeded
got recording search result with 0 recordings

But it does have an entry in the database here -> https://musicbrainz.org/release/145bf06e-0f9b-4dcc-8f17-ab9609766a72/disc/1#63692819-629b-4259-ad24-71e6c6a27ccd

markjfine commented 3 years ago

I'm using specific parameters because what gets returned is not always the right thing.

For example, some of those releases may have been other than Album, Single, or EP. Making matters worse, some Compilations are coming back as Albums. The way I detect that condition is by looking at the Album Artist (resultArtist2) and seeing if it includes 'Various' (as in 'Various Artists'). There's also a check to see if it's a Bootleg, so I check to make sure it's listed as 'Official'.

Also, the default number of recordings is 25. It couldn't find anything that fits the criteria in the first 25 recordings. There's something that can be added to the initial query to bring that number up to about 40, however I've found that if you generally can't get something that fits in the first 25, you're also not going to get anything in 40.

Plus, just because there's a valid release that fits my criteria, it doesn't mean there's 'Approved' 'Front' artwork, or any artwork at all. That's why I have that additional check in-line to the release check, so it can go through all of the releases until it finds valid artwork.

You'll also notice that sometimes the query (like image-list) fails. Generally that means there's no metadata for it. There's a code that gets returned but at that point their reason for the error really doesn't matter, so I just ignore it.

If there were a way to do this using SQL I could create a join on the fly and get to the answer quicker, and with a whole lot less internet interaction, which is also a concern. But we're stuck doing individual json queries, not what they have on the website, which I'm sure is a whole lot more efficient.

I came to the conclusion a couple of days ago that there's no perfect answer to this given the tools we have available. You're welcome to continue to dig into it from a learning standpoint, but it's so complicated and erratic that I caution it may start to drive you crazy (like it almost did to me).

I just simply accepted that for now the way it's currently doing it is probably the best it's going to get for a consistently accurate answer, with zero false positives. And I used several sources for that: a hits station, an oldies station, a country station, a hip-hop station, as well as a Christian music station. I wish I had a station to test with that at least had the music I listen to. Instead I got everything else. lol

andrewfer000 commented 3 years ago

It looks like it's as good as it's going to get for now. I know your philosophy is different: you want things to work as perfectly as possible, and I am more of a something-is-better-than-nothing person. When a song is playing, I'd rather have album artwork, whether it be the official album (preferred) or an EP, Best Hits, or other compilation as the second (fallback) option.

markjfine commented 3 years ago

I understand. Maybe I can set another option in the interface where it sets Strict to False. You can try taking out the Strict=True part in the initial query to see if that's better for you. If it does, I'll add the gui option.

markjfine commented 3 years ago

Incidentally, I found out what's driving the display to be so wide... the spinners and entry box in Settings. There's no reason for it to be that wide when the numbers are only 1 or 2 digits in the spinners. Perhaps the FILL settings need to be dropped for that column of widgets completely, and that will fix it.

andrewfer000 commented 3 years ago

I tried that. It worked for some but then it breaks what worked before. It's so confusing. Maybe there could be a way where if and only if the Strict version does not return a result it will try to run a less strict version. If the less strict one fails, it will fall through like usual.

markjfine commented 3 years ago

Might take a look at that, though I'm concerned with the amount of internet hits. Eventually MusicBrainz will throttle the app based on excessive usage. Not so concerned about the recording and release queries, but there could potentially be a lot of image-list queries, as you've seen. Not a big deal on low-latency high-speed internet either, but given any lag time it will add up and make things very slow.

And I agree about it being confusing. Part of that is that the structures aren't guaranteed to include certain things, and the data itself isn't consistent. For example, the genre info is contained in the 'tags-list' - but it seems very few records are tagged or even have 'tags-list'. Then there's the duplication: The definition between what is a recording and what is a release is very fuzzy. To me a recording is a physical recording, which may or may not have several variations (releases) based on country, etc. So whoever designed this schema likely never made it very clear to the people populating it. I don't even want to go into the quality control issues that I discovered by looking at the raw data, which is a whole 'nother ball of wax.

andrewfer000 commented 3 years ago

It's not your fault it has to make so many internet hits, it's just how the API works. You gotta remember that this is just a "Wikipedia for Music"; not everything is going to be the way it should be. Neither you nor I can make MusicBrainz perfect.

markjfine commented 3 years ago

"It is what it is". I added the Strict option, set to On as default, provided the DL option is set. So, we'll see. Can never have too many levers to pull.

Am going to start posting my latest, since it looks like the cached album titles work... Just remember that it won't cache if you have existing album art, so you may want to move them off to the side if you want to keep them.

markjfine commented 3 years ago

done.

andrewfer000 commented 3 years ago

Is there a way to detect whether album art gets set or not? I wonder if there is a way to implement a system where it starts with the most strict queries and if that fails (on the occasions that it does) it goes to a less strict query (I found the best thing to do is take away the type=Album requirement), but I only want that to happen if the first query does not set an album art.

markjfine commented 3 years ago

If nothing is downloaded, imgSaved is False. You could take the whole thing from after try: imgSaved=False to just before # If no match use the station logo

Move that into a def block with the appropriate arguments and returning things like imgSaved, the ['Album'] and ['Genre'] and just call that twice: once with Strict set, then if it returns False, run it again with Strict set to False.

I would think you'd want to keep the type=Album requirement, but setting Strict=False makes the priority on title or artist rather than type anyway. At least that's what I've found.
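The two-pass idea above can be sketched as follows. The `lookup` callable stands in for the block that would be moved into a `def`; its signature and return shape `(img_saved, album, genre)` are assumptions for illustration only.

```python
def find_cover_with_fallback(lookup, artist, title):
    """Run the lookup with Strict on, and only if nothing was saved, again with Strict off.

    lookup(artist, title, strict) is a hypothetical callable that returns
    (img_saved, album, genre).
    Returns (img_saved, album, genre, strict_used); strict_used is None on failure.
    """
    for strict in (True, False):
        img_saved, album, genre = lookup(artist, title, strict)
        if img_saved:
            return True, album, genre, strict
    return False, "", "", None
```

When both passes fail, the caller falls through to the existing station-logo fallback, exactly as it does when `imgSaved` stays `False` today.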

markjfine commented 3 years ago

If you want to emulate that, run it with the Strict engaged. Wait until something comes up blank. Click Stop, change the Strict setting, then hit Play. That will trigger the search again when the track info updates.

markjfine commented 3 years ago

Added a new TODO 4.f for a more manual approach to this. Would prefer a drag/drop thing for the image URL, but that might take some doing.

andrewfer000 commented 3 years ago

After this update, things seem to be working better. I'll leave it alone for now but if I notice problems I'll start messing around with it again.

andrewfer000 commented 3 years ago

If you want to emulate that, run it with the Strict engaged. Wait until something comes up blank. Click Stop, change the Strict setting, then hit Play. That will trigger the search again when the track info updates.

I tried this and it works for more songs than you may think. I feel like the method of going from the strict to non-strict queries should be a part of the code since it is a good catch. If you don't want to implement it I'll try and do it, but it won't be pretty and you'll probably have to fix it up, but hey, I'll try. The main issue with the current implementation is that when strict is completely disabled, it breaks more than it fixes for most songs.

My end goal with this is to have every song have an album art from the database if possible, even if it's just from an iTunes special or compilation, yet ensure that the song gets the best and most accurate artwork available.

markjfine commented 3 years ago

Let me see what I can do

markjfine commented 3 years ago

Changed routine to run with Strict off if it first runs with it on and doesn't find anything. I've left the print statements uncommented so you can see what's going on underneath the hood.

Edit: Also noted that it was adding the Album data, but not the Genre (even if it came up blank), so I added that just in case.

andrewfer000 commented 3 years ago

I made some modifications to the code to add more filters. I am testing now and if all goes well I'll share it here. Also, I am having some issues with station art:

  1. If station art is downloaded from the internet, it will still default to the one the station set
  2. MusicBrainz searches occur every time the Title and Artist switch back to the station info. The best way to fix this is to set up downloaded logos like artwork, as in using the Artist_-_Title format for the logo file.

markjfine commented 3 years ago

  1. Yeah, this was really meant for stations that don't have logos. As you've seen, trying to roll your own logo could have implications.

  2. Actually, that could cause a problem because you won't be able to differentiate between logos and cover art, and I'm in the process of trying to separate the two.

Another issue is that not every station provides the same thing. Some run ads and the title/artist changes the text to match the ad.

Probably a better way to keep it from doing a search is to use the XHDR flag, and only do searches when lastXHDR is set for a cover, not a logo.

Having to take care of a few non-coding things atm, so may not get to adding this until a bit later.

andrewfer000 commented 3 years ago

Sounds good, take as long as you need. I never ran into a station that changes text for ads but I wouldn't be surprised if some stations did that. I am just trying to find solutions based on what I know, so you probably know a better solution.

As for my enhanced filter, it works well and is able to catch pretty much everything but I'm still testing it and like I said, I never used git before so I'll just post my code here when it's ready.

markjfine commented 3 years ago

Of course this XHDR thing only works if the station cooperates and sends the correct messages. If they change the text but XHDR doesn't change to "0", they're basically saying that they want the logo to show, not a cover.
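A rough sketch of the gating described above, assuming (from the comment just before this) that an XHDR value of "0" means the station wants a cover shown and any other value means a logo. The constant, the function name, and the extra empty-text check are illustrative, not taken from the app.

```python
COVER_XHDR = "0"  # assumption: "0" signals cover art, anything else a logo


def should_search_cover(last_xhdr, title, artist):
    """Only query MusicBrainz when the last XHDR message asked for a cover."""
    if last_xhdr != COVER_XHDR:
        return False
    # nothing to search for without both fields
    return bool(title) and bool(artist)
```

With this check in front of the search, text that changes while the station is still signalling a logo (station info, or an ad) would no longer trigger a lookup.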

andrewfer000 commented 3 years ago

My changes start at around line 434, at the "if not, get it from MusicBrainz" comment. You'll understand. If you have any issues I'll share my whole nrsc5-dui.py file. Overall the filters work. I've gotten most things with the 1st filter, but I got some others with filters 2, 5, 6, and 7 as well.

        # if not, get it from MusicBrainz
        else:
            try:
                imgSaved = False
                i = 1

                while (not imgSaved):
                    print()
                    print("searching for {} - {}".format(searchArtist,newTitle))

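                    # clear any result left over from the previous filter so a failed
                    # search is not mistaken for a fresh one
                    result = None
                    try: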
                      if (i==1 and (not imgSaved)): 
                        setStrict = True
                        result = musicbrainzngs.search_recordings(strict=setStrict, artist=searchArtist, recording=newTitle, type='Album', status='Official')
                        print("Running through filter 1")
                      if (i==2 and (not imgSaved)):
                        setStrict = False
                        result = musicbrainzngs.search_recordings(strict=setStrict, artist=searchArtist, recording=newTitle, type='Album', status='Official')
                        print("Running through filter 2")
                      if (i==3 and (not imgSaved)):
                        setStrict = True
                        result = musicbrainzngs.search_recordings(strict=setStrict, artist=searchArtist, recording=newTitle, status='Official')
                        print("Running through filter 3")
                      if (i==4 and (not imgSaved)):
                        setStrict = False
                        result = musicbrainzngs.search_recordings(strict=setStrict, artist=searchArtist, recording=newTitle, status='Official')
                        print("Running through filter 4")
                      if (i==5 and (not imgSaved)):
                        setStrict = True
                        result = musicbrainzngs.search_recordings(strict=setStrict, artist=searchArtist, recording=newTitle, type='Album')
                        print("Running through filter 5")
                      if (i==6 and (not imgSaved)):
                        setStrict = False
                        result = musicbrainzngs.search_recordings(strict=setStrict, artist=searchArtist, recording=newTitle, type='Album')
                        print("Running through filter 6")
                      if (i==7 and (not imgSaved)):
                        setStrict = True
                        result = musicbrainzngs.search_recordings(strict=setStrict, artist=searchArtist, recording=newTitle)
                        print("Running through filter 7")
                      if (i==8 and (not imgSaved)):
                        setStrict = False
                        result = musicbrainzngs.search_recordings(strict=setStrict, artist=searchArtist, recording=newTitle)
                        print("Running through filter 8")

                    except:
                        print("MusicBrainz recording search error")
                        #pass

                    if (result is not None) and ('recording-list' in result) and (len(result['recording-list']) != 0):
                        print("got recording search result with {} recordings".format(len(result['recording-list'])))

                        # loop through the list until you get a match
                        for (idx, release) in enumerate(result['recording-list']):
                            #print(release)
                            resultID = self.check_value('id',release,"")
                            resultScore = self.check_value('ext:score',release,"0")
                            resultArtist = self.check_value('artist-credit-phrase',release,"")
                            resultTitle = self.check_value('title',release,"")
                            resultGenre = self.check_value('name',self.check_value('tag-list',release,""),"")
                            scoreMatch = (int(resultScore) > 90)
                            artistMatch = (newArtist.lower() in resultArtist.lower())
                            titleMatch = (newTitle.lower() in resultTitle.lower())
                            recordingMatch = (artistMatch and titleMatch and scoreMatch)

                            # don't bother dealing with releases if artist, title and score don't match
                            resultStatus = ""
                            resultType = ""
                            resultAlbum = ""
                            resultArtist2 = ""
                            releaseMatch = False
                            imageMatch = False
                            if recordingMatch and ('release-list' in release):
                                print("got release list with {} releases".format(len(release['release-list'])))
                                for (idx2, release2) in enumerate(release['release-list']):
                                    #print(release2)
                                    imageMatch = False
                                    resultID = self.check_value('id',release2,"")
                                    resultStatus = self.check_value('status',release2,"Official")
                                    resultType = self.check_value('type',self.check_value('release-group',release2,""),"")
                                    resultAlbum = self.check_value('title',release2,"")
                                    resultArtist2 = self.check_value('artist-credit-phrase',release2,"")
                                    typeMatch = (resultType in ['Single','Album','EP'])
                                    statusMatch = (resultStatus == 'Official')
                                    albumMatch = (not self.check_terms(resultAlbum, albumExclude))
                                    #artistMatch2 = (resultArtist2 != "") and (not ('Various' in resultArtist2))
                                    artistMatch2 = (not ('Various' in resultArtist2))
                                    releaseMatch = (artistMatch2 and albumMatch and typeMatch and statusMatch)
                                    print("#{} {}: Track: {} - {}, {}: {} - {}, {} {}% {}".format(idx, resultStatus, resultArtist, resultTitle, resultType, resultArtist2, resultAlbum, resultID, resultScore, resultGenre))                    
                                    # don't bother checking for covers unless album, type, and status match
                                    if releaseMatch:
                                        imageMatch = self.check_musicbrainz_cover(resultID)
                                    if (releaseMatch and imageMatch and ((idx2+1) < len(release['release-list']))):
                                        break

                            if (recordingMatch and releaseMatch and imageMatch):

                                # got a full match, now get the cover art
                                print("Found {}: Track: {} - {}, {}: {} - {}, {} {}% {}".format(resultStatus, resultArtist, resultTitle, resultType, resultArtist2, resultAlbum, resultID, resultScore, resultGenre))
                                if self.save_musicbrainz_cover(resultID,saveStr):
                                    self.coverImage = saveStr
                                    imgSaved = True
                                    if (self.streamInfo['Album'] == ""):
                                        self.streamInfo['Album']=resultAlbum
                                    if (self.streamInfo['Genre'] == ""):
                                        self.streamInfo['Genre']=resultGenre
                                    self.coverMetas[baseStr] = [self.streamInfo["Title"],self.streamInfo["Artist"],self.streamInfo["Album"],self.streamInfo["Genre"]]
                                print("recording search succeeded")

                            if (imgSaved) and ((idx+1) < len(result['recording-list'])) or (not scoreMatch):
                               break

                    i = i + 1
                    # if we got an image or all the filters were run through, there is no reason to keep looking.
                    if (imgSaved or i > 8):
                        break

markjfine commented 3 years ago

Don't mean to impede your initiative (which is commendable), but I'm concerned that if you have a number of people running this concurrently with that many potential queries, aside from a performance hit for people with slow, high-lag internet connections, the volume of hits could either get the app severely throttled or banned quicker than Discogs. They were pretty adamant on the site about that part.

So I have to ask if you've profiled this as far as a worst case turnaround time. I noticed quite a bit of delay just using two queries but it's not really noticeable when the print statements are commented out. I'm curious what 8 does before I put it in. Also, how many hits are we potentially pinging MB with for one cover?
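One way to answer both questions (worst-case turnaround and hits per cover) is to route every network call through a small counter while testing. This is a generic profiling sketch using only the standard library; `HitCounter` is a hypothetical helper and not part of nrsc5-dui or musicbrainzngs.

```python
import time


class HitCounter:
    """Counts calls and accumulates wall-clock time spent in them."""

    def __init__(self):
        self.hits = 0
        self.elapsed = 0.0

    def call(self, func, *args, **kwargs):
        """Invoke func, recording one hit and its duration even if it raises."""
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            self.hits += 1
            self.elapsed += time.perf_counter() - start
```

During a test run, each recording search and each image-list query would be wrapped, for example `counter.call(musicbrainzngs.search_recordings, strict=True, artist=searchArtist, recording=newTitle)`, and `counter.hits` and `counter.elapsed` printed once the cover lookup finishes.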

andrewfer000 commented 3 years ago

8 does the same thing as 7 but with strict turned off. The number of hits is what I am worried about too, especially since it happens every time a song ends and it switches back to the station's info. When it comes to songs, however, there usually only needs to be the one query unless it's not in the first one. As long as MB has it and it is correct in the DB (and the radio station spelled everything right) then usually it won't have to go beyond filter 4 or filter 6.

Also it takes less than 10 seconds to go through all of these.

andrewfer000 commented 3 years ago

I was reading their limit rules. I am kinda worried about the user-agent limits, but the IP limits will never be reached with a program like this since it's not like we are running a cloud-based service with many users. Each user has their own copy of nrsc5-dui running on their system on their own IP address. I mean we don't want to or mean to abuse their service or anything.

markjfine commented 3 years ago

10 seconds is kinda long, but we'll see.

The order they're run is really the priority. So according to your results, what I might do is swap the order of the 'Album'-only ones with the 'Official'-only ones so it has the potential to exit quicker. I may also combine them down to 4 lines and just flip-flop setStrict back and forth, much the way the current one uses one query line - just in case you wonder why it might look different.
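The reordering and consolidation described above could look something like this: four sets of extra criteria in priority order, each tried with Strict on and then off. This is only a sketch of the idea; the function name is an assumption, and the final code may have been organized differently.

```python
def build_filters():
    """Return the eight (strict, extra_criteria) pairs in priority order.

    The 'Album'-only pair is placed ahead of the 'Official'-only pair,
    as suggested in the comment above.
    """
    criteria = [
        {"type": "Album", "status": "Official"},
        {"type": "Album"},
        {"status": "Official"},
        {},
    ]
    # flip-flop strict for each set of criteria: True first, then False
    return [(strict, extra) for extra in criteria for strict in (True, False)]
```

Each pair could then be passed to the search as `musicbrainzngs.search_recordings(strict=strict, artist=searchArtist, recording=newTitle, **extra)`, stopping at the first pair that yields a saved image.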

andrewfer000 commented 3 years ago

Alright give it a try and see what happens.

andrewfer000 commented 3 years ago

I know this might sound kinda wrong, but is there a way to have a random User-Agent string generated on startup so everyone has their own limits? Maybe make it something like "nrsc5dui randomstring" where the randomstring is generated and saved on first run?

markjfine commented 3 years ago

I see malicious bots that use valid UA strings and fake googlebots all the time. Would really rather not game the system that way. I block those suckers straight away. lol

andrewfer000 commented 3 years ago

But hey, those are malicious bots, this program is good. But tbh I don't even think with 100 users this program will break the limits, just as long as everything is optimized and we try to minimize pointless hits.

markjfine commented 3 years ago

a wise man once told me to dress for the job you want. if we look like a bot, we're a bot. anyway... Arsenal look like crap. I may start on this just for the distraction.

andrewfer000 commented 3 years ago

Also, don't forget the limits are based on averages and are per second. This program makes 1 to 8 quick requests on average once every 120-180 seconds (if it's new data only), so there really isn't too much to worry about. My main concern is that if there is nothing available it will keep trying, which is only a bad thing when the song info changes back to the station info.
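To put rough numbers on that estimate, the average rate follows directly from the figures quoted in this thread (1 to 8 recording searches per track, a new track every 120 to 180 seconds, and the 100-user scenario mentioned earlier). The helper below is just that arithmetic; it does not count the per-release image-list queries raised as a concern earlier, which would add to the total.

```python
def request_rate(requests_per_track, seconds_per_track, users=1):
    """Average requests per second across all users."""
    return users * requests_per_track / seconds_per_track


worst_single = request_rate(8, 120)          # every filter runs, short tracks
best_single = request_rate(1, 180)           # first filter hits, long tracks
worst_hundred = request_rate(8, 120, users=100)
```

Under these assumptions a single user averages roughly 0.006 to 0.067 recording searches per second, and 100 users in the worst case average about 6.7 per second combined, spread across their own IP addresses but sharing one user-agent.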

andrewfer000 commented 3 years ago

Also this Testing Notes is becoming a bit big, I think it might be time to close this one and open another one. Maybe we should open an issue just for MB Stuff.

markjfine commented 3 years ago

XHDR is solving the logo issue. Only problem is that if the station doesn't remember to switch it back to 0, it will stay on the logo even though the text changes. May be a minor consolation.