Closed schnillerman closed 2 years ago
A full wipe cache & scan would do without moving files, wouldn't it?
I just tried this - now all my favorites are gone. :(
And yes, it fixes duplicate entries, but it usually takes longer (2.5 h) than two rescans (7 minutes per rescan for 215,000 titles): the database deletion alone takes as long as one rescan.
And it would seem weird to me if multiple library entries exist (within the same library, of course) that refer to one and the same file. Maybe a library consistency check that takes care of duplicate entries for same file would be helpful.
I would love to see this one fixed, it's been a long standing problem I noticed too. I use the same workaround (renaming the directory) - and a full rescan is not really a practical option for those of us with large libraries.
Feels like there ought to be an easy solution in the scanner - as @schnillerman suggests, a consistency check or some such
Could one of you please outline how that easy consistency check would work?
Hi Michael,
totally understand your question :)
As a very inexperienced programmer (if one at all), I would probably check for duplicate entries in the table where the full path and filename are stored.
If one and the same file is listed more than once, that's a good indicator that it's registered as a duplicate.
Cheers, Till
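The check Till describes could be sketched as a single grouping query. This is only a minimal sketch, not LMS code: it assumes a `tracks` table with a `url` column holding the full file path (an assumption about the schema), and it runs against a throwaway in-memory SQLite database with made-up rows.

```python
import sqlite3


def find_duplicate_paths(conn):
    """Return (url, count) for every path stored more than once in tracks."""
    return conn.execute(
        """
        SELECT url, COUNT(*) AS n
        FROM tracks
        GROUP BY url
        HAVING COUNT(*) > 1
        ORDER BY url
        """
    ).fetchall()


# Toy data: the table layout and the rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (id INTEGER PRIMARY KEY, url TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO tracks (url, title) VALUES (?, ?)",
    [
        ("file:///music/a/01.mp3", "One"),
        ("file:///music/a/02.mp3", "Two"),
        ("file:///music/a/02.mp3", "Two"),  # same file registered twice
    ],
)
duplicates = find_duplicate_paths(conn)
print(duplicates)
```

If the real schema already enforces uniqueness on the path column, a query like this would simply return nothing, which would itself point to the duplicates living at the album or contributor level instead.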
What seems to be happening is - when you change a file within an album somehow, or add a new file to the album - then do a new/changed scan:
The scan finds the new file and creates it within a new album, so you end up with two duplicated albums
1 - the original album with the unchanged files, but not the changed/new ones
2 - the new album with just the changed/new file and none of the unchanged files
So the logic needs to be something like:
- new file is found
- read album tag
- read the folder path
- does an album with the same name exist with the same folder path?
- if yes, then add the file to the existing album
- if no, carry on as before and create a new album
You're right, I remember now that a new album is mostly created in this case.
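The lookup bobbydriver proposes (match on album name plus folder path before creating anything) might look roughly like this. It is a sketch under assumptions: the `albums` dictionary, the `assign_album` helper and the paths are all hypothetical and only illustrate the order of the checks, not how the LMS scanner stores albums.

```python
import posixpath


def assign_album(albums, track_path, album_tag):
    """Attach a track to the album matching (album tag, folder), creating it if needed.

    `albums` maps (album_tag, folder) -> list of track paths. The structure is
    hypothetical; it only illustrates the lookup order from the proposal.
    """
    folder = posixpath.dirname(track_path)  # read the folder path
    key = (album_tag, folder)
    if key not in albums:
        # No album with this name in this folder: carry on as before.
        albums[key] = []
    albums[key].append(track_path)
    return key


# Made-up library: one existing album with one track.
albums = {("Live Show", "/music/live"): ["/music/live/01.mp3"]}

# A new file in the same folder with the same album tag joins the existing album.
assign_album(albums, "/music/live/04.mp3", "Live Show")

# The same album name in a different folder still becomes a separate album.
assign_album(albums, "/music/other/01.mp3", "Live Show")

print(len(albums))
```

Keying on the folder as well as the name keeps genuinely different albums that share a title apart, at the cost of treating an album split across folders as two albums.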
Now here's the problem: a regular scan is so much faster than a full wipe & rescan because it only deals with changed items and doesn't do these kinds of optimisations and checks. Any additional check will slow it down.
In order to keep things as fast as possible, we have to be sure what we're talking about. The issue subject line says "Duplicate title entries". The description says "duplicates (in the same album)". And the latest suggestion is about duplicated albums. Maybe both are valid. And I'm pretty sure complaints about genres have been heard, too...
I fear that fixing all of this would take more time than I currently have.
Oh, and artists: #704
Thanks Michael - appreciate that it's probably a lot of effort. If I get the chance I might set up a test rig and do some proper documentation of the issues/scenarios. I don't know Perl so I couldn't do anything with the scanner, but I could at least work out the SQL queries that ID the culprits.
As for the scan time - I had the exact same thought. It would really need to be a separate scan option for occasional use. A "tidy/remove duplicates scan" or something
In actual fact I'm more than happy with all the LMS functionality these days and the only thing left which bugs me is the way the new/changed scan can make a mess of db integrity. I'd actually really love a UI that allowed me to query and tidy up my music db without the inconvenience of a full drop and rescan, but I know that's dreamland :)
Could both of you please describe what tag you'd change (artist, album, title...), and what the outcome would be? I think I've identified one issue if you changed some tracks' artist names without getting rid of the original artist name (eg. different artists of the same name, you rename only one of them). This could likely cause empty albums in the original artist's collection (see #704).
Would #705 be a duplicate of this issue?
Those affected by the file renaming issue: what OS are you using?
Happens when I change attributes like
If the file name upper/lower case is changed, it happens as well.
I'm running LMS on a Linux Debian.
I think I've identified the cause of the duplication in case of a file name case change. See https://github.com/Logitech/slimserver/issues/705#issuecomment-1026229542. There's some background information, and how you might be able to work around / fix this until I have a fix in LMS.
Could you please give the 8.3 nightly a try (https://downloads.slimdevices.com/nightly/?ver=8.3)? I applied a few changes to the scanner. I'm no longer able to get invalid records after
I just installed 8.3 over 8.2 and will have a look!
Do I need to perform a complete re-scan?
I loaded the nightly and did the same tests - works ok for me too (on Raspbian 10 Buster/Max2Play)
The duplicate albums still get created though if you fundamentally change a filename (other than a case change) - or add new files to the album folder - then run a new/changed rescan. Does that need to be raised as a separate issue to keep things clear?
Thank you so much for mentioning this behavior - I forgot that this happens to me a lot, too, because I've been working around this by temp_renaming the updated folder, scanning, re-naming again, re-scanning!
Just did a test: added some new files to an existing album folder. Essentially the scan is picking up the new files by timestamp and creating them as a new album - not recognising that the album already exists and that they should be added to the existing album.
I realise that adding this integrity step to a new/changed files rescan will slow things down, but maybe not too much? After all - it only needs to be run against the new files discovered
If you put the cover art in each album folder, then the SQL to id the existing duplicates is quite simple - because although it allocates a new album id to the new files - the value for cover (which is essentially the path to the cover.jpg) is the same for both the new and existing albums
    SELECT DISTINCT album, cover
    FROM tracks
    WHERE cover IN (
        SELECT cover
        FROM tracks
        GROUP BY cover
        HAVING COUNT(DISTINCT album) > 1
    )
Not sure how this works for people who use embedded cover art though
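To see what the cover-based query returns, it can be run against a tiny in-memory SQLite database. This is a sketch with made-up rows: the reduced `tracks` table below only has the columns the query touches, and the `ORDER BY` is added purely to make the output order predictable.

```python
import sqlite3

COVER_QUERY = """
SELECT DISTINCT album, cover
FROM tracks
WHERE cover IN (
    SELECT cover
    FROM tracks
    GROUP BY cover
    HAVING COUNT(DISTINCT album) > 1
)
ORDER BY album
"""

conn = sqlite3.connect(":memory:")
# Reduced, assumed table layout: only the columns used by the query.
conn.execute("CREATE TABLE tracks (id INTEGER PRIMARY KEY, album INTEGER, cover TEXT)")
conn.executemany(
    "INSERT INTO tracks (album, cover) VALUES (?, ?)",
    [
        (1, "/music/a/cover.jpg"),
        (1, "/music/a/cover.jpg"),
        (2, "/music/a/cover.jpg"),  # new track: same folder cover, new album id
        (3, "/music/b/cover.jpg"),  # unrelated album, not reported
    ],
)
rows = conn.execute(COVER_QUERY).fetchall()
print(rows)
```

Albums 1 and 2 are reported because they share one cover path, while album 3 is left out.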
OK - just been digging some more and that SQL is not ideal, as it also finds occurrences where you have files in the same album folder but with different album tags. That's just bad tagging/mistakes, so handy for IDing where your library is messed up, but not a definitive ID of where the new/changed scan problem has happened
I also worked out this SQL query on the albums table
    SELECT A.title, A.id, C.name, B.artwork
    FROM albums A
    JOIN contributors C ON A.contributor = C.id
    JOIN albums B
      ON A.title = B.title
     AND A.contributor = B.contributor
     AND A.year = B.year
     AND A.artwork <> B.artwork
    GROUP BY A.title, A.artwork
This IDs cases where a duplicate album name has the same artist and year BUT a different cover art hash. It pulls out records where the new/changed scan problem has happened, BUT it also IDs other issues, like where you have moved a file to a different album but not changed the album tag, or where the album tag within the folder is actually different - so again bad tagging.
Neither of these queries takes bad tagging into account, so they are only useful for manually interrogating libraries for bad integrity - not for definitively identifying the new/changed scan problem.
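The albums self-join can be tried the same way. The sketch below uses a reduced, assumed version of the `albums` and `contributors` tables with made-up rows, and writes the query with explicit JOINs, which should be equivalent to the comma-join form; the `ORDER BY` is only there for stable output.

```python
import sqlite3

ALBUM_QUERY = """
SELECT A.title, A.id, C.name, B.artwork
FROM albums A
JOIN contributors C ON A.contributor = C.id
JOIN albums B
  ON A.title = B.title
 AND A.contributor = B.contributor
 AND A.year = B.year
 AND A.artwork <> B.artwork
GROUP BY A.title, A.artwork
ORDER BY A.id
"""

conn = sqlite3.connect(":memory:")
# Reduced, assumed table layouts with made-up rows.
conn.executescript(
    """
    CREATE TABLE contributors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE albums (
        id INTEGER PRIMARY KEY, title TEXT, contributor INTEGER,
        year INTEGER, artwork TEXT
    );
    INSERT INTO contributors VALUES (1, 'Some Artist');
    INSERT INTO albums VALUES (10, 'Live Show', 1, 1979, 'aaaa1111');
    INSERT INTO albums VALUES (11, 'Live Show', 1, 1979, 'bbbb2222');
    INSERT INTO albums VALUES (12, 'Studio Album', 1, 1980, 'cccc3333');
    """
)
rows = conn.execute(ALBUM_QUERY).fetchall()
for row in rows:
    print(row)
```

Each of the two 'Live Show' rows is listed together with the artwork hash of its counterpart, and 'Studio Album' has no counterpart and is not reported.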
OT - is your nick from Bob's Burgers? 😂
haha - no, but I did laugh when I saw that episode
Could you please provide step-by-step instructions what I need to do to reproduce the problem?
What I usually did in order to produce the duplicate DB entries (but with 8.3, the behavior seems to be different):
As I mentioned above, this behavior seems to be different with LMS 8.3:
It seems in LMS 8.3 now it works as expected.
But what about same albums with different years? (They sometimes exist, e.g. re-releases, and the release info is only present in comment tag)?
Just did the same test as above with v8.3 and can confirm the same result. Added a new album with one file having a different date tag: it creates one album, not two (as it did in 8.2).
So the problem is now just with the album tag
If I add new tracks into an existing album folder - even if the album tag is identical to the existing album tags in the same folder, it still creates a new duplicate album in the db for the new tracks
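To test:
- Take any album that is already in the library
- Add a new track or tracks into the folder and tag with the same album tag as the existing tracks
- Run a new/changed rescan
- New duplicate album is created with just the new tracks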
The behaviour is sort of understandable, as the existing tracks aren't new or changed, but the folder contents have changed
I don't know how to fix it - maybe the scan needs to look for new/changed subfolders (date modified on the folder) and rescan the whole folder? or when it sees new/changed files it triggers a rescan of the whole subfolder that the new files sit in?
Not sure if either of these are viable
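The second idea (a new or changed file triggers a rescan of the whole subfolder it sits in) boils down to collecting the parent folders of the files the scan has flagged. A minimal sketch, assuming the scan already yields a list of new/changed file paths; the `folders_to_rescan` helper and the paths are hypothetical.

```python
from pathlib import PurePosixPath


def folders_to_rescan(new_or_changed_files):
    """Return the sorted, de-duplicated parent folders of the given files."""
    return sorted({str(PurePosixPath(p).parent) for p in new_or_changed_files})


# Made-up scan result: two new files in one album folder, one in another.
changed = [
    "/music/live/04.mp3",
    "/music/live/05.mp3",
    "/music/other/01.mp3",
]
print(folders_to_rescan(changed))
```

Because folders are de-duplicated, adding many files to one album would still cost only one folder rescan.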
Also, with LMS 8.3, if I correct the capitalization inside e.g. the title tag, and the file name therefore also gets renamed (same name, different capitalization), something strange happens:
The album is not duplicated, but the song in question is. Even though it is currently playing, it is not displayed correctly in the player, and neither is the playlist of that album:
- Take any album that is already in the library
- Add a new track or tracks into the folder and tag with the same album tag as the existing tracks
- Run a new/change rescan
- New duplicate album is created with just the new tracks
This is working as expected here. Are you 100% certain album and artist information are absolutely identical? No upper/lower case issues? No whitespace?
Would you mind sharing the library.db with such an issue with me?
Can I install 8.2 over 8.3 in order to do that?
Why would you want to install the previous version? It's fixed in 8.3, not 8.2.
But to answer your question: yes, you can go back and forth as you like.
Sorry, Michael - the problem of duplicate albums by adding files to a folder or capitalization changes does not seem to be an issue in 8.3 anymore - at least from what I tried. That's why I thought that if you want that particular error, I would need to reproduce it in 8.2 - because that's where it definitely happened. Anyway - I'll try and reproduce the error I described above (https://github.com/Logitech/slimserver/issues/547#issuecomment-1028079064) and share library.db with you via PM. I understand that probably you have responded mainly to bobbydriver's comments, so please excuse my chipping in.
OK - getting closer to the problem now I think.
When you said you couldn't recreate the error by following the steps I described, I was surprised. So I ran through them again and I was even more surprised when I found that you were right - it added the new tracks to the correct (existing) album!
But I was sure that I had seen the issue only yesterday on the same 8.3 nightly, so I went back through the steps and managed to recreate the error in more specific circumstances.
In the example, I'm using two Joy Division live shows. Both were partially included in a boxset some years ago, so I had them in my library as two separate albums, one for each partial live show.
Someone then shared the remaining tracks which weren't included in the boxset, so I went to add them to each existing folder to complete the albums.
In example 1 - I follow the original instructions I gave you. Added the extra files and tagged them to have the same album name as the existing files. The new files are the yellow ones and you can see the first 3 tracks are the old ones - unchanged
As mentioned - a rescan happily adds these to the existing album
In example 2 - I follow the same steps, with the only difference being that this time, when I load mp3tag to change the album tag on the new files, I highlight all the files and save the tags. This re-saves the existing tags to the existing files - even though none of them have actually changed. So now you see that the Date Modified is updated for ALL the files, but Date Created obviously stays the same for the original files
A rescan now creates the duplicate album issue
The original album with the original tracks
and the duplicate album with the additional tracks
They are both showing up as "New Music" so it's obviously changing the timestamp on the original album according to the date modified but why is it not adding the additional tracks as it does in example 1?
To add to my confusion, I tried another test case
See example 3 - here I don't add any new files to the folder, I just change an mp3 tag on existing track 1 (which changes the Date Modified on this one file only).
I was expecting a rescan to create a new album for that one file, with the rest of the tracks remaining in the old album.
But it doesn't?! It just updates the existing album (see the altered title tag on track 1).
So the question now is: what is the difference between example 2 and example 3? Why does it behave differently to the Date Modified change?
Would you mind sharing your library.db (with the above duplicate albums in it!)?
https://www.dropbox.com/request/T3RctyzGgNg0oFDubq6a
Without the database it's hard to tell what's going on there.
Will do - have tidied up the duplicates from yesterday so I will create a new test example and document for you, then upload my library.db and screenshots etc
Hmm - I now seem to have corrupted my library and it's triggered a full rescan - not ideal!
On the positive side, I think I've narrowed down the exact circumstances in which the issue now occurs
Most of the error modes from older versions seem to have been fixed - which is great
While I wait to get my library back - can you try this
Run a new/changed scan
What happens? For me I get 2 new albums created (one with the existing tracks and one with just the new track)
Thanks @bobbydriver! I received your files and will investigate. Can you confirm you're using the latest LMS 8.3?
Oh, I think I know what's going on: new tracks are scanned before the updated tracks. The new tracks therefore create a new album, because their album doesn't exist yet. Only once that's done the modified tracks would be updated. And as they already exist, the album referenced in the track would be updated, rather than the track linked to a new album. This causes the previous album to become a duplicate of the new one. That might become tricky to fix.
Ah ok - that makes sense, not sure how you fix that. I guess if it scanned updated before new files that would bring its own problems?
And yes - I am on the latest 8.3 nightly (if it matters)
Yes, changing the processing order is the most obvious approach I'll investigate first.
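The ordering effect described above can be illustrated with a toy model. This is one possible reading of the explanation, not LMS scanner code: the `scan` function, its data structures and the retagging scenario (existing tracks re-saved with a new album title while a new track with that title is added) are all assumptions made for illustration.

```python
def scan(existing_albums, track_album, new_tracks, changed_tracks, new_first):
    """Toy model of a new/changed scan; returns the resulting {album_id: title}.

    existing_albums: {album_id: title}
    track_album:     {track: album_id} for tracks already in the library
    new_tracks / changed_tracks: {track: album title read from the tag}
    """
    albums = dict(existing_albums)
    links = dict(track_album)

    def process_new():
        for track, title in new_tracks.items():
            # Reuse an album with this title if there is one, else create it.
            album_id = next((i for i, t in albums.items() if t == title), None)
            if album_id is None:
                album_id = max(albums, default=0) + 1
                albums[album_id] = title
            links[track] = album_id

    def process_changed():
        for track, title in changed_tracks.items():
            # The track already exists, so its current album is updated in place.
            albums[links[track]] = title

    steps = [process_new, process_changed] if new_first else [process_changed, process_new]
    for step in steps:
        step()
    return albums


# Made-up library state and scan input.
existing = {1: "Live Show"}
links = {"01.mp3": 1, "02.mp3": 1}
new = {"03.mp3": "Live Show (Complete)"}
changed = {"01.mp3": "Live Show (Complete)", "02.mp3": "Live Show (Complete)"}

new_first_result = scan(existing, links, new, changed, new_first=True)
changed_first_result = scan(existing, links, new, changed, new_first=False)
print(new_first_result)
print(changed_first_result)
```

In this model, handling new tracks first leaves two albums with the same title, while handling changed tracks first leaves one. Whether reordering has side effects in the real scanner is exactly what would need testing.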
Please let me know should you encounter any new side-effects. Thanks for your help identifying this long-standing issue, @bobbydriver!
@mherger, GREAT, you fixed this, too! This has been an evergreen for me as well (it always happened when there was a new bonus edition of an album and I added the new bonus tracks to it - and that is often the case nowadays! :) ) THANKS!
Thanks Michael! Testing it tonight. will let you know
Looking good to me - the problem is gone. I didn't think this would ever get fixed so THANK YOU so much!
Good to know! Sometimes it needs a fresh mind to look into these old issues 😉.
Sorry to interrupt you again guys, but LMS 8.3.0 - 1644170574 @ Sun 06 Feb 2022 07:24:08 PM CET is still creating duplicates for me.
Use case: capitalization change in the albumartist tag and in the dir/file name
Re-scan results in 2 identical entries, both with all tracks:
One of them without the artist name displayed:
One with the artist name displayed:
It seems that file name changes are registered as new files, too:
Can share library.db if required.
Did you change artist name in tag, folder and file name all at the same time? I haven't tried all three at once yet.
Did you completely delete library.db (not just wipe its content) in the past week? Some of the new behaviour requires some table schemas to be updated/re-created from scratch.
Yes, I'd be interested in your library.db in its broken state: https://www.dropbox.com/request/T3RctyzGgNg0oFDubq6a
I did not delete library.db, however did a full re-scan before I changed the files as described above.
Just dropped the library.db.
Now re-scanning with all files named library.* renamed and LMS restarted (triggered a re-scan).
I did the following:
Thanks for the uploaded file. As you confirmed it's not using the latest schema. It would still do case sensitive comparisons under certain circumstances.
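The difference between case-sensitive and case-insensitive comparison can be shown with a small diagnostic query for names that differ only in capitalization. This is a sketch on a made-up, reduced `contributors` table in an in-memory SQLite database; it does not reflect what the updated LMS schema actually does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, assumed table layout with made-up rows.
conn.execute("CREATE TABLE contributors (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO contributors (name) VALUES (?)",
    [("Joy Division",), ("JOY DIVISION",), ("Some Artist",)],
)

# Case-sensitive grouping: every spelling is its own group, nothing is flagged.
sensitive = conn.execute(
    "SELECT name FROM contributors GROUP BY name HAVING COUNT(*) > 1"
).fetchall()

# Case-insensitive grouping: spellings that differ only by ASCII case collapse.
insensitive = conn.execute(
    """
    SELECT LOWER(name) AS folded, COUNT(*) AS n
    FROM contributors
    GROUP BY LOWER(name)
    HAVING COUNT(*) > 1
    """
).fetchall()

print(sensitive)
print(insensitive)
```

Note that SQLite's built-in `LOWER()` only folds ASCII letters, so names with accented characters would need separate handling.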
I'll keep you updated as duplicates occur. For now, as the others already said: Huge thank you for dealing with this issue.
Whenever I update files that are registered in the library, a few of them are registered as duplicates (in the same album) in the library after a re-scan. The nature of the file update can be just (mp3-) tag updates, but also file renaming (directory name remains the same).
The only solution I have found to this is to rename or move the album's directory, re-scan, rename/move it back to its original state and do a second rescan.
Version: 8.2.0 - 1614990095 @ Sat Mar 6 01:43:25 CET 2021
This happens in earlier 8.x and 7.x versions as well.