LMS-Community / slimserver

Server for Squeezebox and compatible players. This server is also called Lyrion Music Server.
https://lyrion.org

Scanner: Changing large and lower case in FILE name results in double entries (on Windows) #705

Closed frank1969b closed 2 years ago

frank1969b commented 2 years ago

I know this is known - but with growing databases it becomes more and more important in my eyes:

If I change the upper/lower case of a file name or folder name, I get duplicate entries. E.g. C:\A\ABBA\albumname\titlename.mp3 is treated as a different track than C:\A\ABBA\albumname\Titlename.mp3. Apparently Windows is case sensitive at this point while the scanner is not, I guess.

So if I change this, I end up with "titlename.mp3" AND "Titlename.mp3" both remaining in the database with the same tags.

For a while I tried to avoid this by only changing the tags when I changed upper/lower case, and NOT renaming the file. But since I use Mp3tag for tagging, with its "tag - filename" feature that creates file names automatically from my tags, I can't really avoid it.

MANY thanks again for caring.

REM: I thought this would be a "classic", too, but couldn't find any bug report either.

mherger commented 2 years ago

I could imagine that this would be a Windows (and probably some macOS) only issue: internally we sometimes compare the hashes of the path, and sometimes check the real path name. On a case insensitive file system the latter would succeed, while the hash comparison would fail.
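To illustrate the mismatch described above, here is a minimal Python sketch (not LMS code). It assumes the hash is an MD5 over the path string, which the `urlmd5` column name suggests but which is not confirmed here, and it uses case folding as a stand-in for what a case insensitive file system does on lookup. The helper names are hypothetical.

```python
import hashlib


def url_hash(path: str) -> str:
    """Hex MD5 of the path bytes (assumed hashing scheme, for illustration only)."""
    return hashlib.md5(path.encode("utf-8")).hexdigest()


def same_file_case_insensitive(a: str, b: str) -> bool:
    """Stand-in for a case insensitive file system lookup: fold case, then compare."""
    return a.casefold() == b.casefold()


old = r"C:\A\ABBA\albumname\titlename.mp3"
new = r"C:\A\ABBA\albumname\Titlename.mp3"

hashes_match = url_hash(old) == url_hash(new)        # hash sees two different strings
paths_match = same_file_case_insensitive(old, new)   # file system sees one file

print(hashes_match, paths_match)  # False True
```

The two checks disagree about whether the renamed file is "the same" file, which is the kind of inconsistency that can leave both entries in a database.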

frank1969b commented 2 years ago

@mherger, yes, I'm sure it's a problem of case (in)sensitivity / Windows, but it works in a weird way (I'll do some tests and look into the db afterwards).

If I change ABC.mp3 => abc.mp3 (filename)

ABC.mp3 (#1) and abc.mp3 (#2) are recognized by the scanner as 2 different tracks while ADDING, so it adds both to the database.

But when "removing", the scanner should "normally" see (since they are different tracks to the database) that abc.mp3 (#2) is still there while ABC.mp3 (#1) is gone, and remove the latter from the db?

So it looks to me as if the adding part of the scanner is case sensitive, while the removing part isn't?

michaelherger commented 2 years ago

> So it looks to me the adding part of the scanner is casesensitive, while the removing part isn't?

~I looked into the code again, and I don't think this would be true. But I was wondering whether the file would be recognised as modified (do you or the tool you're using change the timestamp?), but actually resulted in a new track being added. Next time you test this, could you please check the log file to see whether there had been any additions or only updates? It should be logged with default log settings.~

Never mind: you seem to be right. I changed one character in a filename from lower to uppercase:

[22-01-31 22:05:29.0501] Slim::Utils::Scanner::Local::deleteTracks (519) Removing deleted audio files (0)
[22-01-31 22:05:29.0502] Slim::Utils::Scanner::Local::__ANON__ (301) Scanning new audio files (1)
[22-01-31 22:05:29.0700] Slim::Utils::Scanner::Local::__ANON__ (381) Rescanning changed audio files (0)

michaelherger commented 2 years ago

Argh... we must have been dealing with this very issue 10 years ago already... without really fixing it? See the comment on the following line, which describes the URL field for the tracks table:

https://github.com/Logitech/slimserver/blob/949a06d93248c4a8746c81903bdb3c39b37b168c/SQL/SQLite/schema_16_up.sql#L6

It's made case insensitive. The same does not apply to the scanned_files table, though:

https://github.com/Logitech/slimserver/blob/public/8.3/SQL/SQLite/schema_11_up.sql

So when we look for tracks which are in scanned_files, but not in tracks, the comparison is case insensitive. If we do the opposite check, the lookup in scanned_files is case sensitive. Exactly what you figured out.
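The collation asymmetry can be reproduced in isolation with Python's built-in `sqlite3` module. This is only a sketch: the two tables are reduced to a single `url` column (the real LMS tables have more columns), the URLs are made up from the example path in this thread, and the lookups are simple equality checks rather than the scanner's actual queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tracks        (url TEXT COLLATE NOCASE);  -- case insensitive comparisons
    CREATE TABLE scanned_files (url TEXT);                 -- default BINARY collation
""")

old_url = "file:///C:/A/ABBA/albumname/titlename.mp3"  # name stored before the rename
new_url = "file:///C:/A/ABBA/albumname/Titlename.mp3"  # name found on disk after the rename

conn.execute("INSERT INTO tracks VALUES (?)", (old_url,))
conn.execute("INSERT INTO scanned_files VALUES (?)", (new_url,))


def count(table: str, url: str) -> int:
    """Number of rows in `table` whose url equals `url` under that column's collation."""
    sql = f"SELECT COUNT(*) FROM {table} WHERE url = ?"
    return conn.execute(sql, (url,)).fetchone()[0]


new_in_tracks = count("tracks", new_url)          # NOCASE: the renamed file matches the old row
old_in_scanned = count("scanned_files", old_url)  # BINARY: the old name matches nothing

print(new_in_tracks, old_in_scanned)  # 1 0
```

Depending on which table a lookup runs against, the same rename is either invisible or looks like a different file, so checks made in the two directions can disagree.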

Fixing this in code will require a bump to the database schema, which in turn will require a rescan after an update. I try to avoid this, as it's inconvenient in some situations. But you could try to modify the schema in your own library.db using something like SQLite Studio (https://sqlitestudio.pl): add the NOCASE collation to the url field in scanned_files. This solved the issue in my test situation.
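For those who prefer a script over a GUI tool, the manual change could be sketched as follows. SQLite has no statement to change the collation of an existing column, so the usual approach is to rebuild the table. Everything here is an assumption for illustration: the column list (`url`, `timestamp`, `filesize`) is a guess and should be checked against the real `scanned_files` definition, the function name is hypothetical, and any indexes on the table would have to be recreated as well. The demo runs against an in-memory database, not a real library.db.

```python
import sqlite3


def make_url_nocase(conn: sqlite3.Connection) -> None:
    """Rebuild scanned_files so that url compares case insensitively.

    ASSUMPTION: the table has exactly the columns url, timestamp, filesize.
    """
    conn.executescript("""
        BEGIN;
        CREATE TABLE scanned_files_new (
            url       TEXT NOT NULL COLLATE NOCASE,
            timestamp INTEGER,
            filesize  INTEGER
        );
        INSERT INTO scanned_files_new
            SELECT url, timestamp, filesize FROM scanned_files;
        DROP TABLE scanned_files;
        ALTER TABLE scanned_files_new RENAME TO scanned_files;
        COMMIT;
    """)


# Demo on a throwaway in-memory database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scanned_files (url TEXT NOT NULL, timestamp INTEGER, filesize INTEGER)"
)
conn.execute(
    "INSERT INTO scanned_files VALUES (?, ?, ?)",
    ("file:///C:/music/ABC.mp3", 1700000000, 4096),
)
conn.commit()

make_url_nocase(conn)

hits = conn.execute(
    "SELECT COUNT(*) FROM scanned_files WHERE url = ?",
    ("file:///C:/music/abc.mp3",),
).fetchone()[0]
print(hits)  # 1
```

Working on a copy of library.db while the server is stopped would be the cautious way to try something like this.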

If you don't mind running a full rescan, you can update to the next 8.3 nightly, delete library.db* from LMS' cache and start over. This should fix the database schema.

frank1969b commented 2 years ago

@mherger I just had a look into the tracks table. Both tracks have completely identical values, with 5 exceptions: the id and url (as expected), then urlmd5, added_time and updated_time.

Timestamp and file size are identical.

Next I will dig through the log file before I try to modify my db (I'm btw using DB Browser for SQLite Version ) - I'm not sure whether I have to modify it after every nightly...?! Ah, just saw your note about the full rescan. Since I do a complete backup every day, I can do that...

frank1969b commented 2 years ago

Ah, just saw the r1643665089 already, will install it and do a full rescan tomorrow - and will report. THANKS!

frank1969b commented 2 years ago

@mherger, so, I took 1 day for re-scanning my "monster" and another day for some testing. As far as I can see, all 3 issues are gone:

  1. The "Pauline" issue
  2. The case(in)sensitive "ABC => abc" double issue
  3. The "You&me => You\me" issue (now the old "You&me"-artist is also deleted if "You" and "me" existed before).

So many MANY thanks for caring!

mherger commented 2 years ago

One more long standing issue down 😁.