Open Clementine-Issue-Importer opened 10 years ago
From goetzchrist on January 04, 2011 08:55:37
I don't know how Clementine handles the moving of files, but I remember reading that Amarok did an md5sum of the files when adding them to the database. So when a new file was added, it checked whether the md5sum of the new file was already in the database. This way the ratings were not lost when moving files, although they were still lost when moving the file and editing the tags at once.
From keirangtp on January 05, 2011 05:14:50
As far as I know Clementine bases its matching on filenames, so moving media breaks stuff.
A checksum sounds very reasonable, but I'm wondering how fast calculating those really is and how it would impact Clementine's efficiency.
From goetzchrist on January 05, 2011 13:13:36
I have done some tests. This is with an Intel 2160 @ 1.8 GHz:
type | # file | size    | Kbps | Play time | time md5sum
-----+--------+---------+------+-----------+-------------
ac3  | 1      | 18 MB   | 448  | 5:35      | real 0.093s
mp3  | 1      | 9 MB    | 320  | 3:05      | real 0.038s
mp3  | 17     | 175 MB  | 320  | 1:16:21   | real 0.746s
mp3  | 238    | 2100 MB | 320  | 17:45:33  | real 40.299s
ogg  | 1      | 7 MB    | 256  | 3:53      | real 0.034s
ogg  | 9      | 83 MB   | 256  | 50:20     | real 0.457s
flac | 1      | 45 MB   | 620  | 10:11     | real 0.221s
flac | 8      | 250 MB  | 600  | 57:44     | real 1.383s
flac | 42     | 1500 MB | 600  | 6:06:33   | real 25.978s
rar  | 1      | 700 MB  | -    | -         | real 53.141s
XviD | 1      | 700 MB  | 700  | 2:18:00   | real 15.747s
An option to speed this up could be to do a check-sum of the first 100 kB or so.
The ability to rename or move the audio file without losing the statistics is, in my opinion, important. There are many reasons to rename/move files, like when tagging (and changing the file name) with external software or organizing folders.
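As a rough illustration of the "check-sum of the first 100 kB" idea above, here is a minimal Python sketch. The function name `partial_md5` and its `limit` parameter are made up for this example and are not part of Clementine. It hashes the raw leading bytes of the file, so any tags stored at the start of the file are part of the hash.

```python
import hashlib

def partial_md5(path, limit=100 * 1024):
    """Return the MD5 hex digest of at most `limit` leading bytes of a file.

    Hypothetical helper: only the first `limit` bytes are read, so the cost
    no longer grows with the file size.
    """
    with open(path, "rb") as f:
        return hashlib.md5(f.read(limit)).hexdigest()
```

Two files that share their first 100 kB get the same digest with this approach, so it trades accuracy for speed.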
From john.maguire on January 06, 2011 03:27:14
Labels: Component-MusicLibrary
Blockedon: 1175
From keirangtp on January 06, 2011 03:40:48
Embedded ratings would be a possible solution. The question is whether we can embed everything using the FMPS spec: rating, playcount, skipcount, score?
From goetzchrist on January 06, 2011 12:23:48
I remembered another problem that (I suppose) could be resolved if the song files had an ID, like the md5sum.
When moving or renaming files, Clementine forgets the manually associated cover art. It has to be set manually again. But when the files are restored to where they were before, the cover art is automatically back.
Another problem is that when tracks from the library are in a playlist and they are moved or renamed, the tracks disappear from the playlist, leaving a white empty space (screenshot attached).
Attachment: clementine removed item from path.png
From keirangtp on January 07, 2011 13:44:43
There is another bug about embedding covers, which would help with this issue just like embedded ratings would.
From goetzchrist on January 07, 2011 14:59:39
But embedding covers is not always useful, and is not the solution for this problem. Many people have the cover image in the album folder, and won't take the time to embed it into the music files. Also, for big covers, like 1 MB or so, it is a bad idea to embed the cover in each ~6 MB file of an album. It is better to have a single cover file in the directory.
Instead of trying to embed everything into the music file to solve this problem, it is better to identify the audio files. Using the MD5 hash function to identify the files in the library is what many programs do. Anyway, it has to be done only once per complete scan.
From marcansoft on February 03, 2011 03:20:58
I agree, I think hashing is the best solution. Not necessarily the entire file, just a chunk, but you also need to keep in mind that if the hash covers tags it will change when tags change.
I think it's worth looking at what Amarok 1.4 used to do here, I remember it working pretty well in this regard.
Currently I have to manually edit stuff in the sqlite database when I move files, which is a chore.
From Yizouse on March 19, 2011 00:25:37
I'm quite impressed with this project, but this issue is preventing me from full-time adoption; I simply rely on my files being "mobile". So I did some probing.
Amarok 1.4's collection.db has a "uniqueid" table containing URL/UID mappings. The statistics table uses the UID, allowing for the desired behavior.
This UID is an md5 wrought from:
I found this in A1.4's src/metabundle.cpp, readUniqueId(). The more involved logic of handling unique ids w/r/t file moving & copying is in src/collectiondb.cpp.
Now to look at the clementine source and see how easily a similar scheme would work with the existing code...
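The URL/UID indirection described above can be sketched with Python's `sqlite3` module. Only the `uniqueid` and `statistics` table names come from the description of Amarok 1.4; the column names and the helper functions are assumptions made for this example, and how the UID itself is computed is left out.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a URL -> UID mapping and UID-keyed stats."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE uniqueid (url TEXT PRIMARY KEY, uid TEXT NOT NULL);
        CREATE TABLE statistics (uid TEXT PRIMARY KEY, rating INTEGER, playcount INTEGER);
    """)
    return db

def add_track(db, url, uid, rating=0, playcount=0):
    """Register a file; existing statistics for a known UID are left untouched."""
    db.execute("INSERT OR REPLACE INTO uniqueid (url, uid) VALUES (?, ?)", (url, uid))
    db.execute(
        "INSERT OR IGNORE INTO statistics (uid, rating, playcount) VALUES (?, ?, ?)",
        (uid, rating, playcount),
    )

def move_track(db, old_url, new_url):
    """A move only rewrites the URL; the statistics row is never touched."""
    db.execute("UPDATE uniqueid SET url = ? WHERE url = ?", (new_url, old_url))

def stats_for(db, url):
    """Return (rating, playcount) for a URL, or None if the URL is unknown."""
    return db.execute(
        "SELECT s.rating, s.playcount FROM uniqueid u "
        "JOIN statistics s ON s.uid = u.uid WHERE u.url = ?",
        (url,),
    ).fetchone()
```

Because the statistics are keyed by UID rather than by path, a rescan that finds a known UID at a new path can call something like `move_track` instead of creating a fresh, unrated entry.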
From davidsansome on June 13, 2011 16:24:49
Issue 1988 has been merged into this issue.
From tekdemo on August 27, 2011 14:55:49
Rather than embed things in files, or change the database, a logical behavior would be to add another step into the "organize files" procedure. Currently, it just moves the files, but it should do the following
This seems to be more on par with what Amarok used to do, as it never broke a playlist when organizing files. I frequently had Amarok break metadata if I moved the files manually, but never had an issue by moving things in the application itself.
From dragonfear on September 10, 2011 02:18:43
Amarok's file tracking is shit.
I propose you do this: http://code.google.com/p/clementine-player/issues/detail?id=2225 It's more robust and it incorporates forward-compatible support.
From dragonfear on September 10, 2011 02:27:33
Also, any solution proposed needs to:
My proposal takes those factors into account
From davidsansome on September 10, 2011 08:53:13
Issue 2225 has been merged into this issue.
From manuel@twilio.com on September 14, 2011 15:31:21
Spec I wrote for a decent file tracking mechanism: https://docs.google.com/document/d/16n8YdBEYvvLNvbu2JMk43ZujpDLUZG8KvWn9dvUaekA/edit?hl=es
From hyperquantum on September 21, 2011 13:29:14
File tracking would be a really great feature.
A similar problem is that of duplicate tracks. What if you have the same song more than once in your library? You wouldn't want each instance of a song to have a separate play count and rating. Instead, the play count and other statistics should be shared for all instances. (In practice, you can keep separate play counts for each file and display the sum of those individual counts as play count in the playlist)
Several kinds of duplicate tracks are possible:
1) equal files: two files are completely equal. The user accidentally copied a song to another location in the library. Or: the library covers both hard disk and removable storage, and the user has inserted a removable medium that contains one or more files that were originally copied from the hard disk.
2) equal files but with different tags. The files contain the exact same audio data, but some tags differ. Most likely, this started out as scenario 1, but then the user fixed an error in the tags of one file while being unaware there were multiple instances of the file.
3) different files that represent the same track. This happens when you rip your CDs and there are songs that are present on more than one CD.
(1) and (2) can be solved by using some kind of hash on the audio data of the file, while ignoring data from tags. (3) could be done manually by looking for files with similar names and/or tags and asking the user if those files represent the same song or not. Or some kind of acoustic fingerprinting can be implemented to see if two audio files 'sound' the same.
I wrote about checking for duplicate tracks here because IMHO it is another instance of file tracking. If you do duplicate track checking, you can detect when a file is renamed because the file with the new name has the same audio signature as the file that has just 'disappeared'.
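To illustrate how cases (1) and (2) could share one identifier, here is a minimal Python sketch that hashes an MP3 while skipping a leading ID3v2 tag and a trailing ID3v1 tag. The function name `audio_md5` is made up for this example. It handles only those two tag types and only MP3-style layouts; APE tags, other containers and format-specific metadata would need their own handling.

```python
import hashlib

def audio_md5(data: bytes) -> str:
    """MD5 of the bytes between a leading ID3v2 tag and a trailing ID3v1 tag."""
    start, end = 0, len(data)
    # ID3v2: 10-byte header "ID3", version (2 bytes), flags (1 byte),
    # then a 4-byte "syncsafe" size with 7 significant bits per byte.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = (
            ((data[6] & 0x7F) << 21)
            | ((data[7] & 0x7F) << 14)
            | ((data[8] & 0x7F) << 7)
            | (data[9] & 0x7F)
        )
        start = 10 + size
        if data[5] & 0x10:  # footer-present flag adds another 10 bytes
            start += 10
    # ID3v1: fixed 128-byte block at the end of the file, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return hashlib.md5(data[start:end]).hexdigest()
```

Editing the tags of one copy leaves this digest unchanged, so two files with the same audio data map to the same identifier.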
From dragonfear on September 21, 2011 13:36:28
Yup, if you noted my proposal, it has that concept already incorporated by supporting multiple levels of hashing.
From davidsansome on September 29, 2011 14:02:30
Issue 2266 has been merged into this issue.
From hyperquantum on December 01, 2011 01:56:10
Writing the playcount into the file tags is NOT a good idea if you ask me. (I say this because I saw issue 2266 merged into this one)
What if the file is shared between multiple users on the same computer, or maybe shared over the local network? Would all users share the same count? So when Alice listens to the file, Bob sees his play count increased?
You might say "well we'll keep a separate play count for every user then". But what if someone has different login names for different computers in the network? It would cause his playcounts to split between different user names.
File tracking is the way to go.
From chris.scotland on August 28, 2012 12:30:51
I like the idea of making use of unique musicbrainz hash tags where present.
Any hash that is calculated should be based on the audio content of the file alone and not on the tags or embedded cover image, as these may change through the life of the file (corrections, additional information etc).
From grjordan on October 30, 2012 21:11:39
Would the music brainz tags be unique though? Isn't the musicBrainz ID just based on it being a specific song by a specific artist? There are a few issues I see with this:
I think a simple hash of the portion of the file that doesn't have the tag information (don't forget mp3 gain info) would achieve what we want. And to speed it up, make it hash only part of that plus the file size (so that audio clips won't be treated as copies).
From schizosfera on October 31, 2012 01:27:20
as long as we talk about the same audio file (which was moved), a (relatively simple) hash of the audio content would suffice. the player just has to recognize that it's the same audio data that it has "seen" before.
replacing an audio file with another of different quality does not actually fall in the "moving a file" category. in such a case the player would need to use some audio fingerprinting instead of simple hashing.
From tommy.carstensen on February 01, 2013 11:55:21
I just lost all my play counts and ratings :( So I would like to support the ideas listed above.
From cavez.m on February 04, 2013 14:30:29
Just started using Clementine and I lost a few tracks copying from the playlist to an external drive. I didn't want to move the file, I just wanted to copy it, and now it's lost. Any idea where I could find it? I'm using a Mac.
PS I really like the interface but I don't want to lose my tracks, it would be good if when I drag the playlist file onto an external drive they are just copied instead of moved all together. Also the cmd-c shortcut to copy stuff doesn't seem to work.
From alphadeltapapa on February 20, 2013 17:15:27
I think that's not a good idea. Two recordings of the same piece might be similar yet different, and the user might want to assign different ratings. I think one should be careful not to make software too smart for its own good.
From schizosfera on February 22, 2013 01:37:58
@26: losing data is always a bad idea imho. in the situation of "similar yet different" recordings you would probably only have problems if you move both audio files simultaneously. imho having two zero ratings is far worse than having the same X rating twice.
From zzanzare on February 22, 2013 02:30:00
As suggested here at the very start of this thread: "do a check-sum of the first 100 kB or so"
This will solve the moving problem and also work for the songs that are "same yet different" even if you move them simultaneously.
I must say that I've been waiting for this fix since I started using Clementine. With this issue I cannot use the scores/ratings at all, because they are unreliable. It would be really nice if this ticket got more attention from the developers.
But anyway, Clementine rocks and I recommend it wherever I go..
From tristan.miller@nothingisreal.com on February 28, 2013 03:48:54
No it won't. If you're going to do a hash of the audio data, you need to use all of it, not just "the first 100 kB or so". Lots of audio files have identical data at the beginning (for example, the album, single, and radio edit versions of a song).
From arnaud.bienner on February 28, 2013 04:33:07
FYI I started to work on issue 1175, to allow the user to save ratings/statistics directly into the files' tags. This will provide a way to not lose ratings/statistics when moving files. However, the downside is that once activated, it will modify your files every time something changes, which would not be the case if we used some kind of fingerprint, as suggested above. But I'm not sure it's worth implementing this issue once issue 1175 is fully implemented.
From hyperquantum on February 28, 2013 04:46:32
@arnaud:
Saving such data in the files directly seems like the wrong approach to me. You cannot assume that a file belongs to one unique user and that the file will be writable.
And hashing would have the additional advantage that it can detect and correctly handle duplicate songs.
From arnaud.bienner on February 28, 2013 04:54:02
On the other hand, saving this data into the file enables you to keep your data even if you use another player (as long as it can read these tags) or another instance of Clementine (on another computer for example).
So issue 1175 is slightly different indeed. We can keep this issue open until it is implemented, even though issue 1175 will provide a reasonable workaround IMO. But I don't think I will work on this soon, and I'm not sure someone else in the team has planned to work on this. But if you're interested, patches are more than welcome :)
From goetzchrist on February 28, 2013 11:31:06
Saving ratings/statistics into the file solves some issues, but I see a particular problem right now: When adding files (from the library or not) to the playlist, and then renaming or moving the file (changing file path), the tracks are no longer accessible from the playlist. They have to be searched and added again to the playlist.
From arnaud.bienner on February 28, 2013 11:39:09
Indeed, but that's normal and is a different issue; and the library will be updated anyway.
However, with some kind of fingerprint of the song, we might be able to detect that a song has been moved instead of considering it as a new one.
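One way such move detection could work during a library rescan is sketched below in Python. The function name `detect_moves` and the dictionary layout are assumptions made for this example; the fingerprints can be any per-file identifier. A fingerprint is only treated as a move when it appears exactly once among the vanished files and exactly once among the new ones, so ambiguous duplicates are left alone.

```python
from collections import defaultdict

def detect_moves(missing, added):
    """Pair vanished files with newly found files that share a fingerprint.

    missing: {old_path: fingerprint} for files that disappeared since the last scan
    added:   {new_path: fingerprint} for files seen for the first time
    Returns {old_path: new_path} for unambiguous matches only.
    """
    old_by_fp = defaultdict(list)
    new_by_fp = defaultdict(list)
    for path, fp in missing.items():
        old_by_fp[fp].append(path)
    for path, fp in added.items():
        new_by_fp[fp].append(path)

    moves = {}
    for fp, old_paths in old_by_fp.items():
        new_paths = new_by_fp.get(fp, [])
        if len(old_paths) == 1 and len(new_paths) == 1:
            moves[old_paths[0]] = new_paths[0]
    return moves
```

The resulting mapping could then be used to update library rows and playlist entries in place, rather than dropping the old entry and adding an unrated new one.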
From wegwerf@abwesend.de on August 26, 2013 03:31:22
I just switched from Amarok to Clementine and automatic file updating in playlists is one of the few things that work like a charm in Amarok 2: you rename/move a file that is currently in your playlist, the change is noticed and after a few seconds the file path is updated. Score, rating and play count are preserved. This also works for saved playlists.
I rename/move files a lot and it's very important for me to have this feature. The database is used, as it only works if a file is moved inside the music library. And I assume a hash of the music data is used, because changing tags doesn't break the identification mechanism. If somebody wants, I can try to find out the details.
And I finally agree that writing anything to files for that is unnecessary and should not be done!
From arnaud.bienner on August 26, 2013 04:55:58
I would be curious to know how it works in Amarok 2. If you have some time to look at this, you can post your results here: it would be helpful to have some ideas about how to implement this feature if one day someone wants to start working on it.
From hector@marcansoft.com on August 26, 2013 13:27:03
It's been a while, but I seem to recall that the way it worked in Amarok 1 was by storing a hash of a chunk of the beginning + the end of the audio data. That's unique in almost all cases and fast to compute.
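A hash over a chunk from the beginning plus a chunk from the end can be sketched in Python as follows. This is not Amarok's actual code; the function name `head_tail_md5` and the 64 kB chunk size are assumptions made for this example. For simplicity it reads the raw file rather than only the audio data, so tag changes would still alter the result unless the tags are skipped first.

```python
import hashlib
import os

def head_tail_md5(path, chunk=64 * 1024):
    """MD5 over the first and last `chunk` bytes of a file.

    Files shorter than two chunks are hashed in full, without hashing any
    byte twice.
    """
    size = os.path.getsize(path)
    h = hashlib.md5()
    with open(path, "rb") as f:
        h.update(f.read(chunk))
        if size > chunk:
            # Start of the tail, clamped so it never overlaps the head.
            f.seek(max(chunk, size - chunk))
            h.update(f.read())
    return h.hexdigest()
```

Including the end of the file helps separate files that share an identical beginning, which a prefix-only hash cannot do.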
From wegwerf@abwesend.de on September 12, 2013 07:57:19
Sorry for the delay, I was on vacation. Finding information about Amarok's file tracking is very easy, it's all written up here: http://community.kde.org/Amarok/Development/AFT Short summary: an md5sum created from the file size, the tags and the first bytes of the file is saved in the database (pretty simple). If the file is modified and moved simultaneously outside Amarok, tracking will fail. Embedding the ID into the file is also supported, but a helper program must be run manually for that.
All in all it's not as good as I thought. Using [musicbrainz + file type + bitrate] and saving the resulting hash to the database is a better approach in my opinion.
I agree with this.
I keep a "never played" playlist to play the new music, and I just broke that playlist by recategorizing some discographies I had already listened to.
File tracking is a very important thing to have in a library-oriented music player.
I'd like to support this. I'd really like to reorganise my music folders, mass-rename files and strip some ID3 tags. But I cannot afford to lose ratings and playcounts.
My ideal solution, I think, would be based on audio hashing + song length, resorting to/from file path as fallback. We can accept that playlists / library is broken while changed files are being rehashed (this already happens if files are moved).
How about: 1. warning the user that when he or she organizes files, attributes like rating etc. will be lost, and 2. letting the user specify the %rating keyword (and other non-taggable attributes) in the organize files dialog (it's not available there at the moment). This way, the rating will at least be preserved in the new file name or folder name (i.e. %genre/%artist--%title--Rating:%rating.%extension), and the user has the alternative to filter the playlist with "Rating:5.0" and reset the rating or other attributes in bulk.
I meant my last post as a measure until they solve this issue in some other way.
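The stopgap proposed above boils down to expanding one more keyword in the organize pattern. A small Python sketch of that expansion is shown below; the `%keyword` syntax and the example pattern are taken from the comment, while the function name `expand_pattern` and the tag dictionary are made up for this example and do not reflect Clementine's real organize code. Unknown keywords are left in place.

```python
import re

def expand_pattern(pattern, tags):
    """Replace %keyword placeholders with values from `tags`.

    Keywords missing from `tags` are kept verbatim so the gap stays visible.
    """
    def repl(match):
        key = match.group(1)
        return str(tags[key]) if key in tags else match.group(0)
    return re.sub(r"%([a-z]+)", repl, pattern)

example = expand_pattern(
    "%genre/%artist--%title--Rating:%rating.%extension",
    {"genre": "Rock", "artist": "Artist", "title": "Title",
     "rating": "5.0", "extension": "mp3"},
)
```

With the values above, `example` is `Rock/Artist--Title--Rating:5.0.mp3`.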
I can accept that some files I change are losing ratings. But some files I've not changed are losing ratings and that's just frustrating. Clementine in its current state just isn't reliable.
If somebody is still interested: #5135
Wouldn't using checksums of the accompanying moodbar files, which basically are fingerprints, be perfect for that? They could even replace file paths in playlists. Moodbar really is the best thing that ever happened to music players.
This bug means that you can use either the library management functionality or dynamic playlists based on ratings (and last played time), not both. I think both are major features of this player. For me this is a blocker for switching over from Amarok.
Can you please merge #5135?
From Gavin77 on January 04, 2011 17:26:06
What steps will reproduce the problem? 1. I have 2 main folders both included in the library, first is music I need to backup, 2nd is my main archive.
Original issue: http://code.google.com/p/clementine-player/issues/detail?id=1241