clementine-player / Clementine

:tangerine: Clementine Music Player
https://www.clementine-player.org/
GNU General Public License v3.0

Duplicate track detection not precise enough #4769

Open Geocacher opened 9 years ago

Geocacher commented 9 years ago

It appears that duplicate tracks are detected by looking at either the track name or the metadata. This means that Clementine (Version 1.2.3-1039-g705cdf1) may report a duplicate when none exists. On Simon & Garfunkel's album Bookends, two tracks have the same name. However, they are different sizes, and their "sonic ID" would be different. I would suggest implementing this type of comparison.

ArnaudBienner commented 9 years ago

I don't think this is a high priority item. It wouldn't be easy to implement, it wouldn't be as fast as the check we currently use (which is indeed based on metadata), and only very few use cases would benefit from it.

Ferroin commented 9 years ago

On 2015-03-09 08:46, ArnaudBienner wrote:

I don't think this is a high priority item. It wouldn't be easy to implement, it wouldn't be as fast as the check we currently use (which is indeed based on metadata), and only very few use cases would benefit from it.

At the very least we should be checking more than just the title. If we checked title, album, and optionally artist, then we could cover better than 99% of the use cases with almost no extra overhead.

Geocacher commented 9 years ago

Thank you for the reply. I have a rather large music collection and come across this issue fairly regularly. I agree that it is not a high priority, but it would be a useful option at some point.

Sent from my iPhone; Gary

ArnaudBienner commented 9 years ago

@Ferroin are you sure about what you're saying? IIRC we check for all metadata (title, artist, album name).

Ferroin commented 9 years ago

I haven't looked at the code itself, but based on my experience either we don't check all of the fields, or it misbehaves when some of them aren't filled out. While I haven't tested it recently, I do know that in the past I've had issues with it detecting tracks as duplicates when the title and artist are the same and the album name differs only by a few characters.

ArnaudBienner commented 9 years ago

Actually, looking at the code:
https://github.com/clementine-player/Clementine/blob/1e752ce99cd0d0045ffe39203f904fb2e409aa18/src/playlist/playlist.cpp#L2088
https://github.com/clementine-player/Clementine/blob/c8b8556cfca2cd10c2ae1d2008f2a75e2abe6e0e/src/core/song.cpp#L1081

We currently check title and artist only. But I'm not sure checking other things would be a good idea: while it would be better from your point of view, in some use cases it would be worse, e.g. a track that appears on both an album and a compilation would not be removed.
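To make the trade-off above concrete, here is a minimal sketch of a metadata-based duplicate check. It is not Clementine's actual code (which is C++); the `Track` record, the `find_duplicates` helper, the exact-match comparison, and the sample durations are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical, simplified record; Clementine's real Song class has many more fields.
Track = namedtuple("Track", ["title", "artist", "album", "length_s"])


def find_duplicates(tracks, fields=("title", "artist")):
    """Return every track whose key fields exactly match an earlier track."""
    seen = set()
    duplicates = []
    for track in tracks:
        key = tuple(getattr(track, name) for name in fields)
        if key in seen:
            duplicates.append(track)
        else:
            seen.add(key)
    return duplicates


# Illustrative data only; the durations are placeholders.
playlist = [
    Track("Bookends Theme", "Simon & Garfunkel", "Bookends", 32),
    Track("Bookends Theme", "Simon & Garfunkel", "Bookends", 76),
    Track("Mrs. Robinson", "Simon & Garfunkel", "Bookends", 242),
    Track("Mrs. Robinson", "Simon & Garfunkel", "Greatest Hits", 242),
]

by_title_artist = find_duplicates(playlist)
by_title_artist_album = find_duplicates(playlist, ("title", "artist", "album"))
```

With title and artist only, both the second "Bookends Theme" and the compilation copy of "Mrs. Robinson" are reported. Adding the album keeps the compilation copy, which is the regression described above, and still flags the second "Bookends Theme" because both variations sit on the same album.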

Geocacher commented 9 years ago

It’s not as uncommon as one might think to have the same track name on a given album. Some artists will do variations on a theme, but name the tracks the same. I can give several examples.

Regards; Gary

Geocacher commented 9 years ago

Just a bit of clarification. I am not suggesting that it be the default method of detecting duplicates, since the current method often works. However, when one needs to figure out whether a track is actually a duplicate and not a variation within the same album, this would be very handy.

ArnaudBienner commented 9 years ago

I'm not sure what we can do here: if we make the current method more restrictive (more fields checked when looking for duplicates), some people will be unhappy because some tracks will not be removed. If we check only a few (as we currently do), too many tracks are removed.

I don't think there is a way to make everyone happy.

Ferroin commented 9 years ago

On 2015-03-09 14:13, ArnaudBienner wrote:

I'm not sure what we can do here: if we make the current method more restrictive (more fields checked when looking for duplicates), some people will be unhappy because some tracks will not be removed. If we check only a few (as we currently do), too many tracks are removed.

I don't think there is a way to make everyone happy.

Possibly add an option to control which fields get matched against?
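One way such an option could look, sketched with Python's standard `configparser`. The `[Playlist]` section, the `duplicate_match_fields` key, and the list of allowed field names are hypothetical; Clementine stores its settings differently, and no such key exists in it.

```python
import configparser

# Hypothetical field names a user would be allowed to match on.
ALLOWED_FIELDS = ("title", "artist", "album", "albumartist", "track", "disc")
# Mirrors the behaviour described in this thread: title and artist only.
DEFAULT_FIELDS = ("title", "artist")


def load_match_fields(ini_text):
    """Read the comma-separated field list, falling back to the default on bad input."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    raw = parser.get("Playlist", "duplicate_match_fields", fallback="")
    fields = tuple(name.strip().lower() for name in raw.split(",") if name.strip())
    unknown = [name for name in fields if name not in ALLOWED_FIELDS]
    if not fields or unknown:
        return DEFAULT_FIELDS
    return fields


strict = load_match_fields("[Playlist]\nduplicate_match_fields = title, artist, album\n")
```

Falling back to the default on a missing or misspelled value keeps the current behaviour for anyone who never touches the setting.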

ArnaudBienner commented 9 years ago

I was sure someone was going to propose that ;) IMHO this really makes things overcomplicated/confusing for the average user, but maybe I'm wrong.

Geocacher commented 9 years ago

It seems that this is turning out to be a contentious issue. I certainly didn't want to create one.

I think that adding the ability to select metadata for the comparison would not really address the issue I reported. I was hoping that a tool could be added to list any "album" tags with duplicate sonic IDs. This situation should never occur unless there is a legitimate track duplication or the metadata "album" name is wrong. I suggest it be a tool, in order to avoid the overhead associated with making it a default mode of operation.

Regards; Gary
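The report described above could be sketched roughly as follows. It assumes each track already has a precomputed fingerprint string; the dictionary keys, the file paths, and the placeholder fingerprint values are made up for illustration and are not real AcoustID output.

```python
from collections import defaultdict


def albums_with_repeated_fingerprints(tracks):
    """Group files by (album, fingerprint) and keep only groups with more than one file."""
    groups = defaultdict(list)
    for track in tracks:
        groups[(track["album"], track["fingerprint"])].append(track["path"])
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}


# Placeholder library; "fp-..." values stand in for real acoustic fingerprints.
library = [
    {"path": "bookends/01.flac", "album": "Bookends", "fingerprint": "fp-aaa"},
    {"path": "bookends/07.flac", "album": "Bookends", "fingerprint": "fp-bbb"},
    {"path": "bookends/07 (copy).flac", "album": "Bookends", "fingerprint": "fp-bbb"},
    {"path": "hits/03.flac", "album": "Greatest Hits", "fingerprint": "fp-bbb"},
]

report = albums_with_repeated_fingerprints(library)
```

Here only the copied file is reported: the two same-named variations have different fingerprints, and the same recording on a different album falls into a different group.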

Ferroin commented 9 years ago

On 2015-03-09 14:32, ArnaudBienner wrote:

I was sure someone was going to propose that ;) IMHO this really makes things overcomplicated/confusing for the average user, but maybe I'm wrong.

I was more thinking something that could be set in the config file, but not directly from the UI, and then appropriately documented of course.

ArnaudBienner commented 9 years ago

I was more thinking something that could be set in the config file, but not directly from the UI, and then appropriately documented of course.

I don't see the point of doing something like this. It means that only very few users will be able to access this feature.

ArnaudBienner commented 9 years ago

I was hoping that a tool could be added to list any "album" tags with duplicate sonic IDs.

I guess a sonic ID is a kind of fingerprint, like the AcoustID we currently use sometimes (for example to autocomplete a song's tags). That means we would have to compute those IDs. We currently don't: we only have metadata and filename, because we don't compute that kind of ID when Clementine indexes files. Computing them on the fly when removing duplicates from a playlist would be too slow. Computing them when indexing might be an option, but only if the result can be reused in other interesting features; otherwise it would slow down the indexing process a lot, for a feature few people will need IMO.
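The "compute once when indexing, look up later" idea can be sketched as a small cache. Note the loud assumption: `content_hash` below is a SHA-256 of the raw file bytes, used only as a cheap stand-in. It is not an acoustic fingerprint and would not match two different encodings of the same recording. The `FingerprintCache` class and its invalidation rule (modification time plus size) are likewise hypothetical.

```python
import hashlib
import os


def content_hash(path):
    """Stand-in for an acoustic fingerprint: SHA-256 of the raw file bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class FingerprintCache:
    """Compute a fingerprint once per file and reuse it until the file changes."""

    def __init__(self, compute=content_hash):
        self._compute = compute
        self._entries = {}  # path -> (mtime_ns, size, fingerprint)
        self.computations = 0  # how many times the expensive step actually ran

    def get(self, path):
        stat = os.stat(path)
        cached = self._entries.get(path)
        if cached and cached[:2] == (stat.st_mtime_ns, stat.st_size):
            return cached[2]
        self.computations += 1
        fingerprint = self._compute(path)
        self._entries[path] = (stat.st_mtime_ns, stat.st_size, fingerprint)
        return fingerprint
```

With a cache like this, the expensive step runs during indexing, and removing duplicates from a playlist only needs dictionary lookups. A real fingerprinter could be passed in through the `compute` parameter.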

Geocacher commented 9 years ago

Yes, it's a tool that would be most useful for initially getting one's library in shape and then for maintaining it periodically after that. (I meant to say Acoustic ID. It's something that may even make it possible to later validate tracks in a manner similar to MusicBrainz.) For that reason, I didn't suggest it be fully integrated, but rather something that could be run when needed, or at least turned on or off.

Thanks for even considering it. Clementine is already far more suited to my music library than iTunes.

Regards; Gary
