ochurlaud / macaw-movies

Movie collection manager, for movie lovers. Qt/C++/Sqlite |
GNU General Public License v3.0

TMDb lookup should always treat file name periods as spaces #118

Open grokit2 opened 9 years ago

grokit2 commented 9 years ago

I name my movie files with periods instead of spaces, so on disk the 2010 movie "Alice in Wonderland" is "Alice.in.Wonderland[2010].mp4". Macaw-movies does not automatically update the metadata from TMDb; instead it shows the 'Select movie' dialog with the correct version of the movie highlighted. Expected result: the metadata is updated automatically, without user interaction. Screenshot attached.

This did not happen for about 3/4 of my collection. It worked perfectly for 'A million ways to die in the West', 'A Scanner Darkly', but not for '300', '28 Days Later', 'Attack the Block' as some examples.

[Screenshot: macaw-movie1]

ochurlaud commented 9 years ago

@grokit2 : thx for your feedback.

The current implementation doesn't take the date into account. So it's only by chance that the right one is highlighted. If you had the Alice from 1915, the highlighted title would still be the one from 2010.

We don't have a reliable way to guess how your movie files are named (everyone does something different: in my case it would be %date- %title).

However, we could let the user provide a pattern and base our algorithm on it, without too much trouble.
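To make the pattern idea concrete, here is a rough sketch of how a user-supplied pattern could drive the parsing. It is not code from macaw-movies: `ParsedName` and `parseWithPattern` are hypothetical names, it uses `std::regex` rather than QRegExp, and it assumes `%date` means a four-digit year and `%title` means any non-empty text.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type, not part of macaw-movies.
struct ParsedName {
    std::string title;
    std::string date;
};

// Hypothetical helper: translate a user pattern such as "%date- %title" into a
// regular expression and apply it to a file name without its extension.
// Assumption: %date is a four-digit year, %title is any non-empty text.
bool parseWithPattern(const std::string& pattern, const std::string& stem, ParsedName& out)
{
    std::string expr;
    int group = 0, titleGroup = 0, dateGroup = 0;
    for (std::size_t i = 0; i < pattern.size();) {
        if (pattern.compare(i, 6, "%title") == 0) {
            expr += "(.+)";
            titleGroup = ++group;   // remember which capture group holds the title
            i += 6;
        } else if (pattern.compare(i, 5, "%date") == 0) {
            expr += "([0-9]{4})";
            dateGroup = ++group;    // remember which capture group holds the date
            i += 5;
        } else {
            // Any other character is literal; escape regex metacharacters.
            const char c = pattern[i++];
            if (std::string("\\^$.|?*+()[]{}").find(c) != std::string::npos) {
                expr += '\\';
            }
            expr += c;
        }
    }
    std::smatch m;
    if (!std::regex_match(stem, m, std::regex(expr))) {
        return false;               // the file name does not follow the pattern
    }
    out.title = titleGroup ? m[titleGroup].str() : std::string();
    out.date = dateGroup ? m[dateGroup].str() : std::string();
    return true;
}
```

With the pattern `%title[%date]`, the stem `Alice.in.Wonderland[2010]` would give the title `Alice.in.Wonderland` and the date `2010`; the title would still need its periods turned into spaces before the TMDb query.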

@sebtouze

ochurlaud commented 9 years ago

@grokit2 : some additional info...

Our algorithm for finding titles can be found here: https://github.com/macaw-movies/macaw-movies/blob/master/src/FetchMetadata/FetchMetadata.cpp#L105

As you can see, every punctuation character is converted to a space (periods included), and the numbers are wiped out (so the search isn't polluted by 720p and other irrelevant parts of the file name). That is of course a problem for 28 Days Later, which gets searched as Days Later (which is more common), and for 300, which gets searched as an empty string (which matches everything =D).
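The effect of those two steps can be reproduced with a small sketch. This is not the code in FetchMetadata.cpp: `naiveCleanTitle` is a hypothetical name, and `std::regex` stands in for QRegExp so the example has no Qt dependency.

```cpp
#include <regex>
#include <string>

// Hypothetical helper that only mimics the two steps described above.
std::string naiveCleanTitle(const std::string& fileStem)
{
    // Step 1: every punctuation character (periods included) becomes a space.
    std::string s = std::regex_replace(fileStem, std::regex("[[:punct:]]"), " ");
    // Step 2: every digit is removed, which strips "720" but also "28" and "300".
    s = std::regex_replace(s, std::regex("[0-9]"), "");
    // Collapse runs of whitespace, then trim a leading or trailing space.
    s = std::regex_replace(s, std::regex("\\s+"), " ");
    s = std::regex_replace(s, std::regex("^ | $"), "");
    return s;
}
```

Under these assumptions `naiveCleanTitle("Alice.in.Wonderland[2010]")` gives `Alice in Wonderland`, while `naiveCleanTitle("28.Days.Later")` gives `Days Later` and `naiveCleanTitle("300")` gives an empty string.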

grokit2 commented 9 years ago

Well, at least that explains the movies with numbers in the title. I'm also finding that movie names with an apostrophe aren't working. Something like 'Assassin's Tale' wouldn't ever return anything with the current code. This issue is pretty trivial when adding a movie or two, but I've got almost 700 to index and that takes a lot of time :( Maybe make a few special cases? Changing 's to s before the QRegExp magic would be a quick improvement. Checking for four digits between [], {} or () would help too, but do you have time to go down that path?
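A minimal sketch of those two special cases might look like the following. Both function names are hypothetical, the code uses `std::regex` instead of QRegExp, and it needs C++17 for `std::optional`.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: drop apostrophes so that "Assassin's" becomes
// "Assassins" before the punctuation-to-space step runs.
std::string dropApostrophes(const std::string& s)
{
    return std::regex_replace(s, std::regex("'"), "");
}

// Hypothetical helper: return the first run of exactly four digits that is
// directly enclosed in [], {} or (), if there is one. This sketch does not
// check that the opening and closing brackets are of the same kind.
std::optional<int> bracketedYear(const std::string& s)
{
    static const std::regex re(R"re([\[\{\(]([0-9]{4})[\]\}\)])re");
    std::smatch m;
    if (std::regex_search(s, m, re)) {
        return std::stoi(m[1].str());
    }
    return std::nullopt;
}
```

For example, `bracketedYear("Alice.in.Wonderland[2010].mp4")` returns 2010, which could then be passed along with the title to narrow the search, while `bracketedYear("300.mp4")` returns no value, so the digits could be kept as part of the title.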

ochurlaud commented 9 years ago

@Nanoseb said he would like to implement something smarter based on what I wrote above, and to check whether the numbers look like a year or not.
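One way such a check could look, purely as an illustration: `looksLikeYear` is a hypothetical name, and the default bounds of 1880 and 2100 are arbitrary assumptions rather than values from the project.

```cpp
#include <string>

// Hypothetical heuristic: a token "looks like a year" if it is exactly four
// digits and falls inside a plausible range. The bounds are assumptions.
bool looksLikeYear(const std::string& token, int minYear = 1880, int maxYear = 2100)
{
    if (token.size() != 4) {
        return false;
    }
    for (char c : token) {
        if (c < '0' || c > '9') {
            return false;
        }
    }
    const int value = std::stoi(token);
    return value >= minYear && value <= maxYear;
}
```

With this heuristic "2010" counts as a year, while "300" and "28" do not and could stay in the title. It cannot tell a year from a title that happens to be a year-like number, so it would only be one signal among others.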

If you see how to fix some of the problems directly in the QRegExp, feel free to open a pull request.

grokit2 commented 9 years ago

I've just started teaching myself C++/Qt and still know nothing about git, but this sounds like a good thing to get started with.

ochurlaud commented 9 years ago

Cool: if you need help, I'm often on freenode #macaw-movies or #qt

ochurlaud commented 9 years ago

@Nanoseb Maybe at the same time we could start thinking about #67