damienhaynes / moving-pictures

Moving Pictures is a movies plug-in for the MediaPortal media center application. The goal of the plug-in is to create a very focused and refined experience that requires minimal user interaction. The plug-in emphasizes usability and ease of use in managing a movie collection consisting of ripped DVDs, and movies reencoded in common video formats supported by MediaPortal.

Noise Filter Enhancements #713

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The noise filter is correctly handling file names including "Directors cut"
but not "Director's cut".

Original issue reported on code.google.com by hex...@gmail.com on 2 Jan 2010 at 1:05

GoogleCodeExporter commented 9 years ago
A temporary solution for your own setup would be to add "|Director\'s\sCut" to the end of your existing noise filter.
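
A minimal Python sketch of that workaround. It assumes a hypothetical two-term stand-in for the existing filter (the real default is much longer), and Moving Pictures itself uses .NET regular expressions rather than Python, so treat this as an illustration of the idea only. The helper name strip_noise is made up for the example.

```python
import re

# Hypothetical, heavily shortened stand-in for an existing noise filter.
existing_filter = r"Directors\sCut|dvdrip"

# Workaround from the comment: append one more alternative to the end.
patched_filter = existing_filter + r"|Director\'s\sCut"


def strip_noise(name: str, pattern: str) -> str:
    """Remove every noise match and collapse the leftover whitespace."""
    cleaned = re.sub(pattern, " ", name, flags=re.IGNORECASE)
    return " ".join(cleaned.split())
```

With the unpatched filter, "Blade Runner Director's Cut" passes through unchanged; with the patched one the apostrophe variant is stripped as well.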

Original comment by RoChess....@gmail.com on 3 Jan 2010 at 7:12

GoogleCodeExporter commented 9 years ago

Original comment by conrad.john on 4 Jan 2010 at 4:52

GoogleCodeExporter commented 9 years ago
Any chance of including Unrated as well?

Original comment by bjarteu...@gmail.com on 5 Jan 2010 at 5:09

GoogleCodeExporter commented 9 years ago
The following regular expression catches all variants of the Director's Cut:

(([\(\{\[]|\b)((576|720|1080)[pi]|[Dd]ir(ector[']?s )?[Cc]ut|dvd([r59]|rip|scr(eener)?)|(avc)?hd|wmv|ntsc|pal|mpeg|dsr|r[1-5]|bd[59]|dts|ac3|blu(-)?ray|[hp]dtv|stv|hddvd|xvid|divx|x264|dxva|(?-i)FEST[Ii]VAL|L[iI]M[iI]TED|[WF]S|PROPER|REPACK|RER[Ii]P|REAL|RETA[Ii]L|EXTENDED|REMASTERED|UNRATED|CHRONO|THEATR[Ii]CAL|DC|SE|UNCUT|[Ii]NTERNAL|[DS]UBBED)([\]\)\}]|\b)(-[^\s]+$)?)

Original comment by RoChess....@gmail.com on 27 Jan 2010 at 2:51

GoogleCodeExporter commented 9 years ago
bjarteus14, you can just change |UNRATED| into |UNRATED|Unrated| and it will work on that.

However, that cannot be added to the default filter, because it would also strip the word from a movie title; for example, "Unrated (2009).avi" would lose its title.

It is possible to filter safely by looking for "(Unrated)" instead. To do that, add |\(Unrated\) to the end of the existing noise filter (this is also explained in the FAQ).
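
The trade-off described above can be sketched in Python as follows. The three patterns are hypothetical one-tag reductions of the real filter, written only to show why the bare word is risky while the parenthesised form is not. The helper strip_noise is an assumption for the example, not part of the plug-in.

```python
import re

# Case-sensitive tag, as in the all-caps part of the default filter.
caps_only = re.compile(r"\bUNRATED\b")
# Naive extension: also matches the plain word, so it can eat a title.
with_word = re.compile(r"\bUNRATED\b|\bUnrated\b")
# Safer extension: only the parenthesised form counts as noise.
with_parens = re.compile(r"\bUNRATED\b|\(Unrated\)")


def strip_noise(name, pattern):
    """Remove every noise match and collapse the leftover whitespace."""
    return " ".join(pattern.sub(" ", name).split())
```

Applied to "Unrated (2009)", the naive pattern removes the title itself, while the parenthesised pattern leaves it alone and still cleans "American Pie (Unrated)".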

Original comment by RoChess....@gmail.com on 27 Jan 2010 at 3:15

GoogleCodeExporter commented 9 years ago
Simple change so promoting this to high priority for version 1.1.

Original comment by conrad.john on 9 Feb 2010 at 5:51

GoogleCodeExporter commented 9 years ago
The usage of 720p24 or 1080i30 is getting common as well, so I expanded the default noise filter to match an optional 2 digits following the existing 576/720/1080p (or i) expression.

(([\(\{\[]|\b)((576|720|1080)[pi](\d{2})?|[Dd]ir(ector[']?s )?[Cc]ut|dvd([r59]|rip|scr(eener)?)|(avc)?hd|wmv|ntsc|pal|mpeg|dsr|r[1-5]|bd[59]|dts|ac3|blu(-)?ray|[hp]dtv|stv|hddvd|xvid|divx|x264|dxva|(?-i)FEST[Ii]VAL|L[iI]M[iI]TED|[WF]S|PROPER|REPACK|RER[Ii]P|REAL|RETA[Ii]L|EXTENDED|REMASTERED|UNRATED|CHRONO|THEATR[Ii]CAL|DC|SE|UNCUT|[Ii]NTERNAL|[DS]UBBED)([\]\)\}]|\b)(-[^\s]+$)?)
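
To see what the added (\d{2})? changes, here is a small Python sketch that isolates only the resolution branch of the filter. The names old_res and new_res are made up for the example, and the surrounding bracket handling of the full expression is left out.

```python
import re

# Resolution branch before the change: digits after p/i break the
# trailing word boundary, so "720p24" is not recognised at all.
old_res = re.compile(r"\b(576|720|1080)[pi]\b", re.IGNORECASE)

# Resolution branch after the change: an optional two-digit frame rate.
new_res = re.compile(r"\b(576|720|1080)[pi](\d{2})?\b", re.IGNORECASE)
```

Under these assumptions, old_res finds nothing in "Movie.720p24.x264", whereas new_res matches "720p24" and still matches a plain "720p".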

Original comment by RoChess....@gmail.com on 6 Mar 2010 at 1:41

GoogleCodeExporter commented 9 years ago
OK, the old noise filter creates a massive array of RegExp captures, which is hard to debug in a tool such as "The Regulator" (and also wastes memory). I fixed this by using RegExp non-capturing group identifiers, (?:...). With the assistance of mitjaskuver on IRC, I also added new names, such as BDRIP and BDSCR, and a few others. A lot of care was taken to ensure that it only filters out the non-title stuff, so as not to break anything that worked before.

(?:(?:[\(\{\[]|\b)(?:(?:576|720|1080)[pi](?:\d{2})?|[Dd]ir(?:ector[']?s )?[Cc]ut|dvd(?:[r59]|rip|scr(?:eener)?)|bd(?:rip|scr(?:eener)?)|(?:avc)?hd|wmv|ntsc|pal|mpeg|dsr|r[1-5]|bd[59]|dts|ac3|blu(?:-)?ray|[hp]dtv|stv|hddvd|xvid|divx|x264|dxva|(?-i)FEST[Ii]VAL|L[iI]M[iI]TED|[WF]S|PROPER|REPACK|RER[Ii]P|REAL|RETA[Ii]L|EXTENDED|REMASTERED|UNRATED|CHRONO|THEATR[Ii]CAL|DC|SE|UNCUT|[Ii]NTERNAL|[DS]UBBED|SCREENER|TELE(?:CINE|SYNC)|L[Ii]NE|[\[\(\{\s\.]T[CS][\]\)\}\s\.])(?:[\]\)\}]|\b)(?:-[^\s]+$)?)
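
The effect of switching from (...) to (?:...) can be shown on a single branch of the filter. This Python sketch uses only the dvd branch; the variable names are made up for the example. Both patterns match the same text, but only the first one records captures.

```python
import re

# Old style: every parenthesis is a capturing group.
capturing = re.compile(r"(dvd(rip|scr(eener)?))")

# New style: the same structure with non-capturing groups.
non_capturing = re.compile(r"(?:dvd(?:rip|scr(?:eener)?))")

# Number of capture slots each pattern allocates per match.
capture_counts = (capturing.groups, non_capturing.groups)
```

Here the capturing version reports three groups per match and the non-capturing version reports none, which is the clutter the comment describes when stepping through matches in a debugger.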

Original comment by RoChess....@gmail.com on 10 Mar 2010 at 2:26

GoogleCodeExporter commented 9 years ago
Yup, great noise filter. We figured it out together, and RoChess added some tags that were probably causing some problems until now. All I can say is that I tested it on a wide range of files and it did an awesome job!

As a side note, we also found an error while importing multipart files with DVDS in the file name (as in DVDScr or DVDSCREENER). Perhaps a separate issue would be better for that.

Original comment by mitja.skuver on 10 Mar 2010 at 2:44

GoogleCodeExporter commented 9 years ago
Mitja Skuver pointed out there are also "DiRECTORS.CUT" variants, and I noticed I had added a case-sensitive option to a section of the RegExp that is case-insensitive.

(?:(?:[\(\{\[]|\b)(?:(?:576|720|1080)[pi](?:\d{2})?|dir(?:ector[']?s[\s\.])?cut|dvd(?:[r59]|rip|scr(?:eener)?)|bd(?:rip|scr(?:eener)?)|(?:avc)?hd|wmv|ntsc|pal|mpeg|dsr|r[1-5]|bd[59]|dts|ac3|blu(?:-)?ray|[hp]dtv|stv|hddvd|xvid|divx|x264|dxva|(?-i)FEST[Ii]VAL|L[iI]M[iI]TED|[WF]S|PROPER|REPACK|RER[Ii]P|REAL|RETA[Ii]L|EXTENDED|REMASTERED|UNRATED|CHRONO|THEATR[Ii]CAL|DC|SE|UNCUT|[Ii]NTERNAL|[DS]UBBED|SCREENER|TELE(?:CINE|SYNC)|L[Ii]NE|[\[\(\{\s\.]T[CS][\]\)\}\s\.])(?:[\]\)\}]|\b)(?:-[^\s]+$)?)
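
The split between a case-insensitive section and a case-sensitive one can be sketched in Python. Two assumptions apply: the filter is compiled with case-insensitive matching, and the inline (?-i) switches that off for the alternatives that follow it. Python does not accept an unscoped flag in the middle of a pattern, so the sketch uses the scoped form (?-i:...) instead. The pattern is a hypothetical three-tag reduction of the full expression.

```python
import re

pattern = re.compile(
    # Case-insensitive part: any casing of "directors.cut" style tags.
    r"\b(?:dir(?:ector[']?s[\s.])?cut"
    # Case-sensitive part: only release-style capitalisation counts.
    r"|(?-i:PROPER|L[iI]M[iI]TED))\b",
    re.IGNORECASE,
)


def find_noise(name):
    """Return the first noise tag found in the name, or None."""
    match = pattern.search(name)
    return match.group(0) if match else None
```

With this reduction, "Alien.DiRECTORS.CUT.avi" is caught by the case-insensitive half, while ordinary title words such as "Proper" or "Limited" are left alone because the second half only accepts the release-style spellings.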

Original comment by RoChess....@gmail.com on 10 Mar 2010 at 10:08

GoogleCodeExporter commented 9 years ago
Armandp had some cleanup suggestions, and also added a lot of new filtering for the all-CAPS section, which should cause no false matches.

(?:(?:[\(\{\[]|\b)(?:(?:576|720|1080)[pi](?:\d{2})?|dir(?:ector[']?s[\W])?cut|dvd[-]?(?:[r59]|rip|scr(?:eener)?)|(?:b[dr]|sat|dvb)[-]?(?:rip|scr(?:eener)?)|(?:avc)?hd|wmv|ntsc|pal|mpeg[4]?|dsr(?:ip)?|r[1-5]|bd[59]|dts|ac3|blu[-]?ray|[hp]dtv|stv|hd[-]?dvd|xvid|divx|x264|dxva|remux|(?-i)FEST[Ii]VAL|L[iI]M[iI]TED|[WF]S|PROPER|REPACK|RER[Ii][Pp]|REAL|RETA[Ii]L|EXTENDED|REMASTERED|UNRATED|CHRONO|THEATR[Ii]CAL|DC|SE|UNCUT|[Ii]NTERNAL|[DS]UBBED|SCREENER|DOCU(?:MENTARY)?|TELE(?:CINE|SYNC)|(?:RE)?[MF][Ii]XED|VERS[Ii]ON|ED[Ii]TION|HDTV|L[Ii]NE|OAR|AVC|T[CS])(?:[\]\)\}]|\b)(?:-[^\s]+$)?)

It works for me, but hopefully somebody will provide separate verification on a large sample, so that it can be included in the code.

Original comment by RoChess....@gmail.com on 12 Mar 2010 at 6:26

GoogleCodeExporter commented 9 years ago

Original comment by apond...@gmail.com on 12 Mar 2010 at 6:34

GoogleCodeExporter commented 9 years ago

Original comment by apond...@gmail.com on 17 Mar 2010 at 10:48

GoogleCodeExporter commented 9 years ago

Original comment by apond...@gmail.com on 17 Mar 2010 at 10:48

GoogleCodeExporter commented 9 years ago

Original comment by conrad.john on 31 Jan 2011 at 1:22

GoogleCodeExporter commented 9 years ago
Is this still good, RoChess?

Original comment by damien.haynes@gmail.com on 9 Jun 2014 at 5:27