damienhaynes / moving-pictures

Moving Pictures is a movies plug-in for the MediaPortal media center application. The goal of the plug-in is to create a very focused and refined experience that requires minimal user interaction. The plug-in emphasizes usability and ease of use in managing a movie collection consisting of ripped DVDs, and movies reencoded in common video formats supported by MediaPortal.

Scraper Request: Rotten Tomatoes (primarily for ratings) #570

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Import movie ratings and summaries from Rotten Tomatoes as the first preference, with IMDb as the second choice.

http://uk.rottentomatoes.com/

Reference:

as per thread started by Surferosa

http://forum.team-mediaportal.com/moving-pictures-284/feature-request-rottentomatoes-72674/

Original issue reported on code.google.com by CrazyInd...@gmail.com on 31 Oct 2009 at 7:52

GoogleCodeExporter commented 9 years ago
As mentioned in the forum thread, the real problem here is matching results from Rotten Tomatoes to results from another data provider. I think the best option would be to create a full scraper for Rotten Tomatoes and just use that. This, however, would mean less movie information overall. If the Rotten Tomatoes scraper was somehow able to determine the correct IMDb ID, then it would make things much, much easier.

So based on this I TENTATIVELY schedule this for 0.9. I can't guarantee we will create this script though. And if anyone gets impatient and wants to take a crack at this script, please post the results here. :P

Original comment by conrad.john on 14 Nov 2009 at 6:38

GoogleCodeExporter commented 9 years ago
Just found out that you can access a Rotten Tomatoes page with an IMDb ID. See
http://www.rottentomatoes.com/help_desk/webmaster.php (IMDb linking).

Basically, what you need to do is take the IMDb ID, remove the leading "tt", and insert the remaining digits into this URL:
http://www.rottentomatoes.com/alias?type=imdbid&s=insert_imdbid_here

examples:
Up: http://www.rottentomatoes.com/alias?type=imdbid&s=1049413
9: http://www.rottentomatoes.com/alias?type=imdbid&s=0472033
Star Trek: http://www.rottentomatoes.com/alias?type=imdbid&s=0796366
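
The URL construction described above can be sketched in a few lines of Python. This is only an illustration of the string handling, not the scraper script itself; the function name rt_alias_url is made up for this example, and the sketch does not check whether the alias URL still resolves.

```python
import re

# URL pattern quoted in the comment above; {} is replaced by the numeric part of the IMDb ID.
RT_ALIAS_URL = "http://www.rottentomatoes.com/alias?type=imdbid&s={}"


def rt_alias_url(imdb_id: str) -> str:
    """Build the Rotten Tomatoes alias URL for an IMDb ID such as 'tt1049413'.

    The 'tt' prefix is optional; leading zeros in the number are preserved.
    """
    match = re.fullmatch(r"(?:tt)?(\d+)", imdb_id.strip())
    if match is None:
        raise ValueError(f"not an IMDb title ID: {imdb_id!r}")
    return RT_ALIAS_URL.format(match.group(1))


if __name__ == "__main__":
    print(rt_alias_url("tt1049413"))
```

Keeping the ID as a string rather than converting it to an integer matters for IDs such as tt0472033, where the leading zero is part of the value.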

Original comment by bgmei...@gmail.com on 15 Nov 2009 at 7:34

GoogleCodeExporter commented 9 years ago
Excellent find bgmeiner, this will make integration much much easier!

Original comment by conrad.john on 15 Nov 2009 at 8:00

GoogleCodeExporter commented 9 years ago
Genius! I've never seen that before!

If possible, I was sort of hoping that this could be used as additional information in MP, rather than a full scraper replacement for IMDb. I had this idea that you could select critics that you like in a master list; it would then look for each one in turn until it found a review from that list, and use that in a 'review' field(s) within MP. So I'd stick in something like:

1. Empire Magazine
2. Total Film
3. Sunday Times (UK)

If it couldn't find them, maybe it picks a review at random. Skinners could then have access to a 'Review' option, etc. You get the idea.
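
A rough Python sketch of that selection rule, purely to make the idea concrete. It assumes the reviews have already been scraped into a dictionary keyed by critic name; the function name pick_review and that data shape are hypothetical and not part of Moving Pictures.

```python
import random
from typing import Optional


def pick_review(
    reviews: dict[str, str],
    preferred: list[str],
    rng: Optional[random.Random] = None,
) -> Optional[tuple[str, str]]:
    """Return (critic, review) for the first preferred critic that has a review.

    Falls back to a randomly chosen review when none of the preferred critics
    reviewed the film, and returns None when there are no reviews at all.
    """
    for critic in preferred:
        if critic in reviews:
            return critic, reviews[critic]
    if not reviews:
        return None
    rng = rng or random.Random()
    critic = rng.choice(sorted(reviews))  # sorted() gives choice() a stable sequence
    return critic, reviews[critic]


if __name__ == "__main__":
    # Placeholder review text, not real quotes.
    scraped = {"Total Film": "placeholder review A", "Some Critic": "placeholder review B"}
    print(pick_review(scraped, ["Empire Magazine", "Total Film", "Sunday Times (UK)"]))
```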

That and obviously the 2 scores RT provides (T-Meter & Average Score).

Thanks again guys..

Original comment by surferos...@gmail.com on 16 Nov 2009 at 8:07

GoogleCodeExporter commented 9 years ago

Original comment by conrad.john on 29 Nov 2009 at 8:12

GoogleCodeExporter commented 9 years ago
I modified the imdb.com scraper v1.3.3 to obtain the ratings from RottenTomatoes.com.

Since they have both an Average Rating and a TomatoMeter rating, I decided to make 2 separate versions.

Unfortunately I haven't been able to actually test the scrapers yet, as the wife is hogging the HTPC again, but I don't foresee any issues as the 'extra' code is pretty basic.

Original comment by RoChess....@gmail.com on 3 Jan 2010 at 10:23

GoogleCodeExporter commented 9 years ago
OK, I may be doing something incredibly stupid, but I'm getting the error 'The Script is malformed or not a Moving Pictures script'.

Using latest MP Beta (1.0.1.1001) with MediaPortal 1.0.4.23491.

Original comment by surferos...@gmail.com on 4 Jan 2010 at 6:42

GoogleCodeExporter commented 9 years ago
Ugh, since it was such basic code, I thought I could contribute without testing it. Should have known Murphy's law would get me: I overlooked the & into &amp; conversion in the URL.

All fixed now, and this time tested and verified.
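
For anyone hitting the same 'malformed script' error: a bare & is not allowed inside an XML attribute value and has to be written as &amp;. The Python sketch below demonstrates this with the standard library; the element name retrieve and the attribute name url are illustrative only and are not copied from the actual scraper script.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

url = "http://www.rottentomatoes.com/alias?type=imdbid&s=1049413"

# Unescaped: the parser reads "&s=" as a broken entity reference and rejects the document.
raw = f'<retrieve url="{url}"/>'
try:
    ET.fromstring(raw)
    raw_is_well_formed = True
except ET.ParseError:
    raw_is_well_formed = False

# Escaped: & becomes &amp; in the source text and is decoded back to & on parsing.
escaped = f'<retrieve url="{escape(url)}"/>'
node = ET.fromstring(escaped)

if __name__ == "__main__":
    print(raw_is_well_formed)
    print(escaped)
    print(node.get("url") == url)
```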

Original comment by RoChess....@gmail.com on 4 Jan 2010 at 7:17

GoogleCodeExporter commented 9 years ago
Update: the revised script is partially working. The following errors were found:

1. Intermittent RT Scrape Failure
I can only put this one down to some sort of time-out failure, but approximately 1 film in every 5 or 6 doesn't seem to have been updated by my batch scrape. As soon as I manually refresh the info from the internet for that film individually, it works. It's difficult to say how widespread this is, as I have to check each movie individually to make sure it has updated correctly (i.e. I have no way of knowing whether the scrape has worked or whether I need to retry manually). A couple of thoughts on this issue:

- Would it be possible to overwrite the existing Score field with a Null value before performing the RT Score lookup? That way I could simply look down the list of movies and count the Nulls. Failing that, could it overwrite a different field (tagline?) with 'RT Score found = x time', so that I can see that it worked and when (i.e. on which run)?

- If it is a time-out issue (and that's only a theory, mind), would it be possible to try requesting the page multiple times (i.e. if no page is returned, retry up to 3 times)?
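
The retry idea in the second point could look roughly like the Python sketch below. It is a generic illustration rather than anything the scraper engine is known to support; fetch_with_retry and its parameters are invented names, and the actual page download is passed in as a function so the retry logic stays independent of how the page is fetched.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[], Optional[str]],
    attempts: int = 3,
    delay_seconds: float = 1.0,
) -> Optional[str]:
    """Call fetch() until it returns a non-empty page or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            page = fetch()
        except OSError:
            # Socket timeouts and urllib's URLError are both OSError subclasses.
            page = None
        if page:
            return page
        if attempt < attempts:
            time.sleep(delay_seconds)  # short pause before the next try
    return None


# Possible use with the standard library (not executed here):
#   from urllib.request import urlopen
#   page = fetch_with_retry(lambda: urlopen(url, timeout=10).read().decode("utf-8", "replace"))
```

Returning None after the last attempt gives the caller a clear signal that the lookup failed, which is also what the first point asks for: a way to tell failed scrapes apart from successful ones.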

2. 100% scores
I think the scraper is failing when RT scores 100%. Two examples I have found are Aliens and Man on Wire. If this is the case, an easy fix I would imagine?

3. N/A Scores
Some films aren't scored yet ($5 a Day) and get an RT score of N/A. I don't know what we should do with these. At present, my database simply retained the score of 7 from IMDb that the old scraper had returned, but for someone on a new movie import this would simply fail. I'm guessing N/A isn't a value that the MP db can hold, so no ideas what to do here...

4. Bad RT Links?
I've found a film, Away We Go, that doesn't link correctly when using the RT link with the IMDb number. Using the IMDb number for this (1176740) returns you to Farlanders. Again, I'm not sure you can fix this one, so I may contact RT and see if it's an error. Also, it's difficult to know how many of these there are without knowing which ones have or have not updated successfully.

Also, some work has been completed on scraping the reviews themselves; however, where to store the data and how to display it remain unresolved.

http://forum.team-mediaportal.com/moving-pictures-284/imdb-scraper-rottentomatoes-rating-75637/#post556022

Original comment by surferos...@gmail.com on 6 Jan 2010 at 2:12

GoogleCodeExporter commented 9 years ago
1. I'll raise the timeout value on the Retrieve node and see if that works. It seems RottenTomatoes.com can, on occasion, take quite some time to do the IMDb ID lookup. There is an update planned for the Movie Details overview that allows filtering, so assuming they allow all the fields to be used, you will be able to find all the movies that lack a rating. Until then, you could use SQL queries manually.

2 + 3. I had no examples of such movies, so I didn't know RottenTomatoes used different HTML code for them. I will add extra regular expression checks to fix this; please be patient. I've decided to use a rating of '0' for the N/A ones, so that you won't think the scraper failed to obtain the rating. I believe no movie exists on RottenTomatoes with an actual rating of '0', as even "Gigli" still scores a 0.6 :o)

4. This is a problem with RottenTomatoes, so please report this to them to fix.

(5). There is an existing enhancement request for more fields, and reviews could be part of that. However, it will most likely be a while before that is part of the Moving Pictures plugin, which is why I offered a custom scraper that modifies the IMDb summary to add the RottenTomatoes reviews.

It was never my intention for these scraper contributions to become a default part of the plugin, but simply a way to bridge the waiting period until a more official solution was found.
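
To illustrate the handling described in points 2 and 3, here is a small Python sketch that accepts one to three digits (so 100% is covered) and maps N/A to 0. It works on the already extracted text value, since the page markup is not shown in this thread; the function name parse_tomatometer and the conversion of the percentage to a 0-10 score are assumptions made for the example. A pattern that only allowed two digits would be one plausible cause of the 100% failure, but that is not confirmed here.

```python
import re


def parse_tomatometer(text: str) -> float:
    """Convert a scraped T-Meter string such as '97%' to a 0-10 score.

    'N/A' (film not scored yet) becomes 0, so that a missing rating is
    distinguishable from a failed scrape.
    """
    value = text.strip()
    if value.upper() == "N/A":
        return 0.0
    match = re.fullmatch(r"(\d{1,3})\s*%", value)  # 1-3 digits, so '100%' matches too
    if match is None:
        raise ValueError(f"unrecognised rating text: {text!r}")
    percent = int(match.group(1))
    if percent > 100:
        raise ValueError(f"percentage out of range: {percent}")
    return percent / 10


if __name__ == "__main__":
    for sample in ("100%", "6%", "N/A"):
        print(sample, parse_tomatometer(sample))
```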

Original comment by RoChess....@gmail.com on 6 Jan 2010 at 7:05

GoogleCodeExporter commented 9 years ago
Unfortunately, it looks like the scraper doesn't work. This could be a complete red herring, but it looked like the scraper was parsing for a field "rotten_rating", which looks like it no longer exists in the XML source. But you could write all my XML knowledge on the back of a stamp, so maybe it's not this at all.

Either way, this was really useful. As an aside, a site that I am a member of now presents their ratings like this:

http://img18.imageshack.us/img18/8946/ratingso.jpg

Would be super if Moving Pictures could do the same some day (maybe incorporating some of the ideas RoChess had for critic summaries too).

Original comment by surferos...@gmail.com on 16 Jun 2010 at 10:32

GoogleCodeExporter commented 9 years ago
Hey guys, I like that idea...
But what would be really great too is just the opposite way.

I always rate the movies I watched in MePo/Moving Pictures.
It would be great to push this rating to my Rotten Tomatoes account so I don't have to go to the website to rate it again.

Original comment by michael....@gmail.com on 17 Sep 2010 at 8:59

GoogleCodeExporter commented 9 years ago

Original comment by conrad.john on 31 Jan 2011 at 1:22

GoogleCodeExporter commented 9 years ago
For those who 'starred' this issue and are unaware:

IMDb+ Scraper with RottenTomatoes support is available at:
http://forum.team-mediaportal.com/moving-pictures-284/imdb-scraper-short-long-summary-imdb-rt-score-us-uk-rating-more-93838/

I do love the idea surferosa showed, in which multiple ratings are available for a skinner to show, as per: http://img18.imageshack.us/img18/8946/ratingso.jpg

But this would require the MovingPictures plugin to get support for storing the other ratings, either via extra database fields or perhaps by supporting an array of the existing movie.score (and movie.popularity for the votes/count) fields, as is done for genre/etc.
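
As a rough picture of the 'array of scores' option, the Python sketch below keeps one entry per rating source instead of a single score field. It is not the plugin's data model; the class and field names (Rating, Movie, score_from) are hypothetical, and the numbers are placeholders rather than real ratings.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rating:
    source: str                  # e.g. "imdb", "rt_tomatometer", "metascore"
    score: float                 # normalised to a common 0-10 scale
    votes: Optional[int] = None  # vote/review count, when the source provides one


@dataclass
class Movie:
    title: str
    ratings: list[Rating] = field(default_factory=list)

    def score_from(self, source: str) -> Optional[float]:
        """Return the score from one source, or None if it was never scraped."""
        for rating in self.ratings:
            if rating.source == source:
                return rating.score
        return None


if __name__ == "__main__":
    movie = Movie("Example Movie", [Rating("imdb", 7.0, votes=1000), Rating("rt_tomatometer", 8.5)])
    print(movie.score_from("rt_tomatometer"), movie.score_from("metascore"))
```

A skin could then ask for each source by name and simply hide the ones that return nothing.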

Since imdb.com now provides the metascore on a lot of movies, I will add this 
as an option to the IMDb+ scraper as well.

Original comment by RoChess....@gmail.com on 25 Feb 2011 at 4:59

GoogleCodeExporter commented 9 years ago

Original comment by damien.haynes@gmail.com on 1 Dec 2013 at 1:13