tboothman / imdbphp

PHP library for retrieving film and tv information from IMDb

Extend method Recommendations #158

Closed duck7000 closed 5 years ago

duck7000 commented 5 years ago

I hope this is the right place to make a pull request; I don't understand how it all works.

It's about issue #157, and this is a proposal to extend this method. I split up all the year possibilities and created a field for text such as "TV Series".

The output array is still the same, except for two added fields: endyear and type.

#-------------------------------------------------------[ Recommendations ]---
  /**
   * Get recommended movies (People who liked this...also liked)
   * @return array recommendations (array[title,imdbid,year,endyear,type])
   * @see IMDB page / (TitlePage)
   */
  public function movie_recommendations() {
    if (empty($this->movierecommendations)) {
        $this->getPage("Title");
        if ( preg_match_all('!<div class="rec-title">\s*(.*?)\s*</div>!ims', $this->page["Title"], $matches) ) {
            foreach ($matches[0] as $found){
                if (preg_match('!<a\s+href="/title/tt(\d+)/[^>]*>\s*(.+)\s*</a>!ims',$found,$match)){
                    $title = strip_tags($match[2]);
                    if (preg_match('!<span class="nobr">\s*(.*?)\s*</span>!ims',$found,$all)){
                        $temp = preg_replace('/[^0-9]/','',$all[0]);
                        $type = trim(preg_replace('/[^a-z\s]/i','',strip_tags($all[0])));
                        if(mb_strlen(trim($temp)) >4){
                            $year = trim(substr($temp, 0, 4));
                            $endYear = trim(substr($temp, 4));
                            $this->movierecommendations[] = array('title'=>$title,'imdbid'=>$match[1],'year'=>$year,'endyear'=>$endYear,'type'=>$type);
                        }
                        else{
                            $year = trim($temp);
                            $this->movierecommendations[] = array('title'=>$title,'imdbid'=>$match[1],'year'=>$year,'endyear'=>'','type'=>$type);
                        }
                    }
                    else{
                        // No year span was found, so no type could be parsed either.
                        $this->movierecommendations[] = array('title'=>$title,'imdbid'=>$match[1],'year'=>'','endyear'=>'','type'=>'');
                    }
                }
                else{
                    return $this->movierecommendations;
                }
            }
        }
    }
    return $this->movierecommendations;
  }
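
The year and type splitting used in this method can be tried in isolation. What follows is a minimal sketch, not the library's code: `splitYearAndType()` is a hypothetical helper, and the sample strings are assumed shapes of the `nobr` span text rather than captured IMDb output.

```php
<?php
/**
 * Hypothetical helper: split the text of a recommendation's year span
 * into start year, end year and a free-text type such as "TV Series".
 * It mirrors the digit/letter stripping used in the method above.
 */
function splitYearAndType(string $spanText): array
{
    // Keep digits only: "(2011–2019)" becomes "20112019".
    $digits = preg_replace('/[^0-9]/', '', $spanText);
    // Keep ASCII letters and whitespace only, then trim the result.
    $type = trim(preg_replace('/[^a-z\s]/i', '', strip_tags($spanText)));

    return array(
        // The cast guards against substr() returning false on older PHP versions.
        'year'    => (string) substr($digits, 0, 4),
        // More than four digits means a second (end) year follows.
        'endyear' => strlen($digits) > 4 ? (string) substr($digits, 4, 4) : '',
        'type'    => $type,
    );
}

// Assumed sample inputs.
print_r(splitYearAndType('(2011–2019) (TV Series)'));
print_r(splitYearAndType('(1999)'));
```

Keeping the parsing in one small function means the regex and DOM variants could share it, and the three fields are always present even when the span holds only a single year.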

Can you give your thoughts about it?

duck7000 commented 5 years ago

As an update, I made the same method with DOM/XPath:

 #-------------------------------------------------------[ Recommendations ]---
  /**
   * Get recommended movies (People who liked this...also liked)
   * @return array recommendations (array[title,imdbid,year,endyear,type])
   * @see IMDB page / (TitlePage)
   */
public function movie_recommendations() {
    if (empty($this->movierecommendations)) {
        $doc = new \DOMDocument();
        @$doc->loadHTML($this->getPage("Title"));
        $xp = new \DOMXPath($doc);
        if ($cells = $xp->query("//div[@id=\"title_recs\"]/div[@class=\"rec_overviews\"]/div[@class=\"rec_overview\"]/div[@class=\"rec_details\"]/div[@class=\"rec-info\"]/div[@class=\"rec-jaw-upper\"]/div[@class=\"rec-title\"]")){
            foreach ($cells as $cell) {
                if(preg_match('!tt(\d+)!',$cell->getElementsByTagName('a')->item(0)->getAttribute('href'),$ref)){
                    $movie['title'] = trim($cell->getElementsByTagName('a')->item(0)->nodeValue);
                    $movie['imdbid'] = $ref[1];
                    if($span = $cell->getElementsByTagName('span')->item(0)->nodeValue){
                        $years = preg_replace('/[^0-9]/','',$span);
                        $type = trim(preg_replace('/[^a-z\s]/i','',strip_tags($span)));
                        if(mb_strlen(trim($years)) >4){
                            $movie['year'] = trim(substr($years, 0, 4));
                            $movie['endyear'] = trim(substr($years, 4));
                            $movie['type'] = $type;
                        }
                        else{
                            $movie['year'] = trim($years);
                            $movie['endyear'] = "";
                            $movie['type'] = $type;
                        }
                        $this->movierecommendations[] = $movie;
                    }
                }
            }
        }
    }
    return $this->movierecommendations;
}
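
One caveat with the DOM variant: `item(0)` returns `null` when a cell has no `<a>` or `<span>`, so calling `getAttribute()` on it raises an error, and a recommendation without a year span is dropped entirely. A minimal sketch of a guarded version follows. `parseRecommendations()` is a hypothetical standalone function, the XPath is shortened to the `rec-title` class only, and the sample HTML is assumed from the class names in the query above, not captured from IMDb.

```php
<?php
/**
 * Hypothetical standalone function: extract recommendations from an HTML
 * string, guarding against cells that lack a link or a year span.
 */
function parseRecommendations(string $html): array
{
    $doc = new \DOMDocument();
    // Suppress warnings from real-world, non-validating HTML.
    @$doc->loadHTML($html);
    $xp = new \DOMXPath($doc);

    $recommendations = array();
    foreach ($xp->query('//div[@class="rec-title"]') as $cell) {
        $link = $cell->getElementsByTagName('a')->item(0);
        if ($link === null || !preg_match('!tt(\d+)!', $link->getAttribute('href'), $ref)) {
            continue; // no usable title link in this cell
        }
        // Every field gets a default, so the output shape is always the same.
        $movie = array(
            'title'   => trim($link->nodeValue),
            'imdbid'  => $ref[1],
            'year'    => '',
            'endyear' => '',
            'type'    => '',
        );
        $span = $cell->getElementsByTagName('span')->item(0);
        if ($span !== null) {
            $years = preg_replace('/[^0-9]/', '', $span->nodeValue);
            $movie['year'] = (string) substr($years, 0, 4);
            $movie['endyear'] = (string) substr($years, 4);
            $movie['type'] = trim(preg_replace('/[^a-z\s]/i', '', $span->nodeValue));
        }
        $recommendations[] = $movie;
    }
    return $recommendations;
}

// Assumed sample markup for illustration only.
$html = '<html><body>'
      . '<div class="rec-title"><a href="/title/tt0944947/"><b>Game of Thrones</b></a>'
      . ' <span class="nobr">(2011-2019) (TV Series)</span></div>'
      . '<div class="rec-title"><a href="/title/tt0133093/"><b>The Matrix</b></a></div>'
      . '</body></html>';
print_r(parseRecommendations($html));
```

With this shape, a title that has no year span still appears in the result, with empty `year`, `endyear` and `type` fields.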

PS: the first method posted here contains a mistake. Here is the corrected one:

 #-------------------------------------------------------[ Recommendations ]---
  /**
   * Get recommended movies (People who liked this...also liked)
   * @return array recommendations (array[title,imdbid,year,endyear,type])
   * @see IMDB page / (TitlePage)
   */
  public function movie_recommendations() {
    if (empty($this->movierecommendations)) {
        $this->getPage("Title");
        if ( preg_match_all('!<div class="rec-title">\s*(.*?)\s*</div>!ims', $this->page["Title"], $matches) ) {
            foreach ($matches[0] as $found){
                if (preg_match('!<a\s+href="/title/tt(\d+)/[^>]*>\s*(.+)\s*</a>!ims',$found,$match)){
                    $title = strip_tags($match[2]);
                    if (preg_match('!<span class="nobr">\s*(.*?)\s*</span>!ims',$found,$all)){
                        $temp = preg_replace('/[^0-9]/','',$all[0]);
                        $type = trim(preg_replace('/[^a-z\s]/i','',strip_tags($all[0])));
                        if(mb_strlen(trim($temp)) >4){
                            $year = trim(substr($temp, 0, 4));
                            $endYear = trim(substr($temp, 4));
                            $this->movierecommendations[] = array('title'=>$title,'imdbid'=>$match[1],'year'=>$year,'endyear'=>$endYear,'type'=>$type);
                        }
                        else{
                            $year = trim($temp);
                            $this->movierecommendations[] = array('title'=>$title,'imdbid'=>$match[1],'year'=>$year,'endyear'=>'','type'=>$type);
                        }
                    }
                    else{
                        // No year span was found, so no type could be parsed either.
                        $this->movierecommendations[] = array('title'=>$title,'imdbid'=>$match[1],'year'=>'','endyear'=>'','type'=>'');
                    }
                }
            }
        }
    }
    return $this->movierecommendations;
  }

tboothman commented 5 years ago

The idea of a pull request is you actually change the code, not paste some code into a comment box. You've requested to merge one of my old branches in, and it's got merge conflicts. You need to make sure your fork is up to date with this repo; make a branch for your changes; commit the changes; make a new pull request. This might help https://help.github.com/en/articles/creating-a-pull-request

duck7000 commented 5 years ago

Thanks for explaining. I have a hard time understanding GitHub, mainly for two reasons, I imagine: first, I'm almost 53, and second, I'm just a hobby programmer.

So I deleted all my forks and created a new fork from your repo, so now I have a good base to start from.

I will try again, and the linked article surely gives the info I need. Thanks for that.

Just so you know, I was a car mechanic for 16 years and trained myself in programming (with a lot of help from a programmer friend).

tboothman commented 5 years ago

http://makeapullrequest.com/ might be useful too

duck7000 commented 5 years ago

I think I managed to get it right this time?

Thanks for all the info!