tboothman / imdbphp

PHP library for retrieving film and tv information from IMDb

[Bug] plot_split does not remove <>, {} or () from author name #209

Closed duck7000 closed 3 years ago

duck7000 commented 3 years ago

Description

The plot_split() method does not remove <>, {} or () from the author name. I know this matches what the IMDb website shows, but it causes problems in template engines such as Smarty.

Movies

Examples:

- https://www.imdb.com/title/tt0066921/: the last plot's author name is surrounded by <>
- https://www.imdb.com/title/tt0080120/: the first plot's author name is surrounded by {}

Bug: Remove those characters from the author name

Expected Results / What do you want to do?

Return author name without those characters.

Actual Results / What is happening?

Author names are returned including those characters.

I came up with the following solution:

    #-----------------------------------------------------[ Full Plot (split) ]---

    /** Get the movie plot(s) - split-up variant
     * @return array array[0..n] of array[string plot,array author] - where author consists of string name and string url
     * @see IMDB page /plotsummary
     */
    public function plot_split()
    {
        // Order matters: the "space + opening bracket" pairs must precede the bare brackets.
        $search  = array(' <', '<', '>', ' (', '(', ')', ' {', '{', '}');
        $replace = array(', ', '', '', ', ', '', '', ', ', '', '');
        if (empty($this->split_plot)) {
            if (empty($this->plot_plot)) {
                $this->plot_plot = $this->plot();
            }
            foreach ($this->plot_plot as $plot) {
                if (preg_match('!(?<plot>.*?)\n-\n<a href="(?<author_url>.*?)">(?<author_name>.*?)<\/a>!ims', $plot,
                  $match)) {
                    $authorName = trim(str_replace($search, $replace, $match['author_name']));
                    $this->split_plot[] = array(
                      "plot" => $match['plot'],
                      "author" => array("name" => $authorName, "url" => $match['author_url'])
                    );
                } else {
                    $this->split_plot[] = array("plot" => $plot, "author" => array("name" => '', "url" => ''));
                }
            }
        }
        return $this->split_plot;
    }

It replaces a space followed by an opening bracket with a comma and a space, and strips the remaining brackets, so author names look like this: authorname, email
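To make the replacement behaviour easier to check in isolation, here is a minimal standalone sketch. The function name `cleanAuthorName` and the sample names are made up for illustration and are not part of imdbphp; the search list also includes a bare `{` as an assumption, so that it is handled the same way as `<` and `(`.

```php
<?php
// Hypothetical helper (not part of imdbphp): applies the bracket cleanup
// described above to a single author name.
function cleanAuthorName(string $name): string
{
    // str_replace() applies the pairs left to right, so the
    // "space + opening bracket" forms must come before the bare brackets.
    // Assumption: a bare '{' is stripped like '<' and '('.
    $search  = array(' <', '<', '>', ' (', '(', ')', ' {', '{', '}');
    $replace = array(', ', '',  '',  ', ', '',  '',  ', ', '',  '');

    return trim(str_replace($search, $replace, $name));
}

// Made-up sample names, not taken from IMDb.
echo cleanAuthorName('Jane Doe <jane@example.com>'), "\n"; // Jane Doe, jane@example.com
echo cleanAuthorName('Jane Doe {jane@example.com}'), "\n"; // Jane Doe, jane@example.com
echo cleanAuthorName('(jane@example.com)'), "\n";          // jane@example.com
```

Because the arrays are processed in order, swapping `' <'` and `'<'` would remove the bracket first and leave the name and email separated only by a space.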

Even better would be to split the author name and email into separate fields, but that would break backwards compatibility.
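For illustration only, splitting could look roughly like the sketch below. The helper `splitAuthor` and the `email` key are hypothetical and do not exist in imdbphp, and the sketch assumes the bracketed part always holds an email address, which may not be true for every IMDb entry.

```php
<?php
// Hypothetical helper (not part of imdbphp): splits "Name <email>",
// "Name {email}" or "Name (email)" into separate parts.
function splitAuthor(string $raw): array
{
    // The name is everything before the first opening bracket; the bracketed
    // part is captured without its delimiters (closing bracket is optional).
    $pattern = '/^(?<name>[^<{(]*)[<{(]\s*(?<email>[^>})]*)[>})]?\s*$/';
    if (preg_match($pattern, $raw, $m)) {
        return array('name' => trim($m['name']), 'email' => trim($m['email']));
    }

    // No bracketed part found: treat the whole string as the name.
    return array('name' => trim($raw), 'email' => '');
}

// Made-up sample value, not taken from IMDb.
print_r(splitAuthor('Jane Doe <jane@example.com>'));
```

Callers that read `name` and expect the raw IMDb string would see different output after such a change, which is the compatibility concern mentioned above.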

duck7000 commented 3 years ago

Closing this after consultation with @jreklund: this doesn't belong in imdbphp.