duck7000 / imdbGraphQLPHP

5 stars 0 forks source link

titleCase function in Title.php is not working correctly - characters in middle/end of words are incorrectly uppercased #60

Closed mosource21 closed 1 month ago

mosource21 commented 1 month ago

Not sure on the history of this but isn't it better for imdbGraphQLPHP to return whatever the API returns and leave it up to the developer to decide if they want to process the title data further instead of forcing the use of the titleCase function that irrevocably changes the data?

var_dump(titleCase("Cat's Dog's Hamster's 1st 2nd 3rd 4th"));
string(37) "Cat'S Dog'S Hamster'S 1St 2Nd 3Rd 4Th"
//this is a copy of the titleCase function from Title.php
function titleCase($inputString)
{
    $delimiters = array(" ", "-", ".", "'", "O'", "Mc", "(");
    $exceptions = array("I", "II", "III", "IV", "V", "VI", "GT");
    $string = mb_convert_case($inputString, MB_CASE_TITLE, "UTF-8");
    foreach ($delimiters as $dlnr => $delimiter) {
        $words = explode($delimiter, $string);
        $newwords = array();
        foreach ($words as $wordnr => $word) {
            if (in_array(mb_strtoupper($word, "UTF-8"), $exceptions)) {
                // check exceptions list for any words that should be in upper case
                $word = mb_strtoupper($word, "UTF-8");
            } elseif (in_array(mb_strtolower($word, "UTF-8"), $exceptions)) {
                // check exceptions list for any words that should be in upper case
                $word = mb_strtolower($word, "UTF-8");
            } elseif (!in_array($word, $exceptions)) {
                // convert to uppercase (non-utf8 only)
                $word = ucfirst($word);
            }
            array_push($newwords, $word);
        }
        $string = join($delimiter, $newwords);
   }
   return $string;
}
duck7000 commented 1 month ago

Mm this is obviously not right of course, however this is not a actual imdb title. But yes the titleCase function has is flaws apparently.

The history been that sometimes the case isn't right at imdb so i tried to correct this.

So you are suggesting to remove this on title? I can live with that, no problem.

mosource21 commented 1 month ago

Title and OriginalTitle

For time being I have just edited Title.php and changed to the following as I am have to edit the __construct code anyway to sort out the custom cache.

    protected function titleCase($inputString)
    {
        return $inputString;

[Deleted original reply as forgot $config is not supported - I thought adding a $applyFixesToCase config option would have been a good idea but not sure this would work for everyone]

duck7000 commented 1 month ago

I think you are right that this doesn't belong to this library but on the other side returning data in the wrong case is also not very user friendly.

So i will remove all entrys using titleCase method incl the method itself.

Thanks for testing this though!

mosource21 commented 1 month ago

Okay for title and orginalTitle

Fantastic

I think you are right that this doesn't belong to this library but on the other side returning data in the wrong case is also not very user friendly.

Agree but the solution I think is causing more problems than solving - anything with a 's (quite common at least in English not sure in other languages) is changed to 'S so it is making correct data wrong. This is just one example.

So the original API result is not lost perhaps add titleFixed and originalTitleFixed (similar for other fields) for results using titleCase but then you may as well just remove titleCase and let the developer fix for themselves outside the library?

So i will remove it for those 2 titelCase() is also used for other methods like recommendations and soundtrack, is that a problem too?

I have not had need to use those methods.

duck7000 commented 1 month ago

I removed it all as you have valid points against it, no problem.

Thanks for testing though!