duck7000 / imdbGraphQLPHP

Imdbid redirect #42

Closed duck7000 closed 2 months ago

duck7000 commented 4 months ago

@All

As this project is growing into a full-blown version, I am considering bringing back the logger and/or HTTP exceptions.

Is there any need for this?

As usual I don't need it, but other users might, so let me know.

duck7000 commented 4 months ago

Closing this, nobody seemed to need this.

Feel free to reopen or add comments to this issue if you still want this.

duck7000 commented 3 months ago

@GeorgeFive

The logger might still be worth a thought, although it only seems to be used for the GraphQL query call. In the case where we found out that the request wasn't working due to the hoster's request limit of 1000 characters, this would have been handy.

So I might add it, just to be able to check whether the query succeeded or not, although I test every query in a GraphQL add-on in Chrome before using it here.

The only other time the logger is used is with save_photo and photo_local_url, which we don't use here.

And "logger" is misleading in this case: it only outputs HTML to the screen instead of actually logging to a text file.

So what are your thoughts about it? Is it worth adding anyway?

GeorgeFive commented 3 months ago

I would definitely use it with the output, that would be handy for debugging.... but I really think it would be nice to log things to a text file (maybe a user option in config to log or not?).
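A minimal sketch of the "log to a text file, switchable by a user option" idea, for illustration only. The class name `FileLogger`, its constructor arguments, and the log file path are all hypothetical and do not exist in imdbGraphQLPHP.

```php
<?php
// Hypothetical sketch: none of these names exist in imdbGraphQLPHP.
class FileLogger
{
    private bool $enabled;
    private string $path;

    public function __construct(bool $enabled, string $path)
    {
        $this->enabled = $enabled;
        $this->path = $path;
    }

    // Append one timestamped line to the log file; do nothing when logging is off.
    public function log(string $message): bool
    {
        if (!$this->enabled) {
            return false;
        }
        $line = date('c') . ' ' . $message . PHP_EOL;
        return file_put_contents($this->path, $line, FILE_APPEND | LOCK_EX) !== false;
    }
}

// Usage: the boolean would come from a user config option.
$logger = new FileLogger(true, sys_get_temp_dir() . '/imdb_example.log');
$logger->log('GraphQL query "Redirect" returned no data');
```

With the flag set to false the call is a no-op, so debugging output can stay in the code without writing anything on production setups.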

One thing I would like to see in a logger is content moved notifications. Example, this title was merged into another title -

https://www.imdb.com/title/tt31841642/

imdbphp works in this case, it correctly detects the redirect and gets the info from the new title, but with the way I have my stuff set up, it doesn't actually notice that there was a redirect because I'm querying by imdb id. So the old imdb id stays, but how long until it becomes a 404...?

Of course, if I actually notice this, I can update things on my end and fix it... but there's a lot of titles to keep up with, not to mention the same case for people merges.

duck7000 commented 3 months ago

Mm, I never thought of this. I remember that thomasdousha asked for a redirect function in imdbphp, which I did make for him. So I will have to think about this; it will be something for the future, but I will keep it in mind.

duck7000 commented 3 months ago

@GeorgeFive I will look into creating a function to detect whether a redirect occurred or not, but I have to think about/figure out how to do that.

Edit: this function isn't so hard to come up with after all, but it doesn't necessarily need to be inside a logger function, I guess? If I make a function that checks the input id against the id returned by imdb: if they are the same, no problem; if they are not, return the new id and write something to a log file? Or call a logger function to write to a log file? Or does that function have to do more than this?

This is the query I am going to use:

{
  title(id: "tt31841642") {
    titleText {
      text
    }
    id
  }
}

In this case the input id differs from the id returned by the query (tt31823836).

For me personally this is not a problem: imdb handles the redirect and returns the data I want, but in your case it can be an issue, I guess.

duck7000 commented 3 months ago

@GeorgeFive something like this?

    #----------------------------------------------------------[ imdbID redirect ]---
    /**
     * Check whether this imdbID is redirected to another id
     * @return string|false New id (without the tt prefix), or false if not redirected
     * @see IMDB page / (TitlePage)
     */
    public function checkRedirect()
    {
        $query = <<<EOF
query Redirect(\$id: ID!) {
  title(id: \$id) {
    id
  }
}
EOF;
        $data = $this->graphql->query($query, "Redirect", ["id" => "tt$this->imdbID"]);
        // No title in the response (e.g. unknown id): nothing to compare against
        if (empty($data->title->id)) {
            return false;
        }
        $outImdbId = str_replace('tt', '', $data->title->id);
        if ($outImdbId != $this->imdbID) {
            // todo write to log
            return $outImdbId;
        }
        return false;
    }

GeorgeFive commented 3 months ago

Yep, pretty much! I do a lot of behind the scenes automated scanning on data, so even though the existing redirect will work, I won't notice that the imdb id has changed.

So I have id tt31841642 stored along with cached data... automated scan to get fresh data happens, it queries tt31841642, gets the proper data from tt31823836, everything is good. In theory, this shouldn't be an issue. But I worry that eventually that redirect may go away? I don't know how they handle that.

Maybe I'm totally missing it, but maybe a function to get the real imdb id for a given id?

So maybe like...

    // redirected id
    $movie->imdbid("31841642"); // returns 31823836

    // valid, not redirected id
    $movie->imdbid("31823836"); // returns 31823836

duck7000 commented 3 months ago

That is pretty much what the above function does. I can alter it to return the non-redirected id instead of false?

Or do you want to alter the existing imdbid method to return the redirected id?

GeorgeFive commented 3 months ago

Is there an existing imdbid method? I didn't see one. I know it seems slightly redundant, but I guess for redirects, it would be nice...?

duck7000 commented 3 months ago

Not in the Title class, but in mdbBase.php (not sure how to use that though).

In the Title class constructor, the function setid is used to set the imdbid according to whether it is 7 or 8 digits, and tt or nm. I can add the check in this function, so the set id will always be (in case of a redirect) the right id?

Or (like I did in the above function) add it to the Title class?

In imdbphp it is added in the Title class.

I'm not sure how to proceed from here; it depends on how you want to use it.

GeorgeFive commented 3 months ago

I mainly mentioned the logger function initially because there was no easy way to get the id, and admittedly, I thought it was a bit redundant to add a method "get imdb id based on an imdb id", hah. Same thing with the name class.

But the more I think about it, that would eliminate the need to log these redirects. I could check the id, and if we return the same id as in 99% of the uses, proceed as normal. If it returns a new id, I could use that to update the id in my database, and then return as normal.

duck7000 commented 3 months ago

The above function returns false if the id has not changed, so all you have to do is check whether its output is false. If it is false, do nothing (keep the existing id). In all other cases the function returns the new id, so use that to update the id in your database.

Plain and simple, I would say.

The other option I mentioned is to adjust setid (in mdbBase.php) so that it always returns a possibly redirected id, but I'm not sure what side effects this could have.

So the best option is the above function inside the Title class, used as explained here.

I'm not encountering this issue because I use the title search method, so I get the (possibly redirected) id that way.
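A rough sketch of how the "false or new id" return value could be consumed in an automated scan. `FakeTitle` is a hypothetical stand-in for the real Title class so the example runs without network access, and `resolveStoredId` is an illustrative helper, not part of the library.

```php
<?php
// Hypothetical stand-in for the Title class, so the sketch runs offline.
class FakeTitle
{
    private string $imdbID;
    private array $redirects;

    public function __construct(string $imdbID, array $redirects)
    {
        $this->imdbID = $imdbID;
        $this->redirects = $redirects;
    }

    // Same contract as described in the thread: new id if redirected, false otherwise.
    public function checkRedirect()
    {
        return $this->redirects[$this->imdbID] ?? false;
    }
}

// Returns the id that should be stored: the new one if redirected, else the original.
function resolveStoredId(string $storedId, $title): string
{
    $newId = $title->checkRedirect();
    return $newId === false ? $storedId : $newId;
}

$redirects = ['31841642' => '31823836'];
echo resolveStoredId('31841642', new FakeTitle('31841642', $redirects)), PHP_EOL; // 31823836
echo resolveStoredId('31823836', new FakeTitle('31823836', $redirects)), PHP_EOL; // 31823836
```

The strict `=== false` comparison matters here, because a loose check could misread an unusual id string as falsy.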

GeorgeFive commented 3 months ago

Just tested that function out, and it worked exactly like I wanted. Could you do the same for person class? Thanks!

duck7000 commented 3 months ago

Thanks for testing

Glad it works like you wanted. I am going to add this to the Title class as well as the Name class.

duck7000 commented 3 months ago

The latest git version contains both checkRedirect methods, all info.. you know

duck7000 commented 3 months ago

If the logger is still needed, I will open a new issue for that.

duck7000 commented 2 months ago

@GeorgeFive

I found some more info about duplicate imdb ids. It turns out that imdb provides a way to check whether duplicate ids exist, and it returns the preferred id to use.

info

Duplicate IDs

IMDb data is constantly being updated, both with the addition of new data and enhancement of the quality of existing data. While there is only ever one unique IMDb identifier, there are, on occasion, instances where there might be duplicate entries for the same entity. This could happen, for instance, if multiple users have contributed data for the same entity (e.g. the same person) under different identifiers (e.g. different name ids). In this case IMDb maintains both identifiers in the data set, effectively duplicating the data. This allows you to continue using any matching you have between IMDb identifiers and other identifiers. To identify when this is the case a remappedTo field is included in the bulk data sets and the title.meta.canonicalId and name.meta.canonicalId field is included in the API. From these fields, you get the new preferred identifier for that entity.

The Big Bang Theory pilot episode has multiple Title ID entries referring to the same episode: tt1044014 (the Title ID that has been remapped) and tt0775431 (the preferred Title ID). When you retrieve either the remappedTo value from the Bulk Data or the title.meta.canonicalId for Title ID tt1044014, you will receive the preferred Title ID tt0775431.

So do you want to change the existing function to use this new info? I think my function does the same thing, but it might be better to use the imdb-provided one.

GeorgeFive commented 2 months ago

That one is up to you. The current function seems to work just fine, and honestly, I doubt many people will ever even use it. But, if you do want to change it, I won't argue with that, hah.

duck7000 commented 2 months ago

Well, the imdb version is documented and it will return the preferred id, which my function sort of lacks. My version returns the different id, but there is no way of knowing whether this is the preferred one (most likely it is, I guess). And it is an easy change, so I will change it. The output will remain the same: false if no changes, or the preferred id.
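For illustration, a sketch of what a canonicalId-based check might look like. The query shape is assumed from the documented field path title.meta.canonicalId, and `preferredId` is a hypothetical helper; this is not the code that was committed to the repository. The response is hand-written from the Big Bang Theory example in the quoted documentation.

```php
<?php
// Query shape assumed from the documented field path title.meta.canonicalId.
$query = <<<'EOF'
query Redirect($id: ID!) {
  title(id: $id) {
    meta {
      canonicalId
    }
  }
}
EOF;

// Hypothetical helper: compare the canonical id in a decoded response with the input id.
// Returns the preferred id (without tt) if it differs, false otherwise.
function preferredId(object $data, string $imdbId)
{
    $canonical = $data->title->meta->canonicalId ?? null;
    if ($canonical === null) {
        return false;
    }
    $canonical = str_replace('tt', '', $canonical);
    return $canonical !== $imdbId ? $canonical : false;
}

// Hand-written response for the remapped pilot episode example.
$response = json_decode('{"title":{"meta":{"canonicalId":"tt0775431"}}}');
var_dump(preferredId($response, '1044014')); // string(7) "0775431"
var_dump(preferredId($response, '0775431')); // bool(false)
```

The return contract is the same as before (false or an id), so callers would not need to change.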

GeorgeFive commented 2 months ago

Sounds good!

duck7000 commented 2 months ago

Already done in the latest git version; both Title and Name are changed. For you as a user there are no changes, as the output remains the same. But now we know for sure that the redirected id is the preferred one.

duck7000 commented 2 months ago

Closing this one, as this seems to work.