duck7000 / imdbGraphQLPHP

6 stars 0 forks source link

Episodes / Seasons #1

Closed GeorgeFive closed 11 months ago

GeorgeFive commented 1 year ago

Hey there. I've started playing around with some of your code, and so far, so good. Very quick and hopefully easy question for you though... public function episode does not return a season number for each episode. I'm still learning this graphql stuff, so while I'm sure it's available somewhere, I'm not seeing it. Any chance you could add that to the data returned?

Thanks!

duck7000 commented 1 year ago

Well that is a change from my side. In imdbphp episode function gives back season and episode number, but my version gives back season number only at the beginning of a season, not with every episode:

imdbphp: seasonnumber episodenumber airdate etc seasonnumber episodenumber airdate etc

My version: seasonnumber episodenumber airdate etc Next episodenumber airdate etc

The reason i did this because sometimes season and episodes don't line up. With this tv series https://www.imdb.com/title/tt8762206/episodes/ you will see the difference. In imdbphp there will be 4 unknown seasons with each 1 unknown episode In my version there will be one unknown season and 4 unknown episodes (like it should)

So imdbphp returns data in wrong order (it has been like this for as long as i can remember)

My version does require adjustments to your code as the output array has a different structure.

And episodes by year does not work atm, i'm working on that to try to fix that.

Sure you can add seasonnumber to every episode if you want though, the outer array index equals the seasonnumber

duck7000 commented 1 year ago

$season is defined at line 1186

You can add $season to every episode array if you want (you can change order if you like):

$episode = array( 'imdbid' => $imdbId, 'title' => $title, 'airdate' => $airDate, 'plot' => $plot, 'episode' => $epNumber, 'season' => $season, 'image_url' => $imgUrl );

GeorgeFive commented 1 year ago

At first, I tried the obvious $season, but it didn't work. Just tried it again, and there we go... now it works. I guess I must have typo'd it the first time? I have no idea, haha. Anyway, I personally use it solely to verify data before importing it into my own database. If I wanted to get the Creepshow episodes for a specific season only, I can see s1e1, s1e2, s4e1, etc in the returned data. If it shows a blank number for an unknown season (or maybe s0e1 for unknown seasons), no big deal.

Also, I don't see next episodenumber in the code? That would be handy to have!

I actually didn't have to modify anything to get this working with my setup aside from you renamed function episodes to function episode (not plural).

Thanks for the quick feedback, that's a refreshing change!

duck7000 commented 1 year ago

nextepisode was just a example to make clear how my structure looks like

Well if you only use the data from $episode array then yes there is no change with imdbphp

I did rename many more methods, simple names is easier

I'm glad you are willing to try my work! Episodes is/still is very hard to get right, recently imdb changed episodes pages on their site and it looks like they query episodes the same as i did so that looks promising. I did find out how the deal with episodes by year so there is work to be done

"refreshing change" haha now you understand why i made my own version as i'm getting sick and tired from no responses to my issues and PR requests in the past.

GeorgeFive commented 1 year ago

I noticed the new imdb episodes pages went live earlier tonight, and imdbphp was not working with them. That's why I tried out your code, and it's working flawlessly as far as I can tell. I've had to change a few things in my own personal code to make everything compatible, but that's issues on my end, not yours.

The only reason I need to stay with imdbphp as well right now, is for the person data. I believe I remember you saying you had no interest in this, but if you ever change your mind and bring over the basics (I only use name, born/died info, picture, real/nick names), you've have a 100% convert in me!

duck7000 commented 1 year ago

I'm not after everyone to use my version so feel free to use whatever you like. And feel free to use my work to fix imdbphp. I'm not going to make any PR requests to imdbphp myself for obvious reasons.

And for the record i'm not a professional coder, just a hobby project, so i can not guarantee to fix every problem.

Personally i don't use person data in my program (i use the data in my movie program and store it in a database) but i maybe willing to try to convert person class as well but if i do i don't know when or how long it will take (if i succeed that is..) There are more people like you who uses person class so they have to stick to imdbphp, there are 2 ways to fix that. Either somebody uses my work to fix imdbphp or i will have to convert person class as well. For now i'm thinking about it.

GeorgeFive commented 1 year ago

We're in the same boat... just a hobby coder here who has fallen way behind on coding knowledge due to long hours at work for the last few years... hence, why I'm clueless with the graphql stuff, hah.

I honestly wouldn't feel right using your code to fix another project, even with your permission. Honestly, I'm not very happy with imdbphp anyway, as I feel the owner has stopped caring. That's his choice and his right, but I don't want to keep supporting it.... especially with someone else's work.

duck7000 commented 1 year ago

Exactly like my thoughts about imdbphp. Tom took over from izzysoft but didn't have the passion to continue it i guess. I really did tell Tom that if you take over and thus should maintain it then do MAINTAIN it which he actually not is doing in my view. On the other hand it really did surprised me that he took the afford to get going with GraphQL but i'm glad he did. Otherwise i couldn't have done it so far.

I'll appreciate your thoughts about using my work, so maybe it is the time to add back person class in my version even though i probably won't use it myself. Remember though that my title class is stripped from everything i didn't need, others may want them back., will see how that goes

GeorgeFive commented 1 year ago

Actually, your title class has everything I need and then some. I really think that would be the best way to go about it: have a simplified version of the class with basic info (such as your title class and a person class like I mentioned), and then you could have user-maintained classes / files / whatever for niche info. Example, I have no need for person->height, but if someone does, they can add it in and (hopefully) update it if imdb breaks something.

The amount of data that imdbphp can return is amazing, which would be a great thing indeed if there were a team of workers... but that can be incredibly overwhelming for one guy. It would be a lot easier to find the motivation to fix one or two things when imdb updates than to fix 20-25 things that all suddenly broke. This may have been his problem, which I could definitely understand...

duck7000 commented 1 year ago

Oh sure i understand that the amount of work to maintain is overwhelming for one person, maybe he didn't realize this when he started. I think he mentioned sometime in the past that his interest in imdbphp is faded, so it may be time for somebody/us to step up. I will look at person class and if it is doable i will add it back in. But first i'm going to fix episodes by year (although that never occurred to me that i needed that) but it would be nice to have it working.

duck7000 commented 1 year ago

Do you think episodes by year is useful? For now i have a (sort off) check if a tv series is season or year based and if year based return empty array. This is because by year can be a huge amount of data (check one piece 1999, or the bold and the beautiful) witch will exceed the php execution time.

Edit: i made it by year and php execution time is not a problem, through GraphQL it is much faster then parsing html

GeorgeFive commented 1 year ago

I think episodes by year is useful only in the sense that sometimes we need to go that route in order to get the data (ie, your examples and some I have seen). Personally, once I get all of a title's episodes into my own database, I sort it out the way it needs to be (season 1, 2, 3 or season 1999, 2000, 2001), but if imdb does not have it sorted out into seasons, I can't easily get the data to my database.

In a perfect world, I could put in the imdb id for Creepshow and have an array of every episode attached to it as such:

If this episode is in a season: season=1,year=1999 If this is in an unknown season or if it's year based only: season=0,year=1999

duck7000 commented 1 year ago

Yeah i made it just this way, it is season based unless the tv series is so large that year based is a better option.

My episode method sorts out seasons/years and the corresponding episodes as best as i could find. So the output array should be good to work with. Output array could contain unknown, year or season number

I upload the adjusted episode method soon

duck7000 commented 1 year ago

Upload done.

$season is changed to $seasonYear because it can contain season or year

duck7000 commented 1 year ago

I opened a new issue about person class, so we can discuss it there

Thomasdouscha commented 11 months ago
if ($byYear - $bySeason > 4) {
                $data = $seasonsData->title->episodes->displayableYears->edges;
            } else {
                $data = $seasonsData->title->episodes->displayableSeasons->edges;
            }

"if ($byYear - $bySeason > 4)" what does it meaning of it?

duck7000 commented 11 months ago

It does roughly the same as this in imdbphp

 /*
             * There are (sometimes) two select boxes: one per season and one per year.
             * IMDb picks one select to use by default and the other starts with an empty option.
             * The one which starts with a numeric option is the one we need to loop over sometimes the other doesn't work
             * (e.g. a show without seasons might have 100s of episodes in season 1 and its page won't load)
             *
             * default to year based
             */
            $selectId = 'id="byYear"';
            if (preg_match('!<select id="bySeason"(.*?)</select!ims', $page, $matchSeason)) {
                preg_match_all('#<\s*?option\b[^>]*>(.*?)</option\b[^>]*>#s', $matchSeason[1], $matchOptionSeason);
                if (is_numeric(trim($matchOptionSeason[1][0]))) {
                    //season based
                    $selectId = 'id="bySeason"';
                }
            }

I didn't see any other option to make the difference between tv series based on years or seasons. If there are more years then seasons tv series is year based.

I did add this check to prevent one season with thousends of episodes (like one piece 1999 https://www.imdb.com/title/tt0388629)

Thomasdouscha commented 11 months ago

In my oppinion it should like that: Check the base of seasons if there is choose it. Otherwise check years if there is coose it. And you should add parameter like $yearbased=0 . If i use $yearbased=1 it will be yearbased because user has choosen it.

Thomasdouscha commented 11 months ago

I use episodes func too much i had well enough experience about it.

Thomasdouscha commented 11 months ago

firstly it should have optional parameter.

duck7000 commented 11 months ago

In my oppinion it should like that: Check the base of seasons if there is choose it. Otherwise check years if there is coose it. And you should add parameter like $yearbased=0 . If i use $yearbased=1 it will be yearbased because user has choosen it.

Well it isn't as simple as you suggest, every year based tv series has also at least one season so your comparison doesn't work.

And about a parameter adding to the method, i thought about it and have chosen not to do that. But i understand that you don't agree with that? If i do add a parameter then one piece will return 1 season with 1000 episodes unless the user gives year parameter? this is not desirable imdphp did not have a parameter one piece didn't return anything so that is wrong as well

Thomasdouscha commented 11 months ago

How can they give parameter to? It is system application. Admin will decide it. If you use it too much as me . You will also think as me. Optional parameter is necessary. And about this optional of course who can reach the system decide it. Most of time parameter is not necessary. Thatswhy it is optional.

duck7000 commented 11 months ago

I don't understand what you are saying?

duck7000 commented 11 months ago

I can add a parameter to force year based series, that is not a problem. But how to deal with a series like one piece? All episodes will end up in 1 season if parameter is not provided? And for example the bold and the beautiful has more then 7000 episodes who end up in 1 season as well

maybe @GeorgeFive has a opinion about this?

GeorgeFive commented 11 months ago

My use case may be a little different than yours, so I'm not sure how much input I can give. What I do on my end, is get all episodes for a given series and put them in my database with a field for release date and another field for season.... doesn't matter if they're year or season based, the episodes are there. Then, when I fetch episodes for a title in my own app, I check to see how many episodes there are: if there's <= 100 for a given season, I display episodes sorted out by season #. This works for 99% of "normal" TV shows / miniseries out there. If they're all season 1 but there's >100 (such as in your examples), I sort them out by year myself.

Personally, I don't know if there's any "right" way to do it when you consider your examples with thousands of episodes, series with unknown episodes, and all of the other little variables that I'm sure are out there. Maybe an ideal approach would be to have episodes return every episode available for a title with fields for season, release date, and known/unknown status? This way, regardless of how any user uses this data, it's there and they can format / sort / use it as they please.

As far as load... I understand fetching 9000 episodes for Bold and the Beautiful may be crazy, but I would hope most users are using some sort of caching mechanism and not querying imdb every page load, so this shouldn't be a concern.

Thomasdouscha commented 11 months ago

Or simply let you do it two func are seasonbased and yearbased . We use which one is suitable for this tv serial.

duck7000 commented 11 months ago

Thanks for your input @GeorgeFive

I'll think about it

Thomasdouscha commented 11 months ago

There is no exact formula about seasons and years based. Please let us to choose wihich one.

duck7000 commented 11 months ago

I have added $yearbased parameter now.

I still left in the check to deal with series like one piece or bold and the beautiful, or should i remove that check? In that case there is only user choice and if the user chooses season based (default) then one piece will return one season with 1000 episodes. same for b&b unless php execution time gets overridden first (didn't happen jet)

GeorgeFive commented 11 months ago

Personally, just my opinion, I'd say we really don't need the checks. Maybe if there were problems retrieving the data due to timeouts, but you say that's not a problem (I personally haven't tried). These are two extreme cases, and I'm sure there's more examples, but there can't be THAT many out there. I look at this entire project purely as a way to retrieve data, and it's then up to the user to filter and sort it however they see fit. You guys have your own projects, I have mine, and we can all use the data in our own way once we have full access.

Also, it comes back around to data purity. Yeah, One Piece will return 1000 episodes for season 1, but technically, that is correct.... it's a one season show with a ton of episodes. Sorting it out by year would be the optimal way to do it, yes, but if you wanted to do it the "correct" way... there should be at least the option.

In the end, I can work with it as is or with those changes.... just my opinion here, is all.

duck7000 commented 11 months ago

thanks for your comment @GeorgeFive You are right about the checks, with one piece 1 season with 1000 episodes is correct and yes an edge case.

i will remove the checks, it is up to the user how the data is wanted. Personally i only have tv series in my program based on season so it isn't an issue to me

GeorgeFive commented 11 months ago

The only thing I'm curious about, is an example like Bold and Beautiful... they don't actually show a season tab for it on the episodes page, just year and top rated. Even One Piece has a tab for seasons. If I wanted to get all 9000 episodes from that show for some silly reason, would they return properly as all belonging to season 1 given imdb's setup?

duck7000 commented 11 months ago

Mm bold and the beautiful indeed doesn't show seasons, one piece does. I'll try to explain every scenario below

With parameter $yearbased = 1

Everything will be returned by year (including bold and beautiful, one piece and every other tv series)

With parameter $yearbased = 0 (default, so season based)

without the check: One piece will return 1 season, 1000 episodes (this should work) Bold and Beautiful will not return anything as there is no seasons A "normal" season based tv series will return season based with the check: One piece will return year based and per year all episodes (even though parameter is season based) Bold and Beautiful will return year based and per year all episodes (even though parameter is season based) A "normal" season based tv series will return season based

So what is the best approach? i really don't know. But it might be better to leave the check in place, at least the data will be there

GeorgeFive commented 11 months ago

One piece will return year based and per year all episodes (even though parameter is season based) Bold and Beautiful will return year based and per year all episodes (even though parameter is season based) A "normal" season based tv series will return season based

I'm thinking this would probably be best. This conversation is making my head hurt now, haha.

Thomasdouscha commented 11 months ago

I have added $yearbased parameter now.

I still left in the check to deal with series like one piece or bold and the beautiful, or should i remove that check? In that case there is only user choice and if the user chooses season based (default) then one piece will return one season with 1000 episodes. same for b&b unless php execution time gets overridden first (didn't happen jet)

Thatswhy for one piece yearbased will be selected. And i have already changed it some months ago.

Thomasdouscha commented 11 months ago

Mm bold and the beautiful indeed doesn't show seasons, one piece does. I'll try to explain every scenario below

With parameter $yearbased = 1

Everything will be returned by year (including bold and beautiful, one piece and every other tv series)

With parameter $yearbased = 0 (default, so season based)

without the check: One piece will return 1 season, 1000 episodes (this should work) Bold and Beautiful will not return anything as there is no seasons A "normal" season based tv series will return season based with the check: One piece will return year based and per year all episodes (even though parameter is season based) Bold and Beautiful will return year based and per year all episodes (even though parameter is season based) A "normal" season based tv series will return season based

So what is the best approach? i really don't know. But it might be better to leave the check in place, at least the data will be there

Method is well enough. Seasonsbased default and if i use parameter yearsbased. perfect ! As i told you there is no exact formula for this situation. Thatswhy nobody can fşnd out solution just by code. We need an optional parameter because it is very rarely some situations and so sometimes it is up to us!

duck7000 commented 11 months ago

@GeorgeFive @Thomasdouscha Thanks for your input!

And yes my eyes and brain hurt too after writing all of this haha

It will remain like it is now, best of both worlds and the user has the possibility to make a choice

Thomasdouscha commented 11 months ago

I will test it in weekend

Thomasdouscha commented 11 months ago

For this: $arr = $imdb->episodes(); i had error:

message | "Call to undefined method Imdb\\Title::episodes()" -- | -- exception | "Symfony\\Component\\Debug\\Exception\\FatalThrowableError" file | "/mnt/diskssd/web/site_copy/app/Http/Controllers/Movie/MovieController.php" line | 7046 trace | [ {…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, … ] 0 | Object { file: "/mnt/diskssd/web/site_copy/app/Http/Controllers/Movie/MovieController.php", line: 6883, function: "UpdateImdb", … }
Thomasdouscha commented 11 months ago

in your code episode() imdbphp de episodes() i add that "s" now it is working well enough for seasonbased. I did not try yearbased yet

Thomasdouscha commented 11 months ago

For seasonbased i did not have any issues. It does work well.

duck7000 commented 11 months ago

well that was a very easy fix.. And yes i changed the names of many methods, so be aware

Thomasdouscha commented 11 months ago

yearbased also works but for one piece gives error:

message | "A non-numeric value encountered" -- | -- exception | "ErrorException" file | "/mnt/diskssd/web/site_copy/vendor/imdbphp/imdbphp/src/Imdb/Title.php" line | 2208 trace | [ {…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}, … ] 0 | Object { file: "/mnt/diskssd/web/site_copy/vendor/imdbphp/imdbphp/src/Imdb/Title.php", line: 2208, function: "handleError", … } // Unknown episodes get a number to keep them seperate. if ($epNumber == "unknown") { $epNumber = ucwords($epNumber) . '_' . $keyEp + 1; } here is this line= epNumber = ucwords($epNumber) . '_' . $keyEp + 1; there is an issue here
Thomasdouscha commented 11 months ago

For the simpsons works very well yearbased approx. 800 episodes. And works seasonbased too

Thomasdouscha commented 11 months ago

I have changed this : $epNumber = ucwords($epNumber) . '' . $keyEp + 1; to $epNumber=0; And One Piece also works yearbased.

Thomasdouscha commented 11 months ago

3 episodes of episode value are equal to 0. I have checked them and compared with myanimelist.net and as a result i can say that these tree episodes are wrong should be deleted. it is mistake of imdb.

and you should fixed this: $epNumber = ucwords($epNumber) . '' . $keyEp + 1; it must be integer. my suggestion for you if it is "unknown" ignore it.

The and of first test. your code seems that ok well done! bravo

Thomasdouscha commented 11 months ago

I have just one objection!

duck7000 commented 11 months ago

Well that is a problem in your code. Adjust your code to deal with non numeric values?

I give back episodes which can be a number(including 0), or unknown_number as there can be unknown episodes as well. You just say ignore all those? that doesn't seem right. (one piece is just one example, i have another one creepshow 2019 it has unkown season as well as 4 unknown episodes)

If it "must be integer" how do i deal with unknown episodes then? I can't give them all 0 (even if this is a mistake of imdb or not) The only thing i can think of is give unknown episodes a negative number (-1, -2, -3 etc)?

GeorgeFive commented 11 months ago

Two of those Creepshow episodes were actually specials on Shudder, and they were also added to the DVD releases (and they were really good!). Point is, we really don't want to just ignore stuff like that, as there is definitely value there.

Do the internal numbers have to be unique? Maybe we could just put episode number 0 for all of them. After all, they will still carry the imdb id with them to distinguish between them. Just an idea... I handle them my own way, so as long as I can get the data somehow, I'm happy.

Thomasdouscha commented 11 months ago

Well that is a problem in your code. Adjust your code to deal with non numeric values?

I give back episodes which can be a number(including 0), or unknown_number as there can be unknown episodes as well. You just say ignore all those? that doesn't seem right. (one piece is just one example, i have another one creepshow 2019 it has unkown season as well as 4 unknown episodes)

If it "must be integer" how do i deal with unknown episodes then? I can't give them all 0 (even if this is a mistake of imdb or not) The only thing i can think of is give unknown episodes a negative number (-1, -2, -3 etc)?

episode number cannot be unknown or null. It should have exact value. I think it is not necessary to explain why? I made it 0 because i want to see what is it. And as a result these are wrong data. I deleted them.