tboothman / imdbphp

PHP library for retrieving film and tv information from IMDb

Fixed method Spouse #275

Closed duck7000 closed 2 years ago

duck7000 commented 2 years ago

Fixed this method. It took me about 2 weeks to cover all the different kinds of data, so I hope it will work in all cases. Added a test for this method.

@tboothman can you test this method and the test that I made? It was very, very hard to get all the data and separate it into the correct array elements. I redid this method 3 times and I hope it is in a working enough state.

duck7000 commented 2 years ago

I must say that this method is also needlessly complicated. Personally I would have taken the whole spouse line as is and saved it to an array element (that would lose the person's IMDb id, if available).

duck7000 commented 2 years ago

@Thomasdouscha confirmed that this is working

tboothman commented 2 years ago

Not looked at the code yet, but the data it exposes is complicated. Reducing it down to a string does not fulfil the point of this library, which is extracting computer-readable data from IMDb.

Here's the underlying data for Miyazaki's spouse, which matches the fields that come out of this method, so I think it's correct. [image]

tboothman commented 2 years ago

Looks good 👍

duck7000 commented 2 years ago

Thanks for merging. It was a pain in the back to make it work, let's hope that IMDb doesn't change it. I also removed as much regexp as I could, which made the code much more readable, and it should be more robust to changes, I think.

duck7000 commented 2 years ago

Well, yes, I agree with your argument that reducing it to a string is not the point of this library, but extracting structured data can make it difficult to maintain.

tboothman commented 2 years ago

> Thanks for merging, it was a pain in the back to make it work, lets hope that imdb don't change it..

Hah, yes. Well that's why when I noticed my very loose test of 'it returns something' started failing I opted to just ignore this method. Crazy complicated code

duck7000 commented 2 years ago

Basically the code is complicated because IMDb does not separate the data in HTML tags like span, p or even li. They just dump it in a td. If I could find where that data is coming from, it would make my life a lot easier, haha. Splitting the data was a massive undertaking, but I learned a whole lot, so that is worth something.

tboothman commented 2 years ago

Well ... IMDb does have a GraphQL API that has a ton of data in it: https://api.graphql.imdb.com/ I was looking into it the last time recommendations broke. You can get all of this spouse information from it (I think; it looks close).

query Person($id: ID!) {
  name(id: $id) {
    nameText {
      text
    }
    spouses {
      current
      attributes {
        text
        language {
          id
          text
        }
      }
      timeRange {
        fromDate {
          dateComponents {
            year
            month
            day
          }
        }
        toDate {
          dateComponents {
            year
            month
            day
          }
        }
      }
      spouse {
        name {
          id
          nameText {
            text
          }
        }
      }
    }
  }
}
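
For anyone who wants to try a query like the one above, here is a rough Python sketch of how it could be packaged as a standard GraphQL-over-HTTP POST. It only builds the request and does not send it. The endpoint URL is the one mentioned in this comment; whether it needs authentication or extra headers is not stated in this thread, so that part is an assumption. `build_request` is a hypothetical helper name, the query is trimmed for brevity, and the id is a placeholder.

```python
import json
import urllib.request

# Endpoint taken from the comment above. Whether it requires authentication
# or extra headers is an assumption left open here.
ENDPOINT = "https://api.graphql.imdb.com/"

# Trimmed version of the query above (attributes and dates omitted for brevity).
QUERY = """
query Person($id: ID!) {
  name(id: $id) {
    nameText { text }
    spouses {
      current
      spouse { name { id nameText { text } } }
    }
  }
}
"""


def build_request(person_id: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the query."""
    payload = {"query": QUERY, "variables": {"id": person_id}}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "nm0000000" is a placeholder id, not a real person.
request = build_request("nm0000000")
# To actually send it: urllib.request.urlopen(request), then json.load() the body.
```

Passing the id through `variables` rather than splicing it into the query string keeps the query text constant and avoids quoting problems.
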
{
  "data": {
    "name": {
      "nameText": {
        "text": "Robin Williams"
      },
      "spouses": [
        {
          "current": false,
          "attributes": [
            {
              "text": "his death",
              "language": {
                "id": "en-US",
                "text": "English (United States)"
              }
            }
          ],
          "timeRange": {
            "fromDate": {
              "dateComponents": {
                "year": 2011,
                "month": 10,
                "day": 22
              }
            },
            "toDate": {
              "dateComponents": {
                "year": 2014,
                "month": 8,
                "day": 11
              }
            }
          },
          "spouse": {
            "name": {
              "id": "nm6699367",
              "nameText": {
                "text": "Susan Schneider"
              }
            }
          }
        },
        {
          "current": false,
          "attributes": [
            {
              "text": "divorced",
              "language": {
                "id": "en-US",
                "text": "English (United States)"
              }
            },
            {
              "text": "2 children",
              "language": {
                "id": "en-US",
                "text": "English (United States)"
              }
            }
          ],
          "timeRange": {
            "fromDate": {
              "dateComponents": {
                "year": 1989,
                "month": 4,
                "day": 30
              }
            },
            "toDate": {
              "dateComponents": {
                "year": 2010,
                "month": null,
                "day": null
              }
            }
          },
          "spouse": {
            "name": {
              "id": "nm0931265",
              "nameText": {
                "text": "Marsha Garces Williams"
              }
            }
          }
        },
        {
          "current": false,
          "attributes": [
            {
              "text": "divorced",
              "language": {
                "id": "en-US",
                "text": "English (United States)"
              }
            },
            {
              "text": "1 child",
              "language": {
                "id": "en-US",
                "text": "English (United States)"
              }
            }
          ],
          "timeRange": {
            "fromDate": {
              "dateComponents": {
                "year": 1978,
                "month": 6,
                "day": 4
              }
            },
            "toDate": {
              "dateComponents": {
                "year": 1988,
                "month": 12,
                "day": 6
              }
            }
          },
          "spouse": {
            "name": {
              "id": "nm0892239",
              "nameText": {
                "text": "Valerie Velardi"
              }
            }
          }
        }
      ]
    }
  }
}
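
A response shaped like the one above could be flattened into one record per spouse along these lines. This is only a sketch in Python: the output keys (`imdb`, `name`, `from`, `to`, `current`, `comments`) and the helper names are made up for illustration and are not the array layout that imdbphp's spouse method returns. The sample is a trimmed copy of the second entry from the response, chosen because its end date has a null month and day.

```python
from typing import Optional


def format_date(components: Optional[dict]) -> Optional[str]:
    """Turn {"year", "month", "day"} into YYYY[-MM[-DD]], stopping at the first null part."""
    if not components or components.get("year") is None:
        return None
    parts = [f"{components['year']:04d}"]
    if components.get("month") is not None:
        parts.append(f"{components['month']:02d}")
        if components.get("day") is not None:
            parts.append(f"{components['day']:02d}")
    return "-".join(parts)


def flatten_spouses(response: dict) -> list:
    """Reduce the nested GraphQL response to one flat dict per spouse."""
    records = []
    for entry in response["data"]["name"]["spouses"] or []:
        time_range = entry.get("timeRange") or {}
        from_date = (time_range.get("fromDate") or {}).get("dateComponents")
        to_date = (time_range.get("toDate") or {}).get("dateComponents")
        person = (entry.get("spouse") or {}).get("name") or {}
        records.append({
            "imdb": person.get("id"),
            "name": (person.get("nameText") or {}).get("text"),
            "from": format_date(from_date),
            "to": format_date(to_date),
            "current": entry.get("current"),
            "comments": [a["text"] for a in entry.get("attributes") or []],
        })
    return records


# Trimmed copy of the second spouse entry from the response above.
sample = {
    "data": {"name": {"spouses": [{
        "current": False,
        "attributes": [{"text": "divorced"}, {"text": "2 children"}],
        "timeRange": {
            "fromDate": {"dateComponents": {"year": 1989, "month": 4, "day": 30}},
            "toDate": {"dateComponents": {"year": 2010, "month": None, "day": None}},
        },
        "spouse": {"name": {"id": "nm0931265",
                            "nameText": {"text": "Marsha Garces Williams"}}},
    }]}}
}

print(flatten_spouses(sample))
```

Because every field arrives already separated, nothing here needs regular expressions or string splitting; the only special case is a partially known date.
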
duck7000 commented 2 years ago

Looks interesting, but I don't have an account so I can't see/use it. Is this API freely available?