sud03r / hackaton

A private repo for our hackathon

Missing imdb data #2

Closed: szepi1991 closed this issue 9 years ago

szepi1991 commented 9 years ago

For a number of movies the imdb data is missing. Investigate why, and/or add it.

sud03r commented 9 years ago

Can you give an example where the data is on IMDb but is not being shown? For the examples I tried, the data is not on IMDb.

szepi92 commented 9 years ago

Frailty -- no data on medley but there's an imdb page for it

http://www.imdb.com/title/tt0264616/

sud03r commented 9 years ago

OK, I fixed the Frailty bug. It happened because the JSON entry had nested quotes in it, like "a"b"c", so json_decode failed. The fix was to escape the inner quotes in the database (sadly there is no fix that works for all cases). We will need to fix all the data: look for all nested quotes and escape them. I will write a script for that later, because it's a bit non-trivial.
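A minimal sketch of what such a clean-up script might look like, written in Python for illustration (the thread does not show the actual script or the stored JSON layout). The function name `escape_inner_quotes` and the `Title` key are hypothetical. The heuristic assumes a quote closes a string only when the next non-whitespace character is `,`, `:`, `}`, `]`, or the end of input. Like the manual fix, it will not handle every case.

```python
import json


def escape_inner_quotes(raw: str) -> str:
    """Heuristically escape unescaped double quotes inside JSON string values."""
    out = []
    in_string = False
    i = 0
    n = len(raw)
    while i < n:
        ch = raw[i]
        if not in_string:
            if ch == '"':
                in_string = True
            out.append(ch)
        elif ch == "\\" and i + 1 < n:
            # Copy an existing escape sequence through unchanged.
            out.append(ch)
            out.append(raw[i + 1])
            i += 1
        elif ch == '"':
            # Peek at the next non-whitespace character to guess whether
            # this quote ends the string or is a stray inner quote.
            j = i + 1
            while j < n and raw[j].isspace():
                j += 1
            if j == n or raw[j] in ",:}]":
                in_string = False
                out.append(ch)
            else:
                out.append('\\"')
        else:
            out.append(ch)
        i += 1
    return "".join(out)


# Hypothetical malformed entry using the pattern from the comment above.
broken = '{"Title": "a"b"c"}'
fixed = escape_inner_quotes(broken)
parsed = json.loads(fixed)
```

The heuristic misfires when an inner quote happens to be followed by a comma or colon (for example `"he said "yes", then left"`), so any rows it changes are worth checking by hand.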

szepi1991 commented 9 years ago

Good work guys, it's a pleasure to see this discussion :)

@sud03r, the other thing is: if there are missing values, could you keep them as NULL in the database? (If you'd like an explanation why, let me know; I just don't wanna give it here for the sake of time, in case you already see that's the better solution.)
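The reasoning is not spelled out in this thread. One commonly cited benefit is that NULL lets SQL tell "missing" apart from a real value, so aggregates skip it and `IS NULL` finds the gaps directly. A small sketch, using Python's built-in `sqlite3` as a stand-in for the project's database; the `movies` table, the `imdb_rating` column, and the rows are all made up for illustration:

```python
import sqlite3

# In-memory database with a hypothetical schema; not the project's real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, imdb_rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Movie A", 7.0), ("Movie B", None), ("Movie C", 5.0)],  # None is stored as NULL
)

# Missing entries can be listed directly.
missing = [
    row[0]
    for row in conn.execute("SELECT title FROM movies WHERE imdb_rating IS NULL")
]

# AVG ignores NULL rows instead of treating them as 0 or an empty string.
avg_rating = conn.execute("SELECT AVG(imdb_rating) FROM movies").fetchone()[0]

print(missing, avg_rating)
```

If the missing rating were stored as `0` or `''` instead, the average would be skewed and the missing rows would be harder to separate from real data.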

sud03r commented 9 years ago

I have fixed it somewhat. The new database has fewer missing entries, around 205 (and not much can be done about them). The earlier number was 678.

For testing, repopulate the database from the SQL file: db/database-input-new.sql

sud03r commented 9 years ago

OK, I also fixed the remaining 63 movies for which we had data but it was malformed.