Closed by hyunjimoon 8 months ago
less imminent notes
'Nico' Arcuri, Dominic Our Final Slumber (2016)
Stuck in the Middle (2016) [Happy Man]
'O'Connell, Colette Hello Au Revoir (2018) [Letty O]
- `Finbar 'Finchie''Coveney ` becomes `Finbar 'Coveney` after processing
(-> treat them differently as it could cancel out more old?)
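A minimal sketch of one way to drop quoted nicknames without damaging apostrophes inside surnames. It assumes a nickname is a `'...'` group bounded by whitespace or the ends of the string; `strip_nickname` is a hypothetical helper, not the processing step actually used here.

```python
import re

# Assumption: a nickname is a quoted group with whitespace (or the string
# boundary) on both sides, e.g. "'Nico' Arcuri, Dominic".
_NICKNAME = re.compile(r"(?<!\S)'[^']+'(?!\S)")


def strip_nickname(name: str) -> str:
    """Remove standalone quoted nicknames and collapse leftover whitespace."""
    without_nickname = _NICKNAME.sub(" ", name)
    return " ".join(without_nickname.split())


nico = strip_nickname("'Nico' Arcuri, Dominic")
oconnell = strip_nickname("'O'Connell, Colette")
finbar = strip_nickname("Finbar 'Finchie' Coveney")
```

Because the closing quote must be followed by whitespace, `'O'Connell` is left untouched instead of being cut down to `Connell`. Malformed input such as two adjacent quotes would also be left as-is and would need separate handling.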
Compared with the old data, which has at least a hundred person-title rows for Suits (2011), the new data (movie-principals) includes only 5 rows (for the entire series, not just one season)
issues
slightly different names
y and i
abbreviation
Dominique A. and Dominique Abel
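A rough sketch of how the two name differences above (y vs. i, and abbreviated tokens) could be tolerated when comparing names. `names_compatible` is a hypothetical helper, and the y/i example names in the usage lines are made up for illustration; only the Dominique pair comes from the data.

```python
def _tokens(name: str) -> list[str]:
    # Lowercase, fold "y" into "i", and split on whitespace.
    return name.lower().replace("y", "i").split()


def _token_match(x: str, y: str) -> bool:
    if x == y:
        return True
    # An abbreviated token such as "a." matches any token it is a prefix of.
    if x.endswith(".") and len(x) > 1 and y.startswith(x[:-1]):
        return True
    if y.endswith(".") and len(y) > 1 and x.startswith(y[:-1]):
        return True
    return False


def names_compatible(a: str, b: str) -> bool:
    """True if the names agree token by token, up to y/i and abbreviations."""
    ta, tb = _tokens(a), _tokens(b)
    return len(ta) == len(tb) and all(_token_match(x, y) for x, y in zip(ta, tb))


abbreviation_ok = names_compatible("Dominique A.", "Dominique Abel")
y_i_ok = names_compatible("Sylvia Smith", "Silvia Smith")
different = names_compatible("Dominique A.", "Dominique Brown")
```

This is deliberately loose: an abbreviation matches any surname with the same initial, so it should only be applied within the same title_year to limit false matches.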
new is smaller than imdb online
full credit info that is available online (e.g. viva, 2001, TV series, 1 episode) is not included in new.tsv
documentary's category is not actor or actress, but `self`, e.g. `Scrooge .` from `Courier Culture` (order = 1, but far from a star). Should I include all categories?
array(['self', 'director', 'cinematographer', 'composer', 'producer', 'editor', 'actor', 'actress', 'writer', 'production_designer', 'archive_footage', 'archive_sound'])
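To help decide which categories to keep, a small pandas sketch that counts rows per category and keeps only on-screen ones. The rows, the `ordering` column name, and the choice of `ON_SCREEN` are assumptions for illustration; `title_year`, `primaryName` and `category` follow the names used in these notes.

```python
import pandas as pd

# Hypothetical rows; only the 'Scrooge .' / 'Courier Culture' pair is from the notes.
principals = pd.DataFrame(
    {
        "title_year": ["Courier Culture", "Title X", "Title X", "Title X"],
        "primaryName": ["Scrooge .", "Person A", "Person B", "Person C"],
        "category": ["self", "actor", "actress", "producer"],
        "ordering": [1, 1, 2, 3],
    }
)

# Assumption: 'self' is kept so that documentaries do not lose their whole cast.
ON_SCREEN = ["actor", "actress", "self"]

# How many rows each category contributes.
category_counts = principals["category"].value_counts()

# Keep only rows whose category is in the on-screen set.
on_screen = principals[principals["category"].isin(ON_SCREEN)]
```

Running `value_counts()` on the real new.tsv would show how much of the 20m rows each category accounts for before committing to a filter.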
statistics
old: 16m (15870224)
new: 20m (20517830)
oldnew_left_merge: 16m (15876865; can increase if right has duplicate [title_year, primaryName] rows: if the right table has two records that match one record in the left table, the merge returns two records)
oldnew_inner_merge: 2.5m
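The left merge has 15876865 - 15870224 = 6641 more rows than old, which is consistent with duplicate [title_year, primaryName] keys on the right side. A toy pandas sketch of that effect and one possible way to guard against it (the data is made up; dropping duplicates is only one option and discards the extra category rows):

```python
import pandas as pd

keys = ["title_year", "primaryName"]

old = pd.DataFrame(
    {"title_year": ["T1 (2000)", "T2 (2001)"], "primaryName": ["A", "B"]}
)
# The same person appears twice for one title, e.g. as actor and as producer.
new = pd.DataFrame(
    {
        "title_year": ["T1 (2000)", "T1 (2000)", "T3 (2002)"],
        "primaryName": ["A", "A", "C"],
        "category": ["actor", "producer", "self"],
    }
)

# Left merge: the duplicated right key turns one left row into two.
left = old.merge(new, how="left", on=keys, indicator=True)

# Count right-side rows that repeat an earlier key.
n_dup_keys = int(new.duplicated(subset=keys).sum())

# Deduplicate the right side; validate raises MergeError if keys still repeat.
new_unique = new.drop_duplicates(subset=keys)
left_unique = old.merge(new_unique, how="left", on=keys, validate="many_to_one")
inner = old.merge(new_unique, how="inner", on=keys)
```

Here `len(left)` is 3 although `old` has 2 rows, while `left_unique` keeps the left row count. The `_merge` column added by `indicator=True` marks which left rows found a match.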
Q. For new, isn't a 1:3 ratio of titles to title-person rows too small? (Can titles with ~10 cast members be rare enough, and titles with 1 cast member, e.g. documentaries, be common enough, to explain this?)