uchicago-computation-workshop / ma_proposal_workshop_a1


Extension: Sawhney, Eliashberg (1996) #28

Open ruixili opened 5 years ago


Extension: Sawhney, Eliashberg (1996)

Ruixi Li

Review of Sawhney, Eliashberg (1996)

The paper is “A Parsimonious Model for Forecasting Gross Box-Office Revenues of Motion Pictures”, written by Mohanbir S. Sawhney and Jehoshua Eliashberg and published in 1996. It proposes a groundbreaking model for predicting the gross box-office revenues of motion pictures. Building on a queuing-theory framework, the paper treats the consumer's movie adoption process as a stochastic process with two stages, “time to decide” and “time to act” (p1; Sawhney, Eliashberg; 1996). The authors characterize “time to decide” and “time to act” through the parameters γ and λ, which, based on earlier literature, jointly follow a Gamma distribution. They further introduce the distribution of the cumulative number of adopters by time t, N(t), which follows a binomial distribution given a constant N, the size of the population of interest. In the empirical tests, the authors use data collected from Variety, a leading trade publication in the entertainment industry. To test the dynamic model, they use weekly box-office performance for a sample of movies that includes both successful and unsuccessful titles. Through a meta-analysis that includes movie attributes such as genre, MPAA rating, major stars, sequel status, critic reviews, and sexual content, the authors estimate N, γ, and λ. Finally, they fit the prediction model to the data with the estimated N, γ, and λ; it performs well in both the static and the dynamic version.
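To make the two-stage idea concrete, here is a minimal sketch, assuming each stage time is exponentially distributed with rates `lam` and `gam`. That distributional choice, the function names, and the parameter values are illustrative assumptions, not details taken from the review above. Under that assumption the adoption time is the sum of the two stage times, and the cumulative number of adopters N(t) is binomial with N trials and success probability F(t).

```python
import math
import random


def adoption_cdf(t, lam, gam):
    """P(T <= t) for T = (stage 1 time) + (stage 2 time).

    Assumption for illustration: the stage times are independent
    exponentials with rates lam and gam.
    """
    if t <= 0:
        return 0.0
    if math.isclose(lam, gam):
        # Equal rates: the sum of two exponentials is Erlang with shape 2.
        return 1.0 - math.exp(-lam * t) * (1.0 + lam * t)
    return 1.0 - (gam * math.exp(-lam * t) - lam * math.exp(-gam * t)) / (gam - lam)


def expected_cumulative_adopters(t, n_pop, lam, gam):
    """E[N(t)] when N(t) ~ Binomial(n_pop, F(t))."""
    return n_pop * adoption_cdf(t, lam, gam)


def simulate_cumulative_adopters(t, n_pop, lam, gam, rng):
    """Draw one realization of N(t) by simulating every potential adopter."""
    count = 0
    for _ in range(n_pop):
        adoption_time = rng.expovariate(lam) + rng.expovariate(gam)
        if adoption_time <= t:
            count += 1
    return count


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    for week in (1, 2, 4, 8):
        simulated = simulate_cumulative_adopters(week, 20000, 0.5, 1.5, rng)
        expected = expected_cumulative_adopters(week, 20000, 0.5, 1.5)
        print(week, simulated, round(expected))
```

The simulated counts should track the binomial mean closely for a population this large, which is why a small set of parameters can describe a whole weekly revenue curve.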

My proposal

The model the authors use to simulate movie adoption stochastically is theoretically grounded and works well. However, the estimation of the three parameters can be improved in two places. First, the set of independent variables is insufficient. Potentially important variables such as the production budget are omitted from the model, which may make the three parameter estimates inaccurate. Quader et al (2017) argue that budget, IMDb votes and number of screens are the most important features for predicting a movie's box-office success. Some studies even consider the impact of theaters on the box office: Wu et al (2018) suggest that the provision of special halls, parking convenience, and the number of competitors nearby are more significant factors for a cinema's box office than pricing. Second, when estimating N, γ, and λ, the authors use “a standard nonlinear least squares procedure” (p10; Sawhney, Eliashberg; 1996). This step could be made more accurate by applying machine learning techniques such as random forests.

Therefore, I would like to extend the research by constructing a prediction model for China’s movie box office. In the moviegoer simulation, I will keep the stochastic modelling. In the parameter determination, I will estimate N, γ, and λ using several machine learning techniques. To obtain higher accuracy, I will train the model using sample splitting.

In terms of data, my research draws on three sources: weekly movie box-office data from Maoyan [1] and Entgroup [2], movie attributes from IMDb [3] and Douban [4], and movie reviews from Douban. For the first source, Chinese box-office records are more mature and complete than those of the U.S., owing to an oligopoly in which Maoyan holds the largest share, so the weekly box-office data can be accessed through the Maoyan API. For the second source, I will use the attributes listed in IMDb for imported movies and foreign actors, for language consistency and feature recognition, and Douban, the largest movie review platform in China, for domestic movies. Lastly, the movie reviews will be collected with a web crawler on Douban’s review board.
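The baseline that the machine learning estimators would be compared against, a least squares fit of N and the two rates to a weekly cumulative series, might look roughly like the sketch below. It assumes exponentially distributed stage times (an illustrative assumption), replaces a proper nonlinear optimizer with a coarse grid search, and uses a synthetic noise-free series rather than real Maoyan data. All function and variable names are hypothetical.

```python
import math
from itertools import product


def adoption_cdf(t, lam, gam):
    """P(adopted by t), assuming two independent exponential stages."""
    if t <= 0:
        return 0.0
    if math.isclose(lam, gam):
        return 1.0 - math.exp(-lam * t) * (1.0 + lam * t)
    return 1.0 - (gam * math.exp(-lam * t) - lam * math.exp(-gam * t)) / (gam - lam)


def fit_least_squares(weeks, cumulative, rate_grid):
    """Return (sse, n_hat, lam_hat, gam_hat) minimizing squared error.

    Under the exponential assumption the curve is symmetric in the two
    rates, so only pairs with lam <= gam are visited. For a fixed pair
    of rates the best N has a closed form: N = sum(f*y) / sum(f*f).
    """
    best = None
    for lam, gam in product(rate_grid, rate_grid):
        if lam > gam:
            continue
        f = [adoption_cdf(t, lam, gam) for t in weeks]
        denom = sum(v * v for v in f)
        if denom == 0.0:
            continue
        n_hat = sum(v * y for v, y in zip(f, cumulative)) / denom
        sse = sum((y - n_hat * v) ** 2 for v, y in zip(f, cumulative))
        if best is None or sse < best[0]:
            best = (sse, n_hat, lam, gam)
    return best


# Synthetic check: generate a noise-free curve from known parameters
# and confirm that the grid search recovers them.
weeks = list(range(1, 11))
true_n, true_lam, true_gam = 1000.0, 0.4, 1.2
series = [true_n * adoption_cdf(t, true_lam, true_gam) for t in weeks]
rate_grid = [round(0.1 * k, 1) for k in range(1, 21)]
sse, n_hat, lam_hat, gam_hat = fit_least_squares(weeks, series, rate_grid)
print(n_hat, lam_hat, gam_hat)
```

In the proposed extension, per-movie estimates of this kind would serve as targets, and a learner such as a random forest would be trained on one part of the movie sample and evaluated on the held-out part.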

[1] Tianjin Maoyan Culture Media is a Chinese company that owns the largest online movie ticketing website in China, maoyan.com, with 30% share of the market in 2015.

[2] Entgroup is the leader of big data consultation for China’s Media & Entertainment industry, motivating significant industrial breakthroughs through market intelligence generation, data collection and analysis.

[3] IMDb (Internet Movie Database) is an online database of information related to films, television programs, home videos, video games, and internet streams, including cast, production crew and personnel biographies, plot summaries, trivia, and fan reviews and ratings.

[4] Douban.com, launched on March 6, 2005, is a Chinese social networking service website that allows registered users to record information and create content related to film, books, music, recent events, and activities in Chinese cities.

References:


Y. Wu, W. Huang, Y. Lu and J. Liu, "Box office forecasting for a cinema with movie and cinema attributes," 2018 IEEE 3rd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), Chengdu, 2018, pp. 385-389.

N. Quader, M. O. Gani, D. Chaki and M. H. Ali, "A machine learning approach to predict movie box-office success," 2017 20th International Conference of Computer and Information Technology (ICCIT), Dhaka, 2017, pp. 1-7.