rlCodev / data-science-text-analytics

Text analytics research project for classifying music tracks into genres by analysing their lyrics.

Age Restriction Analysis of Movie Scripts

The goal of our project is to automate and extend the age rating of movies based on the script of the movie. The criteria for our evaluation will be primarily based on the official rating rules used by the Motion Picture Association.

1. General information

Team Members:

Utilized libraries

Contributions

See "Project log" section.

2. Project State

Planning State

One of our high-level milestones for November has not been achieved yet. Below is a detailed list of achieved and unachieved goals.

Achieved Goals

Currently, the project is behind schedule with respect to the initial milestone plan. We plan to make up for this during the lecture break from 22.12.22 to 07.01.23. One reason for the delay is a change in data sourcing: the operator of a platform for film scripts unexpectedly stopped responding.

High-level Architecture Description

High level application architecture

High level processing architecture

todo: add descriptions to preprocessing steps and knowledge from data analysis

Experiments

In a first experiment, we retrieved the official age ratings of movies through the TMDB API. For this we wrote a Python script and queried the ratings for a selection of movie titles. Results can be found here

Data analysis, exploration, and description. Results can be found in the next section.
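The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it assumes the TMDB v3 endpoints `/search/movie` and `/movie/{movie_id}/release_dates`, an API key supplied by the caller, and that the first search hit is the intended movie. The helper names `tmdb_get`, `extract_certification`, and `lookup_rating` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

TMDB_BASE = "https://api.themoviedb.org/3"


def tmdb_get(path, api_key, **params):
    """Call a TMDB v3 endpoint and return the decoded JSON body."""
    query = urllib.parse.urlencode({"api_key": api_key, **params})
    with urllib.request.urlopen(f"{TMDB_BASE}{path}?{query}", timeout=10) as resp:
        return json.load(resp)


def extract_certification(release_dates, country="US"):
    """Return the first non-empty certification for `country`, or None."""
    for entry in release_dates.get("results", []):
        if entry.get("iso_3166_1") != country:
            continue
        for release in entry.get("release_dates", []):
            cert = release.get("certification")
            if cert:  # TMDB often leaves this field as an empty string
                return cert
    return None


def lookup_rating(title, api_key, country="US"):
    """Search for `title` and return the certification of the first hit (assumption)."""
    hits = tmdb_get("/search/movie", api_key, query=title).get("results", [])
    if not hits:
        return None
    movie_id = hits[0]["id"]
    release_dates = tmdb_get(f"/movie/{movie_id}/release_dates", api_key)
    return extract_certification(release_dates, country)
```

Calling `lookup_rating("Some Title", api_key)` for each title in the selection would then yield one rating (or `None`) per movie. Keeping `extract_certification` separate from the network calls makes the parsing step testable offline against a saved response.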

3. Data Analysis

This section can be found in this Jupyter notebook:

Project log

Davit

data_exploration

Data exploration notebook

data_gathering

data_preprocessing

elasticsearch

fastapi

own_model

react_ageflix

severity_model