datadesk / star-wars-analysis

A Los Angeles Times analysis of the dialogue spoken in Star Wars episodes 1-8
https://www.latimes.com/projects/star-wars-movies-female-character-analysis/
GNU Lesser General Public License v3.0

star-wars-analysis

The Los Angeles Times counted the words spoken by male and female characters in the first eight episodes of the "Star Wars" film series ahead of the release of "Star Wars: The Rise of Skywalker." The analysis found that, although the latest series of films contains greater gender diversity, male characters still have the most dialogue. The data also allowed us to rank the series' most talkative characters. More details on the project are described in this Q&A.

Methodology

The Times analyzed the movie scripts of the eight entries (so far) in the “Star Wars” saga. The analysis included all English dialogue (Galactic Basic in the films’ universe) that appears in subtitles of the films streaming on Disney+ and Netflix, as well as any fictional languages translated into English.

To count Chewbacca’s words, The Times performed a rough translation of the Wookiee language of Shyriiwook, counting individual growls, moans and roars as one word each. Similarly, R2-D2 and BB-8's individual whistles, beeps, boops and blurts were counted as a single word each, with help from a computer program. Other speakers of alien and droid languages, including Teedospeak, Jawaese and Ewokese, could not be reliably translated without a protocol droid. Lines delivered in such languages were logged instead as one word to establish a minimum presence for the character.
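The counting rules in this section can be sketched in a few lines of Python. This is an illustrative sketch, not The Times' actual program: the function name, the `language` argument and the whitespace-based tokenization are all assumptions.

```python
# Languages the methodology says could not be reliably translated.
UNTRANSLATED = {"Teedospeak", "Jawaese", "Ewokese"}


def count_words(line_text, language="Basic"):
    """Return the word count for one line of dialogue (hypothetical helper)."""
    if language in UNTRANSLATED:
        # Untranslatable lines are logged as one word to give the
        # character a minimum presence.
        return 1
    # Assumption: English words, and growls or beeps transcribed as one
    # token each, are separated by whitespace.
    return len(line_text.split())


print(count_words("beep boop whistle", language="Droidspeak"))
print(count_words("(untranslated line)", language="Jawaese"))
```

Splitting on whitespace treats a hyphenated name as one word; a different tokenizer would produce slightly different totals.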

The analysis of dialogue by gender was limited to the first 15 characters listed in the credits of each movie. Humanoid characters that appeared in the credits but spoke no lines in the movie were included in the gender comparison. These characters include Gov. Tarkin and Queen of Naboo in “Revenge of the Sith” and Lando’s Assistant in “The Empire Strikes Back.” Characters that did not speak English or have English subtitles were not included. These characters include Chewbacca, R2-D2, Chief Jawa, Chief Ugnaught and Snow Creature. Characters represented by more than one actor in the film credits were combined for the purpose of this analysis.
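A minimal sketch of this filter is below, assuming the credits are available as an ordered list of character names. The function and argument names are hypothetical, and the sketch assumes the 15-character cutoff is applied before non-English speakers are dropped, which the methodology does not spell out.

```python
from collections import defaultdict


def words_by_gender(credits, word_counts, genders, excluded, top_n=15):
    """Sum words by gender for the first top_n credited characters.

    credits     -- character names in credit order
    word_counts -- dict of name -> words spoken (missing means no lines)
    genders     -- dict of name -> gender label
    excluded    -- names with no English dialogue or subtitles
    """
    totals = defaultdict(int)
    for name in credits[:top_n]:
        if name in excluded:
            continue
        # Credited characters with no lines still appear, with zero words.
        totals[genders[name]] += word_counts.get(name, 0)
    return dict(totals)
```

For one film, `excluded` would hold names such as Chewbacca or R2-D2, and the per-film results could then be summed across the eight movies.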

Words heard in characters’ dreams or flashbacks did not count, although dialogue delivered via recorded message or conveyed by Force ghost did.

Anakin Skywalker and Darth Vader’s dialogue was counted separately for the sake of documenting the conflicted character’s relationship with others. According to the series’ timeline and for the purposes of the Times analysis, Anakin became Darth Vader when Emperor Palpatine says, “Henceforth, you shall be known as Darth Vader.” The character became Anakin Skywalker again after he threw his master down a shaft on the second Death Star.
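One way to apply this rule in code is to walk the dialogue in story order and switch labels at the two turning points. The sketch below is an assumption about how that could work: the `ANAKIN/VADER` label, the `EVENT` pseudo-speaker and the `THROWS_EMPEROR` tag are hypothetical, and the second turning point is an on-screen action, so it would have to be marked by hand.

```python
RENAMING_LINE = "Henceforth, you shall be known as Darth Vader."
REDEMPTION_MARKER = "THROWS_EMPEROR"  # hypothetical hand-entered event tag


def split_anakin_vader(rows):
    """Relabel a shared ANAKIN/VADER speaker by story position.

    rows -- list of (speaker, text) tuples in story order
    """
    is_vader = False
    relabeled = []
    for speaker, text in rows:
        if speaker == "EVENT" and text == REDEMPTION_MARKER:
            # Event rows only flip the state; they are not dialogue.
            is_vader = False
            continue
        if speaker == "ANAKIN/VADER":
            speaker = "DARTH VADER" if is_vader else "ANAKIN SKYWALKER"
        relabeled.append((speaker, text))
        if RENAMING_LINE in text:
            # Every later line belongs to Darth Vader until the marker.
            is_vader = True
    return relabeled
```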

Characters in disguise or impersonating another character were counted as the actual speaker. By this rule, Darth Sidious and Palpatine were considered the same character in the prequels, as were Princess Leia Organa and Boushh, the bounty hunter she disguised herself as, in “Return of the Jedi.” In several scenes of “The Phantom Menace,” a handmaiden called Sabé acts as a decoy for Queen Padmé Amidala. Using on-screen visual cues and other sources, The Times counted lines delivered by Sabé separately from those said by Padmé.
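The disguise rule amounts to mapping each alias to the actual speaker before totals are added up. The sketch below shows one possible approach; the `ALIASES` table, the uppercase labels and the function names are illustrative assumptions, not the project's actual data format.

```python
from collections import Counter

# Hypothetical alias table: disguise or alternate identity -> actual speaker.
ALIASES = {
    "DARTH SIDIOUS": "PALPATINE",
    "BOUSHH": "LEIA ORGANA",
}


def canonical_speaker(name):
    """Map a credited or on-screen name to the character actually speaking."""
    key = name.upper()
    return ALIASES.get(key, key)


def merge_counts(counts):
    """Combine per-name word counts under each canonical speaker."""
    merged = Counter()
    for name, words in counts.items():
        merged[canonical_speaker(name)] += words
    return merged
```

Sabé has no entry in the alias table, so her lines stay separate from Padmé's, as the methodology describes.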