timjzee / frankenstein-v2


frankenstein-v2

Acknowledgements and License information

About

The main objective of this project is to provide an accessible gold standard text for the authorship attribution of the 19th-century novel Frankenstein. The text is constructed from the hand annotations of the draft in the Shelley-Godwin Archive (SGA). The TEI .xml files used in the SGA are multifunctional and are therefore not optimized for any single purpose. The files provided in this repository have been constructed with authorship attribution in mind. Some features:

As a secondary objective, this project presents an initial comparison between the gold standard hand annotation and a stylometric analysis of Frankenstein based on other work by Mary and Percy Shelley. The analysis will consist of the following:

The tertiary objective of this project relies on the outcome of the rolling classification of Frankenstein. Initial tests show that a rolling classification with a sample size of 1000 words and an overlap of 900 words identifies an authorial shift to Percy towards the end of the novel. This is in line with the hand attribution by Charles Robinson. Interestingly, the rolling classification does not identify this change at a sample size of 5000 words and an overlap of 4500 words. This suggests that a larger sample size may not always be better in the authorship attribution of collaborative texts, because resolution decreases as the sample size grows. In other words, smaller sample sizes may be used to increase resolution at the cost of accuracy.
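The relation between sample size, overlap and resolution can be illustrated with a minimal sketch of the windowing step only. This is not the code used for the tests reported above: the function name rolling_windows and the placeholder token list are assumptions made for illustration, and the classifier itself is left out. Each window advances by the sample size minus the overlap, so the 1000/900 setting moves in steps of 100 words and the 5000/4500 setting in steps of 500 words.

```python
from typing import Iterator, List, Tuple


def rolling_windows(
    tokens: List[str], sample_size: int, overlap: int
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (start_index, window) pairs for overlapping samples.

    Hypothetical helper: consecutive windows share `overlap` tokens,
    so each window starts `sample_size - overlap` tokens after the last.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= overlap < sample_size:
        raise ValueError("overlap must satisfy 0 <= overlap < sample_size")
    step = sample_size - overlap
    # Only full-length windows are produced; a short tail is dropped.
    for start in range(0, len(tokens) - sample_size + 1, step):
        yield start, tokens[start:start + sample_size]


# Placeholder text of 10000 tokens (an assumption, not the novel itself).
tokens = ["word"] * 10000

small = list(rolling_windows(tokens, sample_size=1000, overlap=900))
large = list(rolling_windows(tokens, sample_size=5000, overlap=4500))

# Number of windows and the distance between consecutive window starts.
print(len(small), small[1][0] - small[0][0])
print(len(large), large[1][0] - large[0][0])
```

In this sketch a short passage by a second author fills most of a 1000-word window but only a small part of a 5000-word window, which is one way to picture why the larger setting may smooth over a brief authorial shift.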

Notes on composition

The presented text has been composed to resemble the 1818 edition of the novel while preserving insight into the contribution of Percy Shelley. As such, the text is taken from the 1816-1817 Draft up until the last few pages of Chapter 18. From that point onwards the text has been taken from the Fair Copy so that Percy's contributions to those final pages are reflected in the final text. As Robinson (2008, p. 29) notes:

As we move from the extant 1816-1817 Draft to the first edition of 1818, we note the following differences: minor changes that Mary Shelley made to the Draft when she fair-copied it; some substantial changes that Percy Shelley made to the Draft when he wrote out the last twelve-and-three-quarter pages of the Fair Copy;

Furthermore, as Robinson notes (2008, p. 41), the following sections are missing from the 1816-1817 draft:

from Volume I, the four introductory letters from Walton to his sister Margaret and the first part of Chapter 1; and from Volume II almost half of Chapter 3 and all of Chapter 4.

I have chosen not to reproduce these sections from the 1818 edition, as we do not know who wrote them.

To do (crucial items in bold):