Note: This is one of several issues related to basic information retrieval from syllabi. In all cases we assume that the extraction is from a .txt document.
Task: Given a syllabus in .txt format, identify and extract the books, articles, or other citations that are mentioned.
Example: Given the following:
"Required Texts
Goethe, Elective Affinities, trans. Constantine (Oxford)
Stendhal, The Red and the Black, trans. Gard (Penguin)
Dostoievsky, Crime and Punishment, trans. Pevear and Volkhonsky (Vintage)
Flaubert, Sentimental Education, trans. Baldick (Penguin)
Tolstoy, Anna Karenina, trans. Pevear and Volkhonsky (Penguin)
Mann, Buddenbrooks, trans. Woods (Vintage)"
Extract the following:
citation1 = { "authorFirstName": None, "authorLastName": "Goethe", "type": "Book", "title": "Elective Affinities", "publisher": "Oxford", "translator": "Constantine", "publicationDate": None }
etc.
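
A minimal sketch of one possible approach, assuming every citation sits on its own line in the shape "Author, Title, trans. Translator (Publisher)" as in the example above. The names `CITATION_RE`, `parse_citation`, and `extract_citations` are hypothetical, and the "type" field is simply hard-coded to "Book". Real syllabi vary far more than this (first names, articles, dates), so the pattern is only a starting point:

```python
import re

# Hypothetical pattern for lines shaped like
# "Author, Title, trans. Translator (Publisher)".
# The translator part is optional; the publisher must be in parentheses.
CITATION_RE = re.compile(
    r"^(?P<author>[^,]+),\s*"
    r"(?P<title>.+?)"
    r"(?:,\s*trans\.\s*(?P<translator>.+?))?"
    r"\s*\((?P<publisher>[^()]+)\)\s*$"
)


def parse_citation(line):
    """Return a citation dict for one line, or None if it does not match."""
    match = CITATION_RE.match(line.strip())
    if match is None:
        return None
    return {
        "authorFirstName": None,  # assumption: only a last name is given
        "authorLastName": match.group("author").strip(),
        "type": "Book",  # assumption: every matched line is a book
        "title": match.group("title").strip(),
        "publisher": match.group("publisher").strip(),
        "translator": match.group("translator"),  # None when absent
        "publicationDate": None,  # not present in this line format
    }


def extract_citations(text):
    """Collect citations from every line of the text that fits the pattern."""
    citations = []
    for line in text.splitlines():
        parsed = parse_citation(line)
        if parsed is not None:
            citations.append(parsed)
    return citations


sample = """Required Texts
Goethe, Elective Affinities, trans. Constantine (Oxford)
Stendhal, The Red and the Black, trans. Gard (Penguin)
Dostoievsky, Crime and Punishment, trans. Pevear and Volkhonsky (Vintage)
Flaubert, Sentimental Education, trans. Baldick (Penguin)
Tolstoy, Anna Karenina, trans. Pevear and Volkhonsky (Penguin)
Mann, Buddenbrooks, trans. Woods (Vintage)"""

citations = extract_citations(sample)
```

Lines that do not fit the pattern, such as the "Required Texts" heading, are skipped rather than guessed at. A line-based regular expression like this would not work on text where the line breaks have been lost, so that case would need a different strategy.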