EmmaSchwarz / computational-dostoevsky


Project Update #10

Open blueciren opened 4 years ago

blueciren commented 4 years ago

Auto-tagging speech with Regex is complicated for our text, because paragraphs of dialogue often include the narrator's descriptions, such as "Goliadkin said."
Therefore, we began by tagging Goliadkin's letters. Using Regex, we replaced the `<div>` tags that carry addressee values with `<letter>` tags, to avoid confusion with the `<div>` tags that mark the chapters. A single letter contains multiple pairs of angle quotes, which turned out to mark both letter components and our hero's thoughts, so each `<letter>` is given a number attribute; this lets us indicate that paragraphs that are not adjacent belong to the same letter. Two angle quotes in the downloaded text have no partner. When we checked the printed version of the book, the opening angle quote in line 705 turned out to be a typo in the online version, so we removed it. Line 4280 has a closing angle quote without an opening one, but we could not check it via Google Books, so I will check the book in the library tomorrow. Now that we have identified these, the auto-tagging process should be easier.
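
As a rough illustration of the `<div>`-to-`<letter>` replacement described above, here is a minimal Python sketch. The thread does not show our actual markup, so the `addressee="..."` attribute layout, the sample text, the function name `tag_letters`, and the `n` attribute are all hypothetical. Giving every `<div>` with the same addressee the same number is also only an assumption made for the example; it would be wrong if one addressee receives more than one letter.

```python
import re

# Hypothetical sample: the real attribute layout is not shown in this thread.
sample = (
    '<div type="chapter">\n'
    '<div addressee="Vakhrameev">«Dear sir,»</div>\n'
    '<p>Narration between the parts.</p>\n'
    '<div addressee="Vakhrameev">«Yours, Goliadkin»</div>\n'
    '</div>'
)

# Match only <div> elements that carry an addressee value.
# Assumes such a <div> never contains a nested <div>.
LETTER_DIV = re.compile(r'<div\s+addressee="([^"]*)">(.*?)</div>', re.DOTALL)


def tag_letters(text):
    """Replace addressee <div> elements with numbered <letter> elements."""
    numbers = {}  # addressee -> letter number (assumption: one letter each)

    def repl(match):
        addressee, body = match.group(1), match.group(2)
        n = numbers.setdefault(addressee, len(numbers) + 1)
        return f'<letter n="{n}" addressee="{addressee}">{body}</letter>'

    return LETTER_DIV.sub(repl, text)


result = tag_letters(sample)
print(result)
```

In this sketch the chapter `<div>` is left alone because it has no addressee value, and the two non-adjacent parts share `n="1"`, so they can later be read as one letter.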

djbpitt commented 4 years ago

@blueciren If you read your posting above, you'll see that your mention of removing the <div> tags doesn’t show the element name. That’s because when you use angle brackets in markdown, the system thinks it’s real markup, so instead of displaying it literally, which is what you want here, it tries to interpret it. To display code snippets in markdown, and especially those that contain literal angle brackets, surround them with backticks. You can read about this at https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet#code.

You can fix your posting by editing it in place. In the upper right corner of the panel that displays your posting, click on the three dots and then select Edit. If you add the backticks around the tag and then save your changes, you’ll see the result you want.

blueciren commented 4 years ago

Thank you. I forgot about the function of angle brackets in markdown. I've just corrected it!

blueciren commented 4 years ago

Both of the unpaired angle quotes turned out to be typos, probably introduced during digitization of the text.
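
For anyone who wants to repeat the check, unpaired angle quotes can be located with a few lines of Python. This is only a sketch, not the procedure we actually used: the function name `find_unpaired_quotes` and the sample text are made up for illustration. It treats every `«` as needing a later `»`, so it would also flag texts that deliberately open each paragraph of a long quotation without closing it, and with nested quotes the reported line is a hint rather than the exact location of the typo.

```python
def find_unpaired_quotes(text, open_q="«", close_q="»"):
    """Return sorted (line_number, character) pairs for quotes with no partner."""
    unpaired = []
    open_lines = []  # line numbers of opening quotes still waiting for a close

    for line_no, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ch == open_q:
                open_lines.append(line_no)
            elif ch == close_q:
                if open_lines:
                    open_lines.pop()  # pairs with the most recent opening quote
                else:
                    unpaired.append((line_no, ch))  # close with nothing open

    # Any opening quotes left over never found a closing partner.
    unpaired.extend((line_no, open_q) for line_no in open_lines)
    return sorted(unpaired)


# Made-up sample, not taken from the novel.
sample_text = "«paired»\nstray close»\n«stray open"
problems = find_unpaired_quotes(sample_text)
print(problems)
```

The line numbers count from 1, so they can be compared directly with the line numbers shown in an editor.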

mblynn commented 4 years ago

Hey guys! You seem to be doing really thorough work! You did a great job catching the inconsistency with the angle brackets and cross-referencing with a printed book.