ebeshero / DHClass-Hub

a repository to help introduce and orient students to the GitHub collaboration environment, and to support DH classes.
GNU Affero General Public License v3.0

Solutions for Regex 3: Pygmalion #712

Closed ebeshero closed 4 years ago

ebeshero commented 5 years ago

After today's ruckus in class over speech-tagging in the Pygmalion homework, where we entertained some pretty complicated regex patterns, I went home and worked out a solution that I liked. I wanted a solution that does not rely on very long and complicated regex patterns, because those are usually brittle: they're hard to explain and hard to reproduce. My solution is pushed here to DHClass-Hub/Solutions/Regex_Ex3_Pygmalion.
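To illustrate the idea of several short passes instead of one long pattern, here is a minimal Python sketch. It is not the solution pushed to the repo. It assumes, purely for illustration, that each speech is a blank-line-separated block beginning with an all-caps speaker name and a period; the sample text, the `tag_speeches` helper, and the `<sp>` and `<speaker>` element names are hypothetical.

```python
import re

# Hypothetical sample: the real Pygmalion file may be laid out differently.
sample = "HIGGINS. What's that?\n\nPICKERING. Nothing at all."

def tag_speeches(text: str) -> str:
    """Tag speeches in two short regex passes instead of one long pattern."""
    # Pass 1: treat each run of blank lines as the boundary between speeches,
    # closing one <sp> and opening the next, then wrap the whole text.
    blocks = re.sub(r"\n{2,}", "</sp>\n<sp>", text.strip())
    wrapped = "<sp>" + blocks + "</sp>"
    # Pass 2: an all-caps name right after <sp>, ending in a period,
    # is assumed to be the speaker. Group 1 holds the name.
    return re.sub(r"<sp>([A-Z][A-Z ]+)\.", r"<sp><speaker>\1</speaker>", wrapped)

print(tag_speeches(sample))
```

Each pass is small enough to explain in one sentence, and anything the passes miss can be fixed by hand afterwards.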

I also wanted to show you how to work with capturing groups, so you can see some remixing and also some repeats in my replaces, where I capture the Roman numerals in the Acts and reuse them both as attribute values and inside new element tags. I hope this is useful for you to look at and get some ideas! There's a lot of explanation here, which I hope is easy to read. Feel free to start line-commenting on it. You'll need to add comments on my commit here: https://github.com/ebeshero/DHClass-Hub/commit/1e3d9596b047a4e06b0d63c7336996fa239bad5b
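As a rough sketch of that capturing-group idea (not a copy of the replaces in the commit), the Python snippet below captures a Roman numeral once and repeats it in two places in the replacement. The heading format `ACT I` on its own line, the `tag_acts` helper, and the `<head n="...">` markup are assumptions made for illustration.

```python
import re

# Hypothetical sample: act headings are assumed to sit alone on a line.
sample = "ACT I\nCovent Garden at 11.15 p.m.\nACT II\nNext day at 11 a.m."

# Group 1 captures the Roman numeral after "ACT".
act_heading = re.compile(r"^ACT ([IVX]+)$", flags=re.MULTILINE)

def tag_acts(text: str) -> str:
    """Reuse the captured numeral twice: in an attribute value and in element content."""
    return act_heading.sub(r'<head n="\1">ACT \1</head>', text)

tagged = tag_acts(sample)
print(tagged)
```

The `\1` in the replacement refers back to whatever the first pair of parentheses matched, so the same numeral can be repeated or moved around without retyping it.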

@lewisabia @amberpeddicord @Bennediction @smdunn921 @lmcneil7 @haggis78 @jwa32 @ajw120 @ads171 @mattnowakowski @dylanmore @bobbyfunks @ebeshero @KSD32 @alnopa9

ebeshero commented 5 years ago

@alnopa9 and @KSD32 may want to post their own solutions to the same place. With most regex solutions there are many good approaches, not just one! Any way that makes sense and tags most of what you need is really acceptable, because regex is usually just a starting point to hurry up and turn a big plain text file into XML. We almost always have to fix a few things afterwards when the dust settles.