ebeshero / DHClass-Hub

a repository to help introduce and orient students to the GitHub collaboration environment, and to support DH classes.
GNU Affero General Public License v3.0

regex exercise help: Starting Regex Exercises #355

Closed Jamielynn92 closed 7 years ago

Jamielynn92 commented 7 years ago

So, the explanation you gave in class on how to do this kind of went over my head. As I'm reading about how to do this, I'm scratching my head on how to begin. I tried to set up oXygen to start it, but I couldn't find the 'whitespace' setting. Or will it only work with the file open in oXygen?

ebeshero commented 7 years ago

@Jamielynn92 I'll write up a comment to help get you all started after my 2pm class--I'll be out around 3. Stay tuned!

Jamielynn92 commented 7 years ago

Okay. Thank you. :)

ebeshero commented 7 years ago

@Jamielynn92 @quantum-satire @flowerbee1234 @tal80 @kes213 @jonhoranic @ajnewton1 @ghbondar @gabikeane @zme1 @amk231 @pab124 @jub45 @mof11 @ttb11 @BMR59 @Blangzo

Greetings, everyone! In Greensburg today we had a small series of GitHub crises that cut into our time to introduce regular expressions. As we promised in class, we've added some detailed guidance to help you get started on the first assignment. The assignment page is now updated, and we'd like you to read it carefully and pay special attention to the Step File we're asking you to create as you're working.

When you're working on this regex series of homeworks, we ask you to submit two files:

  1. the more important of these two is the Step File, in which you document each step you take in writing find and replace patterns. We instructors will be duplicating your steps as we review your homework and we'll provide feedback on that file to help fine-tune your work.
  2. the other file, of course, is the Upconverted File: the result of your work in up-converting your plain text to XML. We recommend that you save your Upconverted File as XML (with the .xml extension) at some stage in the process, perhaps even an early one--like, as soon as you've wrapped the document in its root element, then close it and re-open it to see whether it is parsing as XML and where its breaking points are. Then you continue to use find & replace operations in oXygen to improve the code and create the structure you're seeking for the assignment.

You'll upload both files to Courseweb for these exercises. You might save your Step File as downey_StepsRegex1.txt and the new XML as downey_sonnetsRegex1.xml (or something like that, following our usual homework file naming conventions).
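To make the "is it parsing as XML, and where are its breaking points?" check in item 2 concrete, here is a minimal sketch in Python rather than oXygen. The sample text, the `root` element name, and the `is_well_formed` helper are all made up for illustration; they are not part of the assignment. It only shows the general idea: wrapping text in a root element is not enough if a reserved character such as `&` is still sitting in the text.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        # The parser's message reports where it gave up.
        return False, str(err)


# Hypothetical sample text, NOT the real sonnets file.
plain_text = "Rough winds & darling buds\nSummer's lease hath all too short a date"

# Wrap the whole document in a (hypothetical) root element.
wrapped = "<root>\n" + plain_text + "\n</root>"
ok_before, problem = is_well_formed(wrapped)

# A raw ampersand is a breaking point; replace it with the entity &amp;.
fixed = "<root>\n" + plain_text.replace("&", "&amp;") + "\n</root>"
ok_after, _ = is_well_formed(fixed)

print("before:", ok_before, problem)
print("after: ", ok_after)
```

In oXygen you get the same kind of feedback by saving the file with the .xml extension and re-opening it; the sketch just mirrors that check outside the editor.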

Jamielynn92 commented 7 years ago

Since I have a Windows computer, should I use Notepad instead of oXygen for the text, or is it okay to just use oXygen for it?

gabikeane commented 7 years ago

I use Notepad++ as my default editor, but oXygen is fine for plain text. Notepad is also okay, but you may find the interface less than desirable.

ebeshero commented 7 years ago

@Jamielynn92 You can use either one! I suggested Notepad because when I'm on Windows writing up regex steps like you're doing, and I keep the steps in a second window in oXygen, I sometimes accidentally run my regex patterns over my Steps file--which is super annoying! :-0 If you can find a good way to work with both files in oXygen so you don't accidentally run your Find & Replace over the wrong file, that's fine with us! Just, whatever you do, don't write this up in Microsoft Word! Word's autocorrect will eat your regex code alive! That's why you want a simple plain text file for recording those steps.

I wish there were something as good as Notepad available for free on Macs! @ajnewton1 may know a good solution for the Mac people that I didn't list. Because I've been doing a lot of regex work over old text documents on my own, I bought myself BBEdit last year for my Mac and I love it--but it does cost something, about $50.

ebeshero commented 7 years ago

@gabikeane I love Notepad++, too--I remember it had a lot of functionality that was useful for up-conversion from ASCII to Unicode formats, for example. (This may come up in semester projects, but it's not an issue for our little series of regex assignments.)

One of the things I found really mystifying on switching from Windows to a Mac the summer before last was that the available plain text editors were completely different. I started working with TextWrangler on a Mac and discovered it could do things Notepad++ couldn't do. (One of those things was a built-in algorithm for locating pairs of quotation marks and changing them from straight to curly and vice versa.) TextWrangler used to be free for Macs--and you can still get it from here, but as that page says, it's not compatible with the latest Mac systems and all of its wonderful functionality has been absorbed into BBEdit. I think BBEdit is really the best thing around for Mac people now, but I'm sorry it costs. Having started out as a Windows user, I'm accustomed to my plain text editors being free, thank you very much. Sigh.

Jamielynn92 commented 7 years ago

That's okay. I don't have Word on my computer; I use Google Docs. But I put the plain text in both Notepad and oXygen. I'm still a bit confused, though. Would I be able to see you for a one-on-one tomorrow after 2:15, just to get a better understanding? It's a little different than the last couple of homeworks.

ebeshero commented 7 years ago

@Jamielynn92 Sure--I'll be in my office (FOB 204) around 2:15 tomorrow, and @ajnewton1 comes in to FOB 131 from 4-5 pm tomorrow (for any of you Greensburg folk seeking help!)

Jamielynn92 commented 7 years ago

Okay, thank you. I just need it explained a little more. Reading it doesn't help me very much; one-on-ones in person are always more helpful.


ebeshero commented 7 years ago

@Jamielynn92 I remember you were asking how to see the white spaces, and I think you might not have caught how to switch to that view in oXygen (and other people may have missed this too). Here's what you need to do:

In oXygen, look at the top menu bar: go to Options, and then choose Preferences. That will open up the giant Preferences window with a long list of things on the left. On that list, click on Editor, and you'll see this:

[Screenshot: oXygen Preferences window, Editor page]

Look at the Whitespaces options and mark all of them, like you see in the picture, and click "OK" down at the bottom of that screen.

Then go back to the sonnets file in oXygen and you'll see all its white spaces marked.
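If it helps to picture why seeing the white spaces matters for regex work, here is a small Python sketch that marks them the way a "show whitespace" view does. The particular symbols and the `show_whitespace` function are my own choices for illustration, not what oXygen actually draws.

```python
def show_whitespace(text):
    """Replace whitespace characters with visible marks (symbols chosen arbitrarily)."""
    marks = {
        " ": "·",      # space
        "\t": "→",     # tab
        "\n": "¶\n",   # newline: show a mark, then keep the line break
    }
    return "".join(marks.get(ch, ch) for ch in text)


# Hypothetical sample with a tab, trailing spaces, and a blank line.
sample = "I\tFrom fairest creatures  \n\nThat thereby beauty's rose\n"
print(show_whitespace(sample))
```

Trailing spaces, tabs, and blank lines are invisible in a normal view, but they are exactly the things your find and replace patterns will (or won't) match.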

ebeshero commented 7 years ago

@Jamielynn92 And you were asking how to begin: It'll help to take a close look at the sample XML solution posted in the assignment (you may need to wait a minute for that XML to load in your web browser). That's what you want to be building out of the plain text sonnets file, and you'll be using the Find & Replace window in oXygen to help you.

What's all this up-conversion about, then? I wrote our tutorial in my own voice (so it's a lot like reading me here, I promise): http://dh.newtfire.org/explainRegex.html Read it to get a sense of what you're doing.

Go back to the file and try starting from the "inside out" like we recommended...and yes, do read this assignment slowly as you go--it's really helpful--every stage of the process is explained. Read it one step at a time and try out the suggestions in oXygen. And go look up patterns as recommended on the links there, and on our newtfire Regex tutorial where we've given you a nice list of regex patterns to experiment with.
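As a rough picture of what working from the "inside out" can look like, here is a sketch in Python's `re` module instead of oXygen's Find & Replace window. Everything in it is an assumption for illustration: the sample text, the element names (`line`, `poem`, `root`), and the exact patterns are not the assignment's solution, and replacement syntax such as `\1` can differ between tools. The idea is simply to tag the smallest units first and wrap outward.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a plain-text file: two short poems
# separated by a blank line. NOT the real sonnets file.
text = "First line\nSecond line\n\nThird line\nFourth line\n"

# Step 1 (innermost): wrap every non-empty line in a <line> element.
#   Find: ^(.+)$     Replace: <line>\1</line>
step1 = re.sub(r"^(.+)$", r"<line>\1</line>", text, flags=re.MULTILINE)

# Step 2: turn each run of blank lines between poems into a close tag
# followed by an open tag.
#   Find: \n{2,}     Replace: \n</poem>\n<poem>\n
step2 = re.sub(r"\n{2,}", "\n</poem>\n<poem>\n", step1.strip())

# Step 3 (outermost): add the first <poem>, the last </poem>, and the root.
step3 = "<root>\n<poem>\n" + step2 + "\n</poem>\n</root>"

# Check that the result parses as XML.
root = ET.fromstring(step3)
print(step3)
```

Each numbered comment is the kind of entry that would go in a Step File: the pattern you searched for and what you replaced it with, in the order you ran them.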

ebeshero commented 7 years ago

There's a section at the end of our newtfire regex tutorial that you don't really need to worry about yet--it's looking ahead at how we'll be using regular expressions later in XPath and XSLT--because this is a multi-purpose tool we'll keep on using throughout the course.

Our tutorials are basically designed for two purposes: 1) to orient you when you're learning something new, and 2) to serve as a continuous resource of information that we try to keep up to date--since you're likely to want to come back and Look Stuff Up on them later.

dotfig commented 7 years ago

I like using TextWrangler on a Mac, but I am not sure what its functionality is like on Windows. Notepad and Notepad++ both do the trick and are pretty quick to boot up when working with plain .txt documents.

Also, my office hours are 4-5pm Tues/Thur and @jonhoranic's are 2-4:30pm Mon/Wed, and we are lonely. Jon usually comes in with me and doesn't stop talking until our next class (not meant as an insult, but it's very true).

I highly recommend reading the tutorials, which are extremely helpful, and having them open when you do the assignments. It's basically like having a cheat sheet right in front of you.

Jamielynn92 commented 7 years ago

Maybe Tues/Thursday (days off); I have class til 2:15. But Jon's hours are during my two history classes, and I go home after 'cause I usually work 5:30-10:30 MWF.

kes213 commented 7 years ago

@ebeshero is there a place to upload our documents to Courseweb yet? I don't see anything under the Upload Assignments Here tab for the Regex exercise.

ebeshero commented 7 years ago

@kes213 Oops--we need to post that! Hang on a minute... (thanks for reminding us!)

jonhoranic commented 7 years ago

@kes213 Heyo! We got a turn-in window up on Courseweb now, sorry for the wait!

ebeshero commented 7 years ago

@gabikeane @ajnewton1 I can't believe I forgot about this in our discussion of text editors, but of course there's the Sublime Text editor, which is free to download and evaluate (a license is required for continued use) and cross-platform compatible (Mac, PC, Linux). We should check this one out: https://www.sublimetext.com/

kes213 commented 7 years ago

@ebeshero @jonhoranic thank you!