ebeshero / DHClass-Hub

a repository to help introduce and orient students to the GitHub collaboration environment, and to support DH classes.
GNU Affero General Public License v3.0

Semester Project Proposals #337

Closed ebeshero closed 6 years ago

ebeshero commented 7 years ago

Post a tentative, exploratory proposal here for a team semester project: What ideas do you have for an interesting XML-based semester project to work on in a small team?

Pitch your idea to your classmates and be sure to give us enough background so we understand your topic.

Each of you should post a project proposal here, and mark it at the top with a bold header (two # characters in markdown), like so:

## PITTSBURGH

or

## GREENSBURG

so we can tell proposals from our two campuses apart.

Please respond to each other to express interest and raise ideas, or perhaps make connections with your own project proposal. If conversations become particularly intense, we can start new issues on particular proposal ideas (to help organize the discussions). This online discussion serves as the basis for forming project teams: We'll form teams next Monday once we've all had a chance to read and respond and modify/develop the ideas together!

"Repurposing" or continued development of existing projects is welcome, as well as proposing new projects. Repurposing work should move an existing project in a new direction. You can review past Pitt student projects from Greensburg campus on newtFire: http://newtfire.org/dh/studentProjects.html, and from Pittsburgh campus on Obdurodon: http://dh.obdurodon.org/course-projects.xhtml

Initial proposal posts are due by F 9-15. Follow-up conversation continues over the weekend. @Jamielynn92 @tal80 @quantum-satire @kes213 @flowerbee1234 @ajnewton1 @jonhoranic @gabikeane @zme1 @amk231 @ghbondar @pab124 @jub45 @mof11 @ttb11 @BMR59 @Blangzo

flowerbee1234 commented 7 years ago

GREENSBURG

This might be a stretch, and if I'm not allowed to do this, I totally understand. But I was thinking that maybe I could post excerpts of my fictional pieces and just code those. This way, I would analyze my own writing style and figure out what needs more work. I think this could be a good way to gain recognition as a writer and as a web designer. I like writing short stories and poetry, so I could take snippets of those and code them. I could also look and see which writers my style compares to--and code their work, as well.

helvitiis commented 7 years ago

GREENSBURG

I'd like to start out by stating that I am a huge science nerd and that I wanted to bring a slightly different proposal topic to the table than what may have been done in the past. One of my favourite people of all time is Nikola Tesla--whom you may have heard of before, but is generally forgotten in history books, science books, and in classrooms throughout the US. If you aren't familiar with who Tesla is, I wrote about him here on a project I've been working on, it's a short little blurb and is a quick read. I also recommend The Oatmeal's comic/infographic about him if you prefer visual and humorous presentations of material. I'm including the links so that this post isn't even longer than it already is. Please check them out if you don't know much about him!!

My initial idea for this project was to scour the internet for texts, journalistic publications, and newspaper articles about Nikola Tesla versus Thomas Edison in the AC/DC "War of Currents" (AC stands for Alternating Current and DC for Direct Current). We would then read through each piece of text and mark up the locations cited to create a map of the hotspots of electricity's earliest years, and to see who was mentioned the most in the public eye. I'd also like to take a look at the major differences between the two currents, marking up definitions in the original patents or published articles explaining what each does and how it works. However, I can understand this part may not be the pinnacle of excitement for everyone, so I'm also proposing the possibility of logging which person was mentioned more times in newspapers and journals, what the overall mood of that piece was (was it shedding a positive or negative light on the person in question; what adjectives were being used to describe that person; was the piece propaganda, etc.) and other creative ways to show, in the end, who was more popular for which reasons. We can read a great number of books about these two, but being able to show their differences is what will really make this project unique!

For you history majors that might consider working on this project, this is an awesome time period to study. I bet you'll enjoy reading through the texts even if you aren't a fan of anything science-y. And hey, to throw some history humor into it, Drunk History even made an episode about these two. For lit/English majors, science is probably not your most favourite thing in the world, but consider this an ambitious branch out from your comfort zone--it can prove to future employers that you can handle coding and understand markup language pertinent to subjects that many English majors don't usually work with--this will definitely make you stand out! Regardless of what your field of interest is, I truly believe that bringing this project to life would be a lot of fun, and everyone can learn something about one of the greatest geeks that ever lived.

helvitiis commented 7 years ago

@flowerbee1234 I actually think this is a really neat idea. I've heard about universities creating algorithms to characterize the writing styles of several popular authors, and seeing how your writing compares to the writing of other writers would be really neat! I think you'd have to be alright with posting your work in a public space where anyone could read it; you'd also have to think, in terms of group work, if anyone else would want to mark up your fiction pieces, or if others would contribute their own work using the same markup process. What if you chose a handful of specific (preferably famous?) authors and compared their writing with each other and then with yours, attempting to get a percentage match to see whose writing your pieces are most similar to? You could also do the same with each of the authors, comparing them side-by-side with each other. That way you could get a broad spectrum of different types of writing, it could appeal to a wider audience, and you'd have the recognition of being a web designer and a writer! Definitely expand on this idea, I think it could be really cool!

Jamielynn92 commented 7 years ago

Greensburg

Okay, so far, there's an English option and a science option. I think doing something on Tesla would be neat since I don't know much about him. But since I'm more of a history buff, and the Civil War and Abraham Lincoln are my favorite topics in history, I'd like to document the battles and research the telegraphs and the battle strategies each side used to win each battle. Maybe we could also compare the ages of the men who died on each side (if there's much of a record, since it was back in 1861-1865); that would be interesting. And since I plan on being a curator, I think learning how to archive and code historical documents, and eventually artwork or something similar, would be useful. That would be my interest, although @quantum-satire's idea for Tesla would be neat. I've taken a look at your project, and it's intriguing.

helvitiis commented 7 years ago

@Jamielynn92 I had a feeling you were going to choose this topic! It would be a huge commitment time-wise to log all of the telegraphs or all of the men who died, because you would have to add attributes to ages and maybe to mark the ethnicity and their place of origin. If you wanted to add markup to documents relating to this time period, you could keep your focus on Lincoln and his papers, perhaps referring to things he felt passionate about, and logging the tone and parts of speech he was more apt to use. If you wanted to do something less specific and choose the broader topic of the war itself, I would try to keep your focus on some particular aspect of the war such as perhaps how many men and boys left from each town (and to make a chart of it), what the ethnicities of the soldiers fighting were, or like you stated earlier to compare the strategies of the two sides and log the percentage of effectiveness for each battle. This is really, really neat--and it would appeal to a lot of people!

kes213 commented 7 years ago

GREENSBURG

My proposal for a semester project is the Harlem Renaissance Poetry Project (working title).

As an English/Creative and Professional Writing major, I’m extremely interested in literature. The Harlem Renaissance is a period in United States history that fascinates me. From 1919 to around 1940, the Great Migration started forcing blacks to move and live in Harlem (an area of New York City). Thousands of African Americans lived there, and black culture thrived without fear of judgement. The literature that came out of this period -- work from poets like Langston Hughes, Claude McKay, and Countee Cullen, for example -- was raw and honest. It gave a truthful insight as to what it was like to be an African American living in the United States at this time, pre-Civil Rights Movement.

For more information on the Harlem Renaissance poets themselves, check these sites out: Concordia University Poets

If the class puts together a team to work on this project, we would be using XML to mark up poems by Harlem Renaissance poets such as the ones listed in the website above, among others. We could decide as a team which 4-6 poets we would want to focus on. We could find the poems on a lot of different websites, but Poetry Foundation is a great source for them. Claude McKay’s poem "The Lynching" is a great example of a typical Harlem Renaissance poem that you could look at.

With this project, I would like to investigate the patterns that exist in poems written and published during this era. I’d like to look at how often figurative language is used, as well as how frequently each type of figurative language appears across poems. I’d also like to track the punctuation used by each poet across his or her work; punctuation really affects the way the poem is read -- it can speed a poem up to create urgency, or slow a poem down to create suspense. Looking at trends in punctuation of Harlem Renaissance poetry could reveal a lot about what these poets wanted to say about their culture and lived experiences during this time in American history.

Markup for these poems would include a header and a body. The header would typically be divided into the title and byline, with the date added as an attribute. The body would be the content of the poem, and each line would be its own element “line,” numbered by the attributes “n” and “stanza.” From there, we would need to make “element soup” to label punctuation marks and phrases that use figurative language. The figurative language that we would focus on could be determined by the team, but some examples would be metaphors, similes, personification, and alliteration. We could also explore assonance, consonance, and imagery if the team has an interest in doing so. Punctuation could be anything from periods to commas, exclamation points, question marks, colons and semicolons, dashes, parentheses, etc. Each mark of punctuation would be labeled with the same element (“punc” for example) and then the name of the mark could be identified in an attribute.
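To make the markup plan above concrete, here is a minimal Python sketch of how one poem might be encoded and then queried. It is only an illustration under stated assumptions: the element names "poem", "header", and "body", the "type" attribute on "punc", the placement of the date attribute on the header, and the placeholder text are all hypothetical, and the team would settle the real schema together.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical names for punctuation marks; the real list is up to the team.
PUNC_NAMES = {",": "comma", ".": "period", ";": "semicolon",
              ":": "colon", "!": "exclamation", "?": "question"}
PUNC_PATTERN = re.compile(r"([,.;:!?])")


def build_line(text, n, stanza):
    """Return a <line> element with every punctuation mark wrapped in <punc>."""
    line = ET.Element("line", {"n": str(n), "stanza": str(stanza)})
    # Splitting on a capturing group alternates: text, mark, text, mark, ..., text
    parts = PUNC_PATTERN.split(text)
    line.text = parts[0]
    for i in range(1, len(parts), 2):
        punc = ET.SubElement(line, "punc", {"type": PUNC_NAMES[parts[i]]})
        punc.text = parts[i]
        punc.tail = parts[i + 1]  # the text that follows the mark
    return line


def build_poem(title, author, date, stanzas):
    """Assemble header (title, byline, date attribute) and body (numbered lines)."""
    poem = ET.Element("poem")
    header = ET.SubElement(poem, "header", {"date": date})  # assumed placement
    ET.SubElement(header, "title").text = title
    ET.SubElement(header, "byline").text = author
    body = ET.SubElement(poem, "body")
    n = 0
    for stanza_no, lines in enumerate(stanzas, start=1):
        for text in lines:
            n += 1
            body.append(build_line(text, n, stanza_no))
    return poem


def punctuation_counts(poem):
    """Count <punc> elements by their type attribute."""
    return Counter(p.get("type") for p in poem.iter("punc"))


# Placeholder text only, not a real poem.
sample_poem = build_poem(
    "Sample Title", "Sample Poet", "1922",
    [["Placeholder line one, with a pause.", "Placeholder line two!"],
     ["Placeholder line three; placeholder still?"]],
)
counts = punctuation_counts(sample_poem)
print(ET.tostring(sample_poem, encoding="unicode"))
print(counts)
```

Once every poem is encoded this way, the same counting function could be run per poet to compare punctuation habits. Figurative language would need hand tagging rather than a pattern match like this one.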

Of course, I would be open to exploring other aspects and patterns in these Harlem Renaissance poems!

Jamielynn92 commented 7 years ago

@quantum-satire you have a point, to do all that would be a lot. lol

ebeshero commented 7 years ago

@kes213 The Harlem Renaissance is so connected to the Jazz Age that there would be some wonderful musical and urban contexts to explore with your topic! Locating good sources of texts is an important first stage--and I think you meant to include some links in your proposal but they aren't coming through here--Can you go back and add those in markdown?

Jamielynn92 commented 7 years ago

@ebeshero so for some reason my terminal is trying to merge, how do I reverse that?

ebeshero commented 7 years ago

@Jamielynn92 It's not a bad thing--but you'll want to let the merge happen. Also, let's handle this in a new issue outside the Project Proposals thread.

dotfig commented 7 years ago

GREENSBURG

I think it'll be interesting to do a project on something that we all contribute to: for example, a project dedicated to GitHub Issues and how we use them. We can see what gets the most attention (Does XML have more posts than Relax NG? Does anyone post anything about JavaScript?) and how many comments a post gets on average (Should we expect around 6 comments per post? Which posts get the most comments?). We can mark up parts of speech (adjectives, verbs, nouns, etc.) to see how people are reacting to different topics. The list can go on and on, but I think it could be very interesting to look at and to see what the final results are. I know so far there are around 300 Issues, so I think it would be asking a lot to look at all 300. Maybe we can look at 50 this semester, and then it could be an ongoing project for semesters to come. I'm open to suggestions, but I think doing a project on GitHub would be very interesting. What do you guys think?

kes213 commented 7 years ago

@ebeshero I fixed the links in my proposal! They're working on my local computer now, so hopefully they work on everyone else's end too!

pab124 commented 7 years ago

Pittsburgh

I think that an interesting project for the term could be to analyze the 17 diary entries of Bobby Sands that have been made available. (This is a very brief and moderately low value introduction to Sands) Sands was a political activist in Ireland during The Troubles and was held and treated as a common prisoner while he and many others believed that he should be given the status of a political prisoner. He gained recognition when he began a 66 day hunger strike to achieve awareness that someone in his position should be a political prisoner. During his time in prison, he wrote in a diary and 17 entries have been made available. As creepy and intrusive as it may seem, I've always thought that private letters and diary entries were some of the most fascinating articles of literature because of their private nature; form and function can be disregarded and the content may often be unhindered by the idea of a third party's observance. I think that it could be interesting to analyze the entries for the structure, content, and relevance, especially given that the entries begin on the first day of his hunger strike. One would undoubtedly think that someone's writing and thought process may alter over time, especially when deprived of food for such a period of time, so there may be reason to believe that there are aspects of the entries that a markup could produce or recognize. His diary entries may be found at http://www.bobbysandstrust.com/writings/prison-diary as well as a quick google search for "bobby sands prison diary" FYI @nlottig94

ttb11 commented 7 years ago

Pittsburgh

I'm currently taking a class on ancient epics like Homer's 'Iliad' and 'Odyssey.' In class we briefly touched upon the repetitiveness of the poems' syntax and form: in Homer's time, there was at best a rudimentary writing system, and definitely not one used for literature. The epics were therefore oral stories, and a 600-page epic poem is a lot easier to memorize if its form follows a system. Since these are some of the most famous pieces of literature ever composed, it would be interesting to research them in a way I'm not sure many have done before in the thousands of years of their existence. In the markup we would probably analyze sentence structure and the language used in the sentences. However, they are extremely long poems, so we wouldn't be able to do even one poem, but nevertheless it would be interesting to see the syntax in the first few books of one of the poems. If you'd like to see what the Iliad is like, you can find it at http://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.01.0134:book=1:card=1. Some translations are better than others, which would be a problem in finding a uniform copy for all of us.

ebeshero commented 7 years ago

@ttb11 Actually you can indeed mark up a whole gigantic text like an epic. You were commenting on the Decameron project the other day (and helped us realize we needed to fix a server setting for its navigation bar to show--thank you!)--but that project is a nice example of taking on a gigantic, complexly layered prose text to completely render it in XML. You can then work with that XML structure to study something interesting, traceable through the text. How do you do that? Well, all those repetitive structures in long poems follow predictable patterns, and you can autotag these using regular expression matching to find all the structural divisions and wrap them in tags to generate your XML hierarchy. We'll be teaching you how to do that in a couple of weeks! For now, know that it's possible to do, and that technology has permitted people in our coding classes to take on huge multi-volume voyage logs, Dante's Inferno, and other enormous texts. One of the most helpful things the Greensburg Decameron team accomplished was modeling the structure of that enormous text on a single screen--based on their finding those structural patterns: see http://decameron.newtfire.org/boxModel.html
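As a rough illustration of the autotagging idea described above, here is a minimal Python sketch that uses regular expression matching to find structural divisions in a plain-text poem and wrap them in tags. Everything specific in it is an assumption made for the example: the "BOOK n" heading format, the element names "poem", "book", and "l", and the placeholder text. It is not the method the class will teach, just one way the idea can work.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Assumed heading format: a line consisting only of "BOOK" plus a number.
BOOK_HEADING = re.compile(r"^BOOK\s+(\d+)\s*$")


def autotag(plain_text):
    """Wrap each book in <book n="..."> and each verse line in <l n="...">."""
    out = ["<poem>"]
    open_book = False
    line_no = 0
    for raw in plain_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between divisions
        heading = BOOK_HEADING.match(line)
        if heading:
            if open_book:
                out.append("</book>")  # close the previous division
            out.append(f'<book n="{heading.group(1)}">')
            open_book = True
            line_no = 0  # line numbering restarts in each book
        else:
            line_no += 1
            # escape() protects characters such as & and < inside the text
            out.append(f'<l n="{line_no}">{escape(line)}</l>')
    if open_book:
        out.append("</book>")
    out.append("</poem>")
    return "\n".join(out)


# Placeholder text only, standing in for a real plain-text source.
plain = """BOOK 1
First placeholder line
Second placeholder line & more

BOOK 2
Third placeholder line
"""

tagged = autotag(plain)
root = ET.fromstring(tagged)  # raises an error if the result is not well-formed
print(tagged)
```

Parsing the result at the end is a cheap check that the generated hierarchy is well-formed before anyone starts adding finer-grained markup by hand.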

The trick with coding these and building a semester project is coming up with a research question and markup strategy that helps you with "distant reading" or finding patterns that you can locate and study with a computer more easily than with the human eye. In our Pacific voyages project we tracked latitude and longitude coordinates in some gigantic voyage logs because those followed predictable patterns: you could find everything marked with degrees, minutes, and seconds, and then we had to figure out how to distinguish latitudes from longitude readings as these were recorded by 18th-century navigators. The difficulty with long texts in projects, though, is being clear on your research question and working out a pattern of something to study.
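Here is a small Python sketch of the kind of pattern matching described above for coordinates written in degrees, minutes, and seconds. It rests on a simplifying assumption that each reading ends in a hemisphere letter (N, S, E, or W), which makes telling latitude from longitude trivial. Real 18th-century logs may not be that tidy, and this is not the approach the Pacific voyages project actually used. The sample sentence is a placeholder.

```python
import re

# degrees° minutes' [seconds"] hemisphere, with the seconds part optional
DMS = re.compile(
    r"(?P<deg>\d{1,3})\s*°\s*"
    r"(?P<min>\d{1,2})\s*['′]\s*"
    r'(?:(?P<sec>\d{1,2})\s*["″]\s*)?'
    r"(?P<hemi>[NSEW])\b"
)


def find_coordinates(text):
    """Return (kind, decimal_degrees) for every reading found in the text."""
    readings = []
    for m in DMS.finditer(text):
        decimal = int(m["deg"]) + int(m["min"]) / 60 + int(m["sec"] or 0) / 3600
        hemi = m["hemi"]
        if hemi in "SW":
            decimal = -decimal  # south and west are negative by convention
        kind = "latitude" if hemi in "NS" else "longitude"
        readings.append((kind, round(decimal, 4)))
    return readings


# Placeholder log entry, not a quotation from any real voyage log.
sample_entry = """Placeholder log entry: at noon 17° 30' 0" S and 149° 15' W by account."""

readings = find_coordinates(sample_entry)
print(readings)
```

Where the hemisphere letter is missing, a project would need some other cue, such as the surrounding words or the plausible range of values, to decide which reading is which.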

jub45 commented 7 years ago

Pittsburgh

I am looking to create a highly accessible 'on-the-fly' project for performing multiple translations in various agglutinative languages (Turkish, Azerbaijani and Hungarian). Given the nature of these languages (and their notorious difficulty among English speakers), my objective is to create reference and glossing materials with which anyone - no matter their experience level - will be able to readily understand the grammar, prosody and conventions of these wonderful languages. All three of them are famous for their poetry and ancient Epics, and they also offer an insight into how languages borrow words from one another via various language contact scenarios over the last 1200 years. It may sound daunting but I insist that it MUST be possible to do all of this and much more! Earlier, I posted a proposal about using Turkish as a means to translate the near-endangered Urum language (also a Turkic language), and if I play my cards correctly, I can also add on to this project with Urum as well. Obscure languages are my forte, and now all I need is a 'pen' with which to teach others about them.

BMR59 commented 7 years ago

Pittsburgh

I'm actually having some trouble coming up with an idea! I am a linguistics major so I was trying to brainstorm on Linguisticy ideas. My only idea so far would be to find an open corpus of interview or free speech and track certain features in there (maybe colloquialisms or grammatical features). Other than that, I saw someone mentioned Homeric epics and I am a major fan of them, especially The Iliad!

gabikeane commented 7 years ago

@kes213 your proposal looks very cool, and I'm excited to see where this is going. Being more familiar with the poems than the rest of us (I presume), what do you think you'll find?

gabikeane commented 7 years ago

@ajnewton1 you might be interested in the past Twitter Neologisms project. Social media is a great source text, particularly for things like topic modelling and network analysis. While it may be difficult to do a project on our very dynamic GitHub social situation, you may want to consider another social medium (with more focus and less action).

Blangzo commented 7 years ago

Pittsburgh

Having taken Latin throughout high school (and picked up a few words), I was thinking of going through old texts like Caesar and comparing different translations to see how they differ depending on the ideals and goals of the translator. However, I would also love to work with @ttb11 on Homer's texts. Those are also very interesting.

ghbondar commented 6 years ago

@Blangzo Check out the edition of Caesar's Gallic Wars that was created in a previous semester: http://caesar.obdurodon.org/text.html Think about any specific texts. Graffiti from Pompeii or letters from Vindolanda come to my archaeological mind as well, if you want to add a graphical element (beyond the bar graphs we will make in class).

gabikeane commented 6 years ago

@pab124 This sounds like an amazing project, particularly when you consider the linguistic and cultural differences between High British English and Irish English. I am also very interested in letters, particularly in the letters of those who expected some level of notoriety. The difference between writing for an audience and writing for the self may be something you could explore computationally.

ghbondar commented 6 years ago

@BMR59 Any specific languages or literature upon which you might want to focus? Remember that Dr. Na-Rae Han will also need to approve your project.

BMR59 commented 6 years ago

@ghbondar I considered trying to find something with Portuguese but I would be interested in @ttb11 's idea involving Homeric epics for sure.

jub45 commented 6 years ago

Definitely interested in working on/collaborating with any foreign language project!