DigitalMitford / DM_processing

a repo for working on processing for the Digital Mitford project, including schemas, XSLT, XQuery, and other production and analysis efforts
http://digitalmitford.org
GNU Affero General Public License v3.0

Task List (18 October 2015) #7

Closed ezimmer closed 7 years ago

ezimmer commented 8 years ago

Below is a general description of tasks we've discussed. @mollyodonnell @ezimmer @ebeshero We have made this, as a start: http://dxcvm05.psc.edu:8080/exist/rest/db/output/MissJames_TableOutput.html

One major goal for Lyon is to get a live query running from eXist. (The desired complexity, output, and rendering of that query would determine the languages involved and the approaches taken.)

ebeshero commented 8 years ago

XQuery Outputs to Try:

Imagine a tripartite output in a browser window. At the center is $Miss_James. Radiating in three directions are the top three most important categories of named entity connected with her in the body paragraphs of our Mitford corpus.
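As a rough illustration of the counting behind such an output, here is a minimal Python sketch (not the project's XQuery). It assumes TEI-namespaced letters in which named entities carry a `ref` attribute, as in the `ref="#James_Miss"` tagging mentioned in this thread. The function name, the sample letter, and every `ref` value other than `#James_Miss` are hypothetical. It treats the element name (for example `persName` or `placeName`) as the "category" and counts only entities that share a body paragraph with Miss James.

```python
from collections import Counter
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical sample letter; only ref="#James_Miss" comes from the thread.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<p><persName ref="#James_Miss">Miss James</persName> wrote from
<placeName ref="#Place_A">a town</placeName> about
<persName ref="#Person_B">a friend</persName> and
<persName ref="#Person_C">another friend</persName>.</p>
<p><placeName ref="#Place_D">Elsewhere</placeName> is mentioned without her.</p>
</body></text></TEI>"""


def co_occurring_entities(tei_xml, target_ref="#James_Miss"):
    """Count entity element names that share a <p> with target_ref."""
    root = ET.fromstring(tei_xml)
    counts = Counter()
    for para in root.iter(TEI_NS + "p"):
        # Every descendant element of this paragraph that carries @ref.
        tagged = [el for el in para.iter() if el.get("ref")]
        if not any(el.get("ref") == target_ref for el in tagged):
            continue  # paragraph does not mention the target person
        for el in tagged:
            if el.get("ref") != target_ref:
                counts[el.tag.replace(TEI_NS, "")] += 1
    return counts


if __name__ == "__main__":
    # The three most frequent categories would feed the three "rays".
    print(co_occurring_entities(SAMPLE).most_common(3))
```

The same grouping could be expressed in XQuery against eXist; Python is used here only to keep the sketch self-contained.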

mollyodonnell commented 8 years ago

@ebeshero and @ezimmer, just spent a lot of time going through 1819 in Pitt Box collated archive (RCL). Letters that were coded and correctly tagged for Miss James (ref= "#James_Miss"), I left as is. Letters that needed coding for that tag, I updated.

(Just fyi, I have a note in with Lisa because it seems these might be active with her students, so wondering if previously coded files were moved for that reason (and either need to be moved again or?). Will let you know what I hear.)

I'm checking in to ask what next? There are a few good letters I came across (one where MRM calls James's letter silly and vulgar & another where James is calling a man/char names), though not all 1819.

Should I clean the letters I updated for James_Miss further? Work on getting some kind of code up for uncoded 1819 letters that might have ref to her? Look to other years? Update journal entries or other non-letter material ref her?

Input appreciated.

ezimmer commented 8 years ago

@mollyodonnell , that's great! Let me look on Box in places you've referenced--will write back again soon. The further letters sound fantastic, too. (Many thanks for all the work!)

mollyodonnell commented 8 years ago

@ezimmer, the other option I didn't list is to go back and clean code on a deeper level for all files containing @ref="#James_Miss", both those that I updated yesterday and those that I didn't. I'll probably just do that when I have time. Was just trying to see if there was something more pressing, depending on what you're querying and from where. (I assume you're pulling from Pitt Box where I'm updating, but given Lisa's students' activity this week...let me know if that's not the case. I think when we learned a little XQuery in the workshop, though, the returns were from the Box archive.)

ezimmer commented 8 years ago

Hi @mollyodonnell -- thanks again! The queries we've been running have been on the eXist database, so I think @ebeshero would probably have to transfer over the files that have been cleaned, as part of the overall work. It's exciting that Lisa's students are working on them, too--maybe we could schedule a Skype with her sometime to talk about the discovery, too, as she'd mentioned that possibility? (Know everyone is crazy busy right now, but maybe early October? Just a thought.)

ebeshero commented 8 years ago

@mollyodonnell @ezimmer Sorry I've not chimed in for a couple of days! Yes, we'd be downloading copies of what's in Box over to the eXist database. I wrote to Lisa to let her know what we're up to, and she was happy to have people help with proofing and polishing, and gave me a list of letters in the spreadsheet that are ready to go for that. Here is the list: (Letters 1730, 2014, 1731, 1732, 1733, 1734, 1735, 1736, 1738, 1765, 1767, 0528, 0529, 0532, 0531, 0533, 0530, 0534, 0535, 0536, 1778, 0538, 0540, 0541, 0543, 0546).

Lisa says she's assigned THESE letters to students, but they are not coded yet: 2018, 2019, 1739, 1740. (A strange thought is that if you've looked at some of these and they seem useful to us, you might just save a working conference-prep copy only for our eXist database work, and keep it out of Box but post it here to GitHub. It's probably not the best practice, but later on we could compare the files so our work doesn't get in the way of the students! But perhaps we'd best prioritize the bigger list above.)

The letters are by xml:id in our Letters database spreadsheet (now in our MySQL database)--the fastest way to find them, though, is in the Box directory in MRMS Project Support here: https://pitt.box.com/s/s7lp3728uzxawg1knnf3

Molly, I like the idea of simply completing the coding on the unfinished letters, if they've not been assigned to others. Actually, we should write to the following: Lisa Wilson, Amy Gates, and Amy Colombo. The two Amys are editors on our team, one of them quite new, and they're happy to help with proofreading and code-completion tasks. I'd assigned Amy Gates a couple of letters in 1819. I'm swamped this week, so I'm badly lagging with contacting and coordinating kinds of tasks...Feel free to write them and explain what we're working on! :+1: I know they all want to help anyway.

ebeshero commented 8 years ago

@mollyodonnell Good! Lisa's pinging you from the Box now and labelling stuff so it should be clear what's open for us to work on now.

mollyodonnell commented 8 years ago

@ebeshero great. Will update the ones she's pinging me on and assume she'll let me know to move to GitHub anything she doesn't want her students to access (previously coded versions of letters they are working on, a surprising number of which I updated a little yesterday). Will let you know when each one she's funneling to me is ready for you. (You can see what I updated last night in Box.) More soon.

ezimmer commented 8 years ago

Just a quick update, @mollyodonnell (and @ebeshero): it looks like we're scheduled for Friday morning. (Here's the program: http://tei2015.huma-num.fr/en/overview/.)

mollyodonnell commented 8 years ago

@ezimmer thanks!

mollyodonnell commented 8 years ago

@ebeshero, pinged you in Box with a new upload of a letter & will do this for others I check as well. Lisa seemed to think there wouldn't be a conflict with her students, but let me know if I should be pushing them here instead.

ebeshero commented 8 years ago

@mollyodonnell @ezimmer Okay--here's a quick synopsis of the Box discussion among Molly, Lisa Wilson, and me, together with some notes for ready reference.

_Proofreading Checklist:_ Elizabeth made a proofreading checklist for use in the project: https://pitt.box.com/s/sm3kf4uzzn6zlfzo8in6etw1l50yirb2 It's designed to go with reading a transformed HTML file you can generate with our project's XSLT for letters. You can then read an HTML version of your letter that looks like the ones we've posted on our website, and that will help with proofreading against the manuscript to catch spacing errors in our tagging and such. For right now, Molly, we don't need to worry about the file transformation unless you want to give it a try. The most important parts of this are in the Workflow and Tasklist sections.

_Questions for Molly re Proofreading:_

* Are you adding your name as a proofreader to the TEI header, in the  for "Proofing and corrections by..."? You should do that, whether you've completed work with proofreading or just done a few specific clean-up things. But you also want to include some kind of flag to indicate whether proofreading is complete or not--and I think that should be simply a comment tag  . I can hunt for comment tags inside  elements, and that will tell me at what stage the proofreading is for any given letter.
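The hunt for comment flags described above could look roughly like the Python sketch below. Everything specific in it is an assumption: the element name that holds the proofreading credit is not preserved in this thread, so the caller has to supply it, and the sample uses `respStmt` purely as a stand-in. The comment wording and the editor name are likewise hypothetical. Note that Python's default parser discards comments, so the sketch builds the tree with `insert_comments=True` (Python 3.8 or later).

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical header fragment; element choice and comment text are assumptions.
SAMPLE_HEADER = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc>
<titleStmt>
<!-- a comment outside the target element -->
<respStmt>
  <resp>Proofing and corrections by</resp>
  <persName>An Editor</persName>
  <!-- proofing: in progress -->
</respStmt>
</titleStmt>
</fileDesc></teiHeader></TEI>"""


def proofing_comments(tei_xml, container):
    """Return the text of XML comments found inside <container> elements."""
    # Keep comment nodes in the tree instead of dropping them.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(tei_xml, parser=parser)
    found = []
    for el in root.iter(TEI_NS + container):
        for node in el.iter():
            # Comment nodes are marked by having ET.Comment as their tag.
            if node.tag is ET.Comment:
                found.append((node.text or "").strip())
    return found


if __name__ == "__main__":
    print(proofing_comments(SAMPLE_HEADER, "respStmt"))
```

An empty result for a letter would then suggest that no proofreading flag has been added yet, under the same assumptions.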

Thanks Molly! Erica, I've not had time to breathe this past week, but I'll be working hard on XQuery and SVG shortly and ping you soon!

mollyodonnell commented 8 years ago

@ebeshero @ezimmer. Thanks for checklist and file-naming reminder.

I've been looking at the manuscript when the text seems wrong (misspelling, typos, etc.) or to do the markup on say gaps, supplied, etc. but not to verify the whole text.

Will mark all letters I have/will work on with some version of e.g., comment in the header notes so you can see exactly where/what I've done.

mollyodonnell commented 8 years ago

@ebeshero, think something got cut off in both places (here and box)...you write: "At minimum (for the TEI conference) we need an accurate." I want to be sure the minimum nec. header edits are accounted for, so let me know.

ezimmer commented 8 years ago

@ebeshero @mollyodonnell Thank you both so much. It's been a crazy week here too, but I'm working toward some simple SVG and will update very early this week. Thank you again!

ebeshero commented 8 years ago

Timing

I've been working with Lisa Wilson on coordinating editors to clean up files and target Miss James letters among other things. Lisa noticed a good number of Miss James letters in the year 1821, for example, so we're concentrating efforts on those and others.

Here's a time schedule we've put together for the clean-up and preparation of well-formed and thoroughly tagged letters files in time for us to present our findings at the TEI Conference, but with enough time for our editors to do a reasonably good job:

In the meantime, Erica and I can work within not-quite-finished / still-rough collections of files just to work out how to generate our visuals. I'll pull all the files we have ready at various points in October: I'll try it once this weekend, again on Oct. 9, and yet again on Oct. 16.

I'll ping the whole Mitford team about this, too, but I wanted to record it here so we have a sense of project timing! @ezimmer @mollyodonnell

mollyodonnell commented 8 years ago

@ezimmer and @ebeshero, glad we're moving to focusing on 1821 because I did a really basic clean on anything that came up on a search for Miss James, and a much deeper clean and proof in 1819, but I think Lisa and I were finding what she and Elisa already suggested: that 1819 isn't the biggest Miss James year among the years in the archive.

ezimmer commented 8 years ago

This is great, both of you.

Thank you so much.

More soon!

@ebeshero

@mollyodonnell


ebeshero commented 8 years ago

@mollyodonnell @ezimmer I'm revisiting this Issue, simply because we're a week away. We have plenty to talk about, and we're about to have plenty to show. We need to work out what each of us will present and how we will go about it.

Here's a sort of rough sketch I have in my head from what we've been discussing all along:

MODIFY at will! I just wanted to put something up here to get us rolling...

mollyodonnell commented 8 years ago

@ebeshero @ezimmer Sounds and looks good! I'll get started on my part & share as I go.

ezimmer commented 8 years ago

The one significant change I'd suggest at this point would be potentially beginning with a focus on annotation, at least briefly.

If we don't do that, the argument is likely to be more about the project than the transferable insight.

What do you both think? @ebeshero @mollyodonnell


ebeshero commented 8 years ago

@ezimmer Yes--good idea! Let's have you launch the talk, then, with a discussion of annotation, and then Digital Mitford is the case study to which this is all applied! I just modified this above. Hmmm. Shall I turn this into a wiki page here in GitHub? I think I will...it'll be easier for us to find and keep modifying this week and next. @mollyodonnell

ebeshero commented 8 years ago

@ezimmer and @mollyodonnell Here's a wiki page I just created, which should be easy for us to find and update as we're working now: https://github.com/ebeshero/mitford/wiki/Plans-for-TEI-Conference-in-Lyon

ezimmer commented 8 years ago

Re: this, yes. One of the overall ways in which I think we could frame this (if I haven't said this already--sorry if it's a duplicate!) is the idea of ways to find what may be hiding in plain view. (That's often what scholarly annotation does, given the range and depth of knowledge the annotator brings to the project.)

More on this soon! @ebeshero @mollyodonnell

mollyodonnell commented 8 years ago

@ebeshero @ezimmer just pushed my slides to GitHub in a new folder I called "Conference slides." Hope this is how we're doing it. If not, I'll be happy to put them somewhere else. Be sure to download them to look, because the PDF doesn't display correctly in GitHub's preview window. My plan is to use these slides to introduce the different sides of Miss James (why she’s interesting: Miss James as a significant person in Mitford’s life and as opinionated, sassy, mysterious, kind, hardworking, and brave), her discovery story with Lisa’s student, and our team’s code cleanup initiative (allude to cf ms, tagging, SI entry updates, and TEI header cleanup and ID additions). Simultaneously I’ll show and talk about the TEI (as you can probably guess from the slides). As I mentioned in my note, this can be cut down as need be for time.

athenerica2003 commented 8 years ago

This is great, @mollyodonnell--thank you! Working with the concrete example will let me back-revise the intro, which is helpful, too.

@ebeshero, safe travels!
