karindalziel closed this issue 8 years ago
The process for this:
For each novel, look at the XML here: https://github.com/CDRH/austen/tree/master/public
Search for the word mentioned, and note whether it exists in the XML.
I may have Sara change the XML itself for everything except Pride and Prejudice, since Carmen currently has that file.
Keep notes in a new comment below.
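The search step above could be scripted roughly like this. It's only a sketch: `token_report` and `check_directory` are made-up names, the directory path would be wherever you've checked out the repo, and the sample snippet is invented, not taken from the actual files. It checks both the raw XML and the text with tags dropped, since a word can be missing from one and present in the other.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path


def token_report(xml_source, token):
    """Report whether `token` appears as a whole word in one XML document."""
    pattern = re.compile(r"(?<!\w)" + re.escape(token) + r"(?!\w)")
    # All text nodes glued together with the tags dropped.
    extracted = "".join(ET.fromstring(xml_source).itertext())
    return {
        "in_raw_xml": bool(pattern.search(xml_source)),
        "in_extracted_text": bool(pattern.search(extracted)),
    }


def check_directory(xml_dir, tokens):
    """Run token_report on every *.xml file in a local folder (hypothetical path)."""
    results = {}
    for path in sorted(Path(xml_dir).glob("*.xml")):
        source = path.read_text(encoding="utf-8")
        results[path.name] = {t: token_report(source, t) for t in tokens}
    return results


# Invented snippet for illustration only.
sample = "<p>it has<lb/>been settled</p>"
print(token_report(sample, "hasbeen"))
```

A word that shows up only in the extracted text would point at the extraction step rather than the XML, though that is an assumption about where the joined words come from.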
Hi Karin, I went ahead and edited your previous comment that had all the errors listed. Almost all of them either don't exist in the XML or are so common that I wasn't able to locate them by searching for two letters.
Since most of these seem to be errors coming from @bzillig1's script, I am assigning this to him for now.
Of note: if we can't figure this out, we may just need to add a "known issues" section somewhere that lists these problems?
I believe this is done
Small errors caught by looking at unique words by character/narrator
MP
Sir Thomas: hasbeen (can't find)
Mrs. Norris: havebeen, hertime (can't find either)
Lady Bertram: neither- (should probably be a long dash)
Edmund: attimes, ifpossible (can't find either)
Fanny: ot (should be “or”)
Mrs. Grant: takenin (can't find)
Mary: dispenseto (can't find), du (this is correct - part of a French phrase), myown (can't find), notveryoften (can't find), seemedvery (can't find), thattable (can't find)
Tom: mynamewasnorval (can't find), tobe (can't find)
Mr. Rushworth: thecount (can't find)
E
Emma: almane, carosposo, ostalis
Mr. Perry: misstaylor
Harriet: has been, onerespect, rd, thatevening
Mr. Elton: nnight, se (split from one word, "se'nnight")
Mr. Weston: asweunderstand, fo (should be “to”)
Frank: allthatparty, amorpatriae, robinadair
Miss Bates: mustall
SS
Elinor: ask-
Col. Brandon: to-
Mrs. Palmer: p
NA
Catherine: ma
Gen. Tilney: se
P
Sir Walter: table—contractions
Charles M: n
Adm. Croft: d
Mrs. Smith: au
Mrs. Musgrove: ll
PP
Mrs. Bennet: esq (should be "esp." for "especially")
Jane Bennet: b, c (can't find)
Lydia: ll
Kitty: s-his-name (split from one phrase, "what's-his-name")
Mr. Collins: se, ennight, nnight (split from one word, "se'nnight")
Mr. Gardiner: e, edw ("e," is too common, can't find; should be "Edw." for "Edward")
Mrs. Gardiner: alittle, beenhis, p (can't find "alittle" or "beenhis"; "p" is too common, can't find)
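
Some of the fragments listed above (se / nnight, s-his-name) look like what you would get if a tokenizer treated apostrophes as word breaks. That is a guess; the script's actual tokenizer isn't shown in this thread. The two functions below are hypothetical and only illustrate the difference, using an invented sample sentence.

```python
import re

# Punctuation trimmed from the edges of each token.
EDGE_PUNCT = ".,;:!?\"()"


def split_on_apostrophes(text):
    """Naive tokenizer: treats apostrophes the same as whitespace."""
    parts = re.split(r"[\s'’]+", text)
    return [p.strip(EDGE_PUNCT).lower() for p in parts if p.strip(EDGE_PUNCT)]


def keep_apostrophes(text):
    """Splits on whitespace only, so internal apostrophes and hyphens survive."""
    return [p.strip(EDGE_PUNCT).lower() for p in text.split() if p.strip(EDGE_PUNCT)]


# Invented sentence, not a quotation from the novels.
sample = "It was a se'nnight since what's-his-name called."
print(split_on_apostrophes(sample))
print(keep_apostrophes(sample))
```

The first function yields "se", "nnight" and "s-his-name" as separate tokens; the second keeps "se'nnight" and "what's-his-name" whole. If the script does something similar to the first, keeping apostrophes inside words would clear up that group of entries.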