Ethandaniel47 / EdgarAllanPoeProject

Analysis of language pertaining to capture/enclosure and release in regards to the poetry and prose of Edgar Allan Poe

Schema comments #8

Open djbpitt opened 5 years ago

djbpitt commented 5 years ago

@Guifindor Your schema looks good! Where you have different elements for the different parts of speech, though, it looks as if they all have the same content model, that is, they all have @pvalue attributes and plain text content. Your creation of different elements for each part of speech is perfectly okay, but one alternative might be something like:

lb = element lb { mixed { (word | foreign | ref)* } }
word = element word { pos, pvalue, text }
pos = attribute pos { "noun" | "verb" | "adj" | "adv" | "prep" | "conj" | "inter" | "pron" | "det" }
pvalue = attribute pvalue { text }

I think the meaning of this version is the same as the meaning of yours, so what can be represented isn’t at issue. The difference is that if you decide you want to add another attribute to all your words, the revision would let you do it in just one place, while in your version you would have to modify all of the individual element declarations.
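As a rough sketch of that point: suppose you later wanted every word to carry a lemma. The @lemma attribute below is purely hypothetical (it is not in your schema), and `foreign` and `ref` are assumed to be declared elsewhere, as they are in your current version. With the single-element approach, the change would touch only the declaration of `word`, plus one new attribute definition:

```rnc
# Sketch only: "lemma" is a hypothetical attribute used for illustration.
lb = element lb { mixed { (word | foreign | ref)* } }

# The only existing declaration that changes: add lemma? to the content model.
word = element word { pos, pvalue, lemma?, text }

pos = attribute pos { "noun" | "verb" | "adj" | "adv" | "prep" | "conj" | "inter" | "pron" | "det" }
pvalue = attribute pvalue { text }

# New definition, written once and shared by every part of speech.
lemma = attribute lemma { text }
```

With a separate element per part of speech, the same `lemma?` reference would have to be added to each of the nine element declarations.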

I’d also suggest calling your lines <l> or <line>. The reason is that <lb> traditionally means “line beginning”, and it is used for empty milestone elements. Since you have a container element (that is, separate start and end tags with content between them), those familiar with the traditional meaning of the element name <lb> may be confused.
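To illustrate the distinction, here is a minimal sketch that assumes the same content model you have now (with `word`, `foreign`, and `ref` declared elsewhere). The first declaration is a container line, and the second is what an empty milestone would look like if you ever needed one; whether you want a milestone at all is up to you:

```rnc
# Container: separate start and end tags, with the words of the line between them.
l = element l { mixed { (word | foreign | ref)* } }

# Milestone: an empty element that only marks the point where a line begins.
lb = element lb { empty }
```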

MJB288 commented 5 years ago

Thanks for the heads up on the <lb> issue. I have adjusted our schema to use the element <l> instead. As for your part of speech suggestion, I'll take it into consideration. I think we may be settled on individual part of speech tags for our schema, but I'll bring up your idea at our meeting tomorrow.