Open funderburkjim opened 1 year ago
metalines ending in <e>#X and accompanying <LEND>
Note: Found 4 similar lines not marked as '; delete'
This is corrected in the current update, to be pushed yet.
tab character before broken bar
One exception
This is corrected in the current update, to be pushed yet.
removal of <h>a markup in metaline
Note: This may have impact on a list display feature.
Not just the 'a'; I had removed all the letters ('a' to 'r') in such places!! I will give my reason soon about this (in brief, this needs some 'major revision').
Curly bracket in Sanskrit lex ?
These are the word endings, with or without accent marks, that decide the lexical info of the HW. And these might better be 'shown' in a different style than the HW style, so as to differentiate them appropriately.
Some examples are-- Here the accent mark is to be removed in the HW!
⓾ U+24FE Double Circled Number 10. Why doubled?
Suggest using U+2469 Circled Number 10 instead.
This was missed earlier and is corrected in the current update, to be pushed yet.
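A quick way to confirm which code point is the doubled one is to ask the Unicode database for the official character names. This is only a minimal sketch using Python's standard unicodedata module; the helper name describe is made up for illustration.

```python
import unicodedata

def describe(codepoint):
    """Return 'U+XXXX NAME' for a code point (illustrative helper)."""
    return f"U+{codepoint:04X} {unicodedata.name(chr(codepoint))}"

# The character currently in the file and the suggested replacement.
for cp in (0x24FE, 0x2469):
    print(describe(cp))
```

The first line printed names the doubled character and the second the plain circled ten, which is the one suggested above.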
Note: the <cl> tag is new to one.dtd (mw.dtd)
There are some more tags introduced in my working--
<ar> </ar>
<cl> </cl>
<col> </col>
<div n="cf"/>
<div n="cl"/>
<div n="p"/>
<ety> </ety>
<gk> </gk>
<lang> </lang>
<ln> </ln>
<nt> </nt>
<pcol> </pcol>
<pe> </pe>
<pg> </pg>
<sch> </sch>
Some of these are revisions/extensions of the present tags, and some are newly introduced.
//add and //rev
833 matches in 832 lines for "//" in buffer: mw_AB.txt
Purpose of this markup? Examples:
¦ a ray, sunbeam; //rev(1308,1)
¦ night, <ab>ib.</ab> //add(1308,1)
Here some revision is made wrt the print text; so the user somehow is to be informed about this change.
Just a comparison with the current CDSL display, which is not so 'appealing' as the above style.
------------------------------------
Here some addition is made wrt the print text; so the user somehow is to be informed about this change.
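For anyone adapting the displays, the //add and //rev markers could be located with a small script along these lines. It is a sketch only: it assumes every marker has the form //add(N,M) or //rev(N,M) as in the two examples quoted in this thread, and it does not interpret the two numbers. The names EDIT_MARKER and find_edit_markers are hypothetical.

```python
import re

# Assumed marker shape, taken from the examples in this thread:
# //rev(1308,1) and //add(1308,1). The two integers are kept uninterpreted.
EDIT_MARKER = re.compile(r"//(add|rev)\((\d+),(\d+)\)")

def find_edit_markers(lines):
    """Return (line number, kind, first number, second number) per marker."""
    found = []
    for lineno, line in enumerate(lines, start=1):
        for m in EDIT_MARKER.finditer(line):
            found.append((lineno, m.group(1), int(m.group(2)), int(m.group(3))))
    return found

sample = [
    "\u00a6 a ray, sunbeam; //rev(1308,1)",
    "\u00a6 night, <ab>ib.</ab> //add(1308,1)",
]
for hit in find_edit_markers(sample):
    print(hit)
```

A display could then use the kind ('add' or 'rev') to flag the entry as changed with respect to the print text.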
Pushed the latest update of mw_AB.txt
An additional xml element noticed: cse.
Request a brief usage description for the new xml elements. This description will help in modifying the displays.
A new dash character noticed after commit 371f2da1.
693 matches in 367 lines for "‒" in buffer: mw_AB.txt
This is Figure Dash (General Punctuation) U+2012
This was noticed as an unexpected character within <s>xxx</s>.
81 matches in 49 lines for "<s>[^<]*‒" in buffer: mw_AB.txt
Request that @Andhrabharati describe the purpose of using this particular character. How does its intended use compare with that of the Em Dash (U+2014)?
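The counts quoted above look like editor search results. A rough equivalent in Python is sketched below, assuming the file is read as a list of UTF-8 text lines; count_matches and FIGURE_DASH_IN_S are illustrative names, and the totals may differ slightly from an editor's count depending on how each tool treats overlapping matches.

```python
import re

# Same pattern as the search above: a Figure Dash (U+2012) appearing
# after <s> with no intervening '<', i.e. inside the <s>...</s> text.
FIGURE_DASH_IN_S = re.compile(r"<s>[^<]*\u2012")

def count_matches(lines, pattern):
    """Return (total matches, number of lines with at least one match)."""
    total = 0
    hit_lines = 0
    for line in lines:
        n = len(pattern.findall(line))
        total += n
        if n:
            hit_lines += 1
    return total, hit_lines

sample = [
    "<s>a\u2012b</s> and <s>c\u2012d</s>",  # two <s> elements with the dash
    "plain \u2012 dash outside any <s>",     # not counted
    "<s>x</s>\u2012",                        # dash after the closing tag
]
print(count_matches(sample, FIGURE_DASH_IN_S))
```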
I will describe the different tags and characters I have opted to use, during today's session.
For now, just note that I changed <ety> ... </ety> to <cog> ... </cog>
A new dash character noticed after commit 371f2da.
693 matches in 367 lines for "‒" in buffer: mw_AB.txt
This is Figure Dash (General Punctuation) U+2012
This was noticed as an unexpected character within <s>xxx</s>.
In fact, the double ‒‒ is the Em Dash employed in MW print, as a 'correlative' element. The Em Dash used in mw.txt is more like an En Dash in the MW print. I will prepare the issue on this today.
A single ‒ is more like an En Dash in MW print (to denote a range). I opted to use the Figure Dash in its place (to distinguish the two characters), as I am thinking of replacing the Em Dashes (in the file) with En Dashes once and for all, after describing the various discrepancies in an issue.
And of course, at the end, this Em Dash > En Dash shall be further changed as a > hyphen!!
Changed in my current revision (in addition to a great many other changes)--
"‒‒" (Figure Dash twice, U+2012) to Em Dash [between the correlative terms etc.];
"‒" (Figure Dash, U+2012) to En Dash [denoting a range of numbers etc.]; and
"—" (Em Dash) to Box Drawings Heavy Horizontal (U+2501) [so that the 'sense' is still retained (just in case it is needed sometime), but it is shorter and less distracting than the much wider Em Dash that is used at present].
Also introduced a minus sign ("−", U+2212) at line 775433 [‘20 − 7 <ab>i. e.</ab> 13’]
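The three dash substitutions listed above depend on the order in which they are applied, since the Em Dash is both a source and a target. The following is only a sketch of one safe ordering, not the procedure actually used for the revision; convert_dashes and the constant names are hypothetical.

```python
FIGURE_DASH = "\u2012"
EN_DASH = "\u2013"
EM_DASH = "\u2014"
HEAVY_HORIZONTAL = "\u2501"  # Box Drawings Heavy Horizontal

def convert_dashes(text):
    """Apply the three substitutions in an order that avoids collisions."""
    # 1. Move existing Em Dashes out of the way first, so that the Em Dashes
    #    created in step 2 are not converted again.
    text = text.replace(EM_DASH, HEAVY_HORIZONTAL)
    # 2. Doubled Figure Dash becomes Em Dash (must precede step 3).
    text = text.replace(FIGURE_DASH * 2, EM_DASH)
    # 3. Any remaining single Figure Dash becomes En Dash.
    text = text.replace(FIGURE_DASH, EN_DASH)
    return text

before = "a\u2012\u2012b 1\u20122 x\u2014y"
print(convert_dashes(before))
```

Running the steps in the listed order (doubled Figure Dash first, Em Dash last) would turn the freshly created Em Dashes into U+2501 as well, which is why the sketch starts with the Em Dash.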
These comments were made on mw_ab as of commit 52c972817f25b591784c66b81e428b606336bcd1
orig
all 4 mw files have the same number of lines.
features of mw_AB.txt
'; delete' lines
Marks lines to be removed in mw_AB. These '; delete' lines are placeholders that facilitate comparison. The lines so marked include:
metalines ending in <e>#X and accompanying <LEND>
Note: Found 4 similar lines not marked as '; delete'
tab character before broken bar
All broken bar characters are preceded by a 'tab' character. One exception.
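A check of this kind can be sketched in a few lines of Python, assuming the file is available as a list of text lines. The function name bars_without_tab is made up for illustration; it reports every broken bar (U+00A6) that is not immediately preceded by a tab.

```python
BROKEN_BAR = "\u00a6"

def bars_without_tab(lines):
    """Return (line number, column) for each broken bar lacking a tab before it."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        for pos, ch in enumerate(line):
            if ch == BROKEN_BAR and (pos == 0 or line[pos - 1] != "\t"):
                problems.append((lineno, pos))
    return problems

sample = [
    "headword\t\u00a6 meaning",   # conforms
    "headword \u00a6 meaning",    # space instead of tab
    "\u00a6 meaning",             # broken bar at start of line
]
print(bars_without_tab(sample))
```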
<srs/> markup change
Left-Pointing Angle Bracket (U+2329) and Right-Pointing Angle Bracket (U+232A) enclose a vowel (or accented vowel). Example:
homonym markup
⒈ U+2488 Digit One Full Stop replaces <hom>1.</hom>
⒉ U+2489 Digit Two Full Stop replaces <hom>2.</hom>
⒊ U+248A Digit Three Full Stop replaces <hom>3.</hom>
⒋ U+248B Digit Four Full Stop
⒌ U+248C Digit Five Full Stop
⒍ U+248D Digit Six Full Stop
⒎ U+248E Digit Seven Full Stop
⒏ U+248F Digit Eight Full Stop
removal of <h>a markup in metaline
Note: This may have impact on a list display feature. Example:
Curly bracket in Sanskrit lex ?
Example
//add and //rev
833 matches in 832 lines for "//" in buffer: mw_AB.txt
Purpose of this markup? Examples:
cl. markup
The verb classes markup changed. Example:
Special unicode characters
① U+2460 Circled Digit One ... and others in the Unicode Block “Enclosed Alphanumerics”
⓾ U+24FE Double Circled Number 10. Why doubled? Suggest using U+2469 Circled Number 10 instead.
Note: the <cl> tag is new to one.dtd (mw.dtd)