Open funderburkjim opened 7 years ago
But forgoing the line-breaks of md.txt in favor of line-breaks at divs doesn't work too well for verbs when using the simple div-marker used above for devI. This is because verbs typically don't indicate the divisions (such as for prefixed forms) in the same way. For instance, compare verb SaMs in the two display forms:
While the md.txt digitization handles some line breaks directly, it uses a funky system for others, particularly those which occur within bold text. The pattern seems to be
xxx {@X-@} First line
{@-Y@} xxx Next line
This gives a lousy line-break under the 'div at bold text beginning with -' rule, which worked well with devI.
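A minimal sketch of how the rule and its problem case might be told apart, assuming bold text is always written as {@...@} and that a line ending in {@X-@} followed by a line starting with {@-Y@} is one bold word split across the line break. The function name add_divs and the div marker string are hypothetical, chosen for illustration; they are not taken from make_xml.py or disp.php.

```python
import re

# Assumption: bold text in md.txt is written {@...@}.
# A bold span starting with '-' is the candidate sub-headword marker.
BOLD_DASH = re.compile(r'\{@-[^@]*@\}')
# A line that ends with a bold span whose text ends in '-' (the {@X-@} half).
SPLIT_BOLD_AT_EOL = re.compile(r'\{@[^@]*-@\}\s*$')

# Hypothetical div marker; the real markup may differ.
DIV = '<div n="lb"/>'

def add_divs(lines):
    """Insert DIV before bold text beginning with '-', except where that
    bold text only continues a bold word split across the line break."""
    out = []
    prev = ""
    for line in lines:
        split_continues = bool(SPLIT_BOLD_AT_EOL.search(prev))

        def mark(m):
            # Leave a leading {@-Y@} alone when the previous line ended in {@X-@}.
            if split_continues and line[:m.start()].strip() == "":
                return m.group(0)
            return DIV + m.group(0)

        out.append(BOLD_DASH.sub(mark, line))
        prev = line
    return out

sample = ["xxx {@har-@}", "{@-aRa@} xxx {@-tas@} yyy"]
for text in add_divs(sample):
    print(text)
```

In the sample, the leading {@-aRa@} is left alone because it continues {@har-@}, while {@-tas@} still gets a div. This only handles the split-bold pattern; it does nothing for the verb cases or for bolded inflected forms such as those under dhár-ma.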
For instance, note bad line break at '-ana' -- surely it should be at 'harana'
To me, the upside in cases like 'devI' is outweighed by the downside in the other cases.
In other words, a more delicate analysis of where to put divisions is needed. Until then, we should just retain the line breaks as given in md.txt.
Comments?
To me, the upside in cases like 'devI' is outweighed by the downside in the other cases.
Agree.
For instance, note bad line break at '-ana' -- surely it should be at 'harana'
Sure, but in this manner, it's quite similar to देवी¦devī, f. of deva. Indeed, not many words will be of the 'f. of' type. Maybe more will be of the har-a and bál-a type.
हर¦har-a, a. -° (ā, sts. ī) bearing, wearing; taking, conveying; bringing (news) to (prati); taking away, depriving or robbing of; surpassing; removing, dispelling, destroying; receiving, obtaining; (taking =) captivating; m. Destroy, ep. of Śiva; N.; (hár) -aṇa, a. (ā, ī) conveying, containing; taking away, removing; n. bringing, fetching; offering; carrying off, stealing, theft, abduction (of a girl); withholding; confiscation (of property); obtainment; removal or destruction of (-°).
and
बल¦bál-a, n. (sg. & pl.) might, power, strength, vigour; forcible means, force; validity; power of, expertness in (lc.); forces, troops, army (sg. & pl.): in., ab., °-, -tas, by force of = by virtue or by dint of (g. or -°);
with -tas
An even tougher case for div rules:
धर्म [Cologne record ID=9945] [Printed book page 130-2] धर्म¦dhár-ma, m. established order, usage, institution, custom, prescription; rule; duty; virtue, moral merit, good works; right; justice; law (concerning, g. or -°); often personified, esp. as Yama, judge of the dead, and as a Prajāpati; nature, character, essential quality, characteristic attribute, property: in. dhārmeṇa, in accordance with law, custom, or duty, as is or was right; -°, after the manner of, in accordance with; dharme sthita, observing the law, true to one's duty.
Where there are bolded dhārmeṇa and dharme sthita.
dhārmeṇa and dharme sthita - Good addition to difficult cases.
Good addition to difficult cases.
There are millions of these. I would not dive into all the exclusions. Don't try to do everything perfect, Jim. Sometimes, at least. :rabbit2:
I'm not attempting to deal with correct addition of divs in MD or elsewhere now. Restricting attention to a) IAST conversion, and b) meta-line conversion. The question of divs comes up in the meta-line conversion, when the make_xml.py program and disp.php display have to be modified. My general approach for now is to make only modest improvements as such catch my eye. The more systematic addition to markup (including divs) I'm not dwelling on now.
modest improvements as such catch my eye
In this case understood.
make_xml.py program and disp.php display have to be modified
If so, indeed, why not add a simple layer, without over-scrutinizing.
It is tempting to try to improve the MD (Macdonell) display by adding line breaks at certain places.
The extant md.txt markup suggests one possibility at bold text beginning with a '-'. This often nicely emphasizes sub-headwords when these are present. For instance contrast
to the current display: