paddymcall / SARIT-pdf-conversions

XML to PDF for SARIT texts
https://github.com/sarit/SARIT-corpus

"Speaker" lines #40

Closed: ppasedach closed this issue 8 years ago

ppasedach commented 8 years ago

I have a file containing the e-text of the volumes published so far by the Skandapurāṇa project. Since the Purāṇas and the Mahābhārata are closely related, I am using mahabharata-devanagari.xml as my model, even though my file is in IAST.

I was wondering how to encode the "speaker" lines ("XXXX uvāca"). In the Mahābhārata file they are encoded like this:

    <lg xml:id="adi-1-1-244" type="anuṣṭubh">
        <l>इत्युक्त्वा धृतराष्ट्रोऽथ विलप्य बहु दुःखितः ।</l>
        <l>मूर्च्छितः पुनराश्वस्तः संजयं वाक्यमब्रवीत् ॥</l>
        <l xml:id="adi-1-1-245x" rend="speak">धृतराष्ट्र उवाच । </l>
    </lg>
    <lg xml:id="adi-1-1-245" type="anuṣṭubh">
        <l>संजयैवं गते प्राणांस्त्यक्तुमिच्छामि मा चिरम् ।</l>
        <l>स्तोकं ह्यपि न पश्यामि फलं जीवितधारणे ॥</l>
        <l xml:id="adi-1-1-246x" rend="speak">सौतिरुवाच । </l>
    </lg>

These lines are placed inside the <lg> element of the verse preceding them, but carry an xml:id whose number points to the following verse. This seems somewhat inconsistent to me, but maybe there is a good reason for doing it like that? Should I do it the same way, or follow a different pattern?

ppasedach commented 8 years ago

I just noticed that in brahmapurana.xml the speaker lines are encoded outside the preceding and following <lg> elements:

    <lg type="stanza">
        <l>BRP001.020.1/ yataś caiva jagat sūta yataś caiva carācaram |</l>
        <l>BRP001.020.2/ līnam āsīt tathā yatra layam eṣyati yatra ca || 20 ||</l>
    </lg>
    <p>BRP001.021.0/ lomaharṣaṇa uvāca:</p>
    <lg type="stanza">
        <l>BRP001.021.1/ avikārāya śuddhāya nityāya paramātmane |</l>
        <l>BRP001.021.2/ sadaikarūparūpāya viṣṇave sarvajiṣṇave || 21 ||</l>
    </lg>

which corresponds very closely to the structure of the files I have now:

SP0020291: etajjñātvā yathāvaddhi kumārānucaro bhavet|
SP0020292: balavānmatisampannaḥ putraṃ cāpnoti saṃmatam|| 29||
SP0029999: iti skandapurāṇe dvitīyo 'dhyāyaḥ||
SP0030010: sanatkumāra uvāca|
SP0030011: śṛṇuṣvemāṃ kathāṃ divyāṃ sarvapāpapraṇāśanīm|
SP0030012: kathyamānāṃ mayā citrāṃ bahvarthāṃ śrutisaṃmitām|
SP0030013: yāṃ śrutvā pāpakarmāpi gacchecca paramāṃ gatim|| 1||
SP0030021: na nāstikāśraddadhāne śaṭhe cāpi kathaṃcana|
SP0030022: imāṃ kathāmanubrūyāttathā cāsūyake nare|| 2||

However, I was thinking of not keeping the metadata that precedes each line as part of the text, but of separating it out and using it in xml:ids.
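
A minimal sketch of that separation, in Python. It assumes each prefix has the form "SP" + three-digit chapter + three-digit verse + one-digit line number, followed by ": "; this layout is only inferred from the sample lines above, and the names `LINE_RE` and `split_line` are hypothetical.

```python
import re

# Assumed prefix layout (inferred from the sample lines, not confirmed):
# "SP" + 3-digit chapter + 3-digit verse + 1-digit line, then ": " and the text.
LINE_RE = re.compile(r"^SP(\d{3})(\d{3})(\d): (.*)$")


def split_line(raw):
    """Split one e-text line into (chapter, verse, line, text)."""
    m = LINE_RE.match(raw)
    if m is None:
        raise ValueError(f"unrecognised line: {raw!r}")
    chapter, verse, line, text = m.groups()
    return int(chapter), int(verse), int(line), text


if __name__ == "__main__":
    print(split_line("SP0030010: sanatkumāra uvāca|"))
```

In the sample, line number 0 marks a speaker line and SP0029999 marks the chapter colophon, so both could be detected from the parsed numbers rather than from the text.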

paddymcall commented 8 years ago

It should go into n attributes, not xml:ids. Please discuss on Redmine.
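
A rough sketch of what that could look like, in Python with the standard library's `xml.etree.ElementTree`. Several things here are assumptions rather than settled conventions: the prefix layout ("SP" + three-digit chapter + three-digit verse + one-digit line), the dotted form of the `n` values, the use of `<p>` for speaker lines (following the brahmapurana.xml pattern quoted above), and the hypothetical function name `to_tei`. Colophon lines such as SP0029999 are not handled specially.

```python
import re
import xml.etree.ElementTree as ET

# Assumed prefix layout, inferred from the sample lines in this thread.
LINE_RE = re.compile(r"^SP(\d{3})(\d{3})(\d): (.*)$")


def to_tei(lines):
    """Build a <div> of <p>/<lg>/<l> elements, moving line prefixes into n."""
    div = ET.Element("div")
    lg = None
    current = None  # (chapter, verse) of the <lg> currently being filled
    for raw in lines:
        m = LINE_RE.match(raw)
        if m is None:
            continue  # skip anything that does not match the assumed layout
        chapter, verse, line, text = int(m[1]), int(m[2]), int(m[3]), m[4]
        if line == 0:
            # Line number 0 is taken to be a speaker line; it sits outside <lg>.
            p = ET.SubElement(div, "p", n=f"{chapter}.{verse}.0")
            p.text = text
            continue
        if current != (chapter, verse):
            lg = ET.SubElement(div, "lg", n=f"{chapter}.{verse}")
            current = (chapter, verse)
        l = ET.SubElement(lg, "l", n=f"{chapter}.{verse}.{line}")
        l.text = text
    return div


if __name__ == "__main__":
    sample = [
        "SP0030010: sanatkumāra uvāca|",
        "SP0030011: śṛṇuṣvemāṃ kathāṃ divyāṃ sarvapāpapraṇāśanīm|",
        "SP0030012: kathyamānāṃ mayā citrāṃ bahvarthāṃ śrutisaṃmitām|",
        "SP0030013: yāṃ śrutvā pāpakarmāpi gacchecca paramāṃ gatim|| 1||",
        "SP0030021: na nāstikāśraddadhāne śaṭhe cāpi kathaṃcana|",
        "SP0030022: imāṃ kathāmanubrūyāttathā cāsūyake nare|| 2||",
    ]
    print(ET.tostring(to_tei(sample), encoding="unicode"))
```

Unlike xml:id, an `n` value does not have to be a valid XML name or unique within the document, so the numbering can be carried over from the source file without being reshaped.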