Closed drdhaval2785 closed 2 years ago
Pl. see my mail from 2 hrs back. I thought you would prepare the file locally first and then (finally) upload it to GitHub.
Yes, these were deliberately marked thus. I used the // mark wherever I felt a new para should begin (for display) in the text.
-----------------------
Here is the mail content, for the benefit of others (if they happen to come here)--
I am sure you'd be looking 'inside' the data in my file once, to know the way it was 'made'-- say (a) the // mark for a para-break, (b) /.../ for <ls> names, (c) using xxx at the gender/lex column etc. Also the HWs with braced letter(s) are to be 'treated' to form althws; and of course, the comma separated group entries are to be separated out into individual entries.
These entries do not have their own body part, but only with the upasarga some body portion is given. And all upasargas start on a new line/para (//With zzz).
And pl. note that my notation is not page-column, but is page-seq no. (within the page)
How to accommodate this in the CDSL <pc> format?
The simplest way is to leave the 'seq no.' and forget about adding the 'column' in the data.
My <p> is the notation for 'parent' entry and <b> is for child ('baby') entry, i.e. the comp. word
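A few of the conventions described in the mail (the // para-break, /.../ around source names, and comma-separated group headwords) can be illustrated with a minimal Python sketch. The function names, the sample strings, and the choice to emit `<ls>...</ls>` pairs are assumptions made for illustration only; the actual LRV file layout and the scripts used in the repository may differ.

```python
import re

def split_paragraphs(body: str) -> list[str]:
    """Split an entry body into display paragraphs at each '//' mark."""
    return [part.strip() for part in body.split("//") if part.strip()]

def mark_sources(paragraph: str) -> str:
    """Wrap /name/ spans as <ls>name</ls>.

    Assumes the '//' marks have already been removed by split_paragraphs,
    so every remaining pair of slashes encloses a source name.
    """
    return re.sub(r"/([^/]+)/", r"<ls>\1</ls>", paragraph)

def split_group(headwords: str) -> list[str]:
    """Separate a comma-separated headword group into individual headwords."""
    return [hw.strip() for hw in headwords.split(",") if hw.strip()]

# Hypothetical sample data, not taken from the LRV file.
paragraphs = [mark_sources(p) for p in split_paragraphs("first part //With ava second /Mbh./ part")]
group = split_group("aMSa, aMSaka ,aMSin")
```

Splitting on // before handling the single-slash pairs avoids confusing the para-break mark with the source-name delimiters. The braced-letter headwords are left out here because the mail does not spell out how they expand into alternate headwords.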
Yes. I am uploading the file to GitHub and uploading the sequential changes as they are made. This is just to keep track of the changes and to see visually what changed at every step. It is not for public display. Once the data is fine in this LRV repository, we will move it to the more public csl-orig repository and update the scripts to regenerate all the details needed to display the LRV dictionary on the Cologne server. I am looking into the text data and using the various markups you used to extract the relevant information.
These entries do not have their own body part, but only with the upasarga some body portion is given.
OK
Every entry in LRV starts with $-- markup. The following 18 entries do not have that markup. They start with $// mostly. @Andhrabharati, is there any reason why they are so, or some typo / algo error?
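A check of this kind could be scripted roughly as below. This is only a sketch: it assumes each entry begins on its own line with a `$` character, and the function name and sample lines are hypothetical rather than taken from the LRV data or the repository's scripts.

```python
def find_unmarked_entries(lines, marker="$--"):
    """Return (line_number, text) for lines that open with '$' but not with the marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if line.startswith("$") and not line.startswith(marker)
    ]

# Hypothetical sample lines for illustration.
sample = [
    "$--aMSa xxx body\n",
    "$//With ava body\n",
    "plain continuation\n",
    "$--aMSaka m. body\n",
]
suspects = find_unmarked_entries(sample)
```

On a real file, passing the open file object (for example `open(path, encoding="utf-8")`) in place of `sample` would list the line numbers to review by hand.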