Closed drdhaval2785 closed 3 years ago
When I tried to recreate the babylon files from current XML files
How many times per year do you do it?
The last time was 11 months ago.
Any idea why this occurs?
Also, what is the source you use for xml files -- do you directly recreate them from the pywork/xxx.xml files?
I think I know why this happens.
The dig_xml function in make_xml.py replaces the broken bar with a space. Usually there is already a space after the broken bar, so when the broken bar is replaced by a space, we get two consecutive spaces in the XML.
https://github.com/sanskrit-lexicon/csl-pywork/blob/f4dbde8ceaf4dd79e41c6ead7a0c7549bf2e5c89/v02/distinctfiles/snp/pywork/make_xml.py#L63 seems to be the line to be changed.
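To make the effect concrete, here is a minimal sketch. It is not the actual code at the linked line; the function names and sample strings are invented for illustration, and it assumes the broken bar is the character U+00A6.

```python
# Illustrative sketch only: NOT the real dig_xml code from make_xml.py.
# Assumption: the broken bar is U+00A6 and is usually followed by a space
# (snp-like data) or by a comma (skd-like data).
BROKEN_BAR = "\u00a6"


def bar_to_space(line):
    """Replace the broken bar with a space (the behaviour described above)."""
    return line.replace(BROKEN_BAR, " ")


def bar_to_nothing(line):
    """Drop the broken bar entirely (one possible change)."""
    return line.replace(BROKEN_BAR, "")


# Hypothetical sample lines, not taken from any dictionary.
snp_like = "headword" + BROKEN_BAR + " rest of entry"
skd_like = "headword" + BROKEN_BAR + ", rest of entry"

for sample in (snp_like, skd_like):
    print(repr(bar_to_space(sample)), "vs", repr(bar_to_nothing(sample)))
```

With the first function the snp-like line ends up with two consecutive spaces and the skd-like line with a space before the comma; with the second, neither artifact appears.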
I recreated xxx.xml by redo.sh from xxx.orig files.
When I was working on a change to make_xml.py yesterday for BUR, I began by making the change to distinctfiles/bur/pywork/make_xml.py. But when I regenerated bur, the change I made did not show up in the display! After some investigation, I noticed (in v02/inventory.txt) that make_xml.py is now a template:
; 10-11-2019: Changed make_xml.py from 'CD' to 'T'
*:pywork/make_xml.py:T
In other words, we started with make_xml.py as distinctfiles; then (in October) I was able to get all the differences among dictionaries taken into account in a template.
It is confusing to have those distinctfile make_xml.py programs -- we both were fooled by their presence.
Today I am going to change the names of all the distinctfile make_xml.py programs to unused_make_xml.py. Later, we can just delete the distinctfile unused_make_xml.py programs.
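For reference, the rename could be scripted roughly as below. This is only a sketch: the function name and the dry_run flag are made up, and it assumes the v02/distinctfiles/<dict>/pywork/make_xml.py layout seen in the URL earlier in this thread.

```python
from pathlib import Path


def rename_distinct_make_xml(distinct_root, dry_run=True):
    """Rename <distinct_root>/<dict>/pywork/make_xml.py to unused_make_xml.py.

    Returns the list of (old_path, new_path) pairs that were found.
    With dry_run=True nothing is changed on disk.
    """
    pairs = []
    # Assumed layout: one directory per dictionary, each with a pywork folder.
    for old in sorted(Path(distinct_root).glob("*/pywork/make_xml.py")):
        new = old.with_name("unused_make_xml.py")
        if not dry_run:
            old.rename(new)
        pairs.append((old, new))
    return pairs
```

Calling rename_distinct_make_xml("v02/distinctfiles") first with the default dry_run=True lists what would be renamed; inside a git checkout, `git mv` may be preferable so the history follows the file.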
@drdhaval2785 Currently, I don't know how to recreate the stardict files. I suspect that you have scripts on your local installation that deal with this.
It would be good to have this process in a repository. One suggestion would be to create a new sanskrit-lexicon/csl-stardict repository just for the purpose of regenerating the stardict files from Cologne data.
If your recreation process uses the cologne xxx.xml files generated by make_xml.py, then the redo.sh of csl-stardict could directly use the ../xxx/pywork/xxx.xml files of the local installation.
From your comment above, I think you would like to avoid having two spaces that often occur in xxx.xml due to the handling of the broken bar when converting from xxx.txt.
I'll take a look at where this adjustment should be made, once you have the stardict regeneration up in a repository so I can recreate locally.
@funderburkjim, is https://github.com/sanskrit-lexicon/cologne-stardict the repo you asked for?
When I tried to recreate the babylon files from the current XML files, one change struck me in the majority of dictionaries.
The broken bar is now replaced by a space (which was not the case earlier). This has resulted in double spacing (snp) or a space followed by a comma (skd).
I feel the broken bar should be replaced by nothing (an empty string) instead.
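A quick way to see how widespread the two symptoms are is to scan the generated text for them. The helper below is a rough sketch with an invented name; it flags any line containing two consecutive spaces or a space before a comma, so it can also report lines where such spacing is legitimate (for example, indentation).

```python
import re

# Two consecutive spaces, or a space immediately followed by a comma.
SPACING_ARTIFACT = re.compile(r"  | ,")


def find_spacing_artifacts(lines):
    """Return (line_number, line) pairs for lines showing either symptom."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if SPACING_ARTIFACT.search(line):
            hits.append((lineno, line.rstrip("\n")))
    return hits
```

It accepts any iterable of strings, so an open file object for xxx.xml can be passed in directly.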