suttacentral / bilara

Our Computer Aided Translation software

Bilara i/o fail: add debaked Vinaya files #39

Closed by sujato 5 years ago

sujato commented 5 years ago

As with the previously debaked Sutta files, certain Vinaya texts require debaking.

Debaking is required when the file name is a range, but the segments within that file can best be expressed as individual texts within that range. This applies in the case of the Vinaya, since we want individual rules to link to specific places.

I have filed the debaked text for bi-vb-pd as a separate issue, #37.

This issue applies to three texts, namely

I have exported them via bilara i/o and the corrected data lives as ods files in /.scripts.

https://github.com/suttacentral/bilara-data/tree/master/.scripts

However, the import fails with the following message:

./sheet_import.py pli-tv-bi-vb-as1-7.ods
Traceback (most recent call last):
  File "./sheet_import.py", line 69, in <module>
    file = get_file(uid, field)
  File "./sheet_import.py", line 50, in get_file
    raise ValueError('Could not find file for {}_{}'.format(uids, muids))
NameError: name 'uids' is not defined
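
The NameError above hides the error the script meant to report: the `raise` on line 50 formats its message with `uids`, a name that is not defined in that scope, so the intended ValueError ("Could not find file for ...") never surfaces. A minimal sketch of a lookup whose message uses only names that are in scope is given here. The `file_index` dictionary and the parameter names are assumptions for illustration, not the actual contents of sheet_import.py.

```python
def get_file(uid, muids, file_index):
    """Return the path registered for (uid, muids), or raise ValueError.

    file_index is a hypothetical dict mapping (uid, muids) tuples to paths;
    the real script may locate files differently.
    """
    try:
        return file_index[(uid, muids)]
    except KeyError:
        # Format the message with the function's own parameters so the
        # lookup failure is reported instead of a NameError.
        raise ValueError(
            'Could not find file for {}_{}'.format(uid, muids)
        ) from None
```

With a message like this, the failure would at least name the uid and field that could not be matched to a file.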

blake-sc commented 5 years ago

I've completed this.

Note that, for internal reasons, sheet_import.py is not a suitable tool for this kind of task, because it bases the changes it makes on the segment ids (i.e. it uses the segment ids to know what to change). One example of how this could go wrong: it is possible to export a spreadsheet which contains only a limited subset of the data, and the expectation is that the import would update only the related files.
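
To illustrate the point, here is a minimal sketch of an import that is keyed purely on segment ids. It is an assumption about the general approach, not the real sheet_import.py code, and the segment ids are made up. Because a partial spreadsheet is valid input, a row whose id has changed cannot be told apart from a new row, so the old id is left in place.

```python
def import_rows(existing, rows):
    """Apply spreadsheet rows to a segment dict, matching on segment id.

    existing: dict of segment id -> text (hypothetical in-memory file).
    rows: iterable of (segment id, text) pairs from the spreadsheet.
    """
    updated = dict(existing)
    for seg_id, text in rows:
        # The id is the only key available for deciding what to change.
        updated[seg_id] = text
    return updated


existing = {"example1-2:1.1": "first rule", "example1-2:2.1": "second rule"}

# A spreadsheet holding only one row updates just that segment.
partial = import_rows(existing, [("example1-2:1.1", "first rule, revised")])

# A row with a changed id is added as a new segment; the old id remains.
renamed = import_rows(existing, [("example1:1.1", "first rule")])
```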

Since changing segment ids is "weird", I wrote a new script, renumber_json_segments.py. It takes two parameters, which should be the original spreadsheet and the spreadsheet with the changed segment ids, i.e.
renumber_segment_ids.py --original org/pli-tv-bi-vb-sk1-75.ods pli-tv-bi-vb-sk1-75.ods

It compares the segment ids in the two spreadsheets and then propagates that change in segment id to all files. I considered building this functionality into the import script, but it's easier to validate that a script which does just one thing is working correctly.
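
The compare-and-propagate idea can be sketched roughly as follows. This is not the script itself: the function names are hypothetical, the segment ids are invented, reading the .ods files is left out, and it assumes the two spreadsheets list their rows in the same order so that ids can be paired by position.

```python
def build_id_mapping(original_ids, changed_ids):
    """Pair ids row by row and keep only the ones that actually changed."""
    if len(original_ids) != len(changed_ids):
        raise ValueError("Both spreadsheets must have the same number of rows")
    return {
        old: new
        for old, new in zip(original_ids, changed_ids)
        if old != new
    }


def renumber_segments(segments, mapping):
    """Return a copy of one file's segments with the ids renamed.

    Ids missing from the mapping are kept as they are; order is preserved.
    """
    return {mapping.get(seg_id, seg_id): text for seg_id, text in segments.items()}


# Hypothetical segment id columns from the two spreadsheets.
original_ids = ["example1-2:0.1", "example1-2:1.1", "example1-2:2.1"]
changed_ids = ["example1-2:0.1", "example1:1.1", "example2:1.1"]
mapping = build_id_mapping(original_ids, changed_ids)

# The same mapping would be applied to every file that shares these ids.
root = {"example1-2:0.1": "heading", "example1-2:1.1": "rule one", "example1-2:2.1": "rule two"}
renumbered = renumber_segments(root, mapping)
```

Keeping the mapping step separate from the rewrite step makes it possible to inspect the old-to-new id pairs before any file is touched.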

sujato commented 5 years ago

Okay, thanks. Can you please write explicit instructions on how to use this in the README? If there is just the code with no instructions, it might as well not exist.

sujato commented 5 years ago

Looks good.