Closed irahopkinson closed 7 months ago
Agree with this. I'll be adding a PR for usx2usj.py
@kavitharaju @joelthe1 Please consider what Ira has pointed out. I think the strip() should be removed here (line 52). Would you agree?
if child.tail and child.tail.strip() != "":
    out_obj['content'].append(child.tail.strip())
Thanks @klassenjm. It isn't quite as simple as just removing the strip() (at least it wasn't in the JS version). If it helps, I took the JS version from here that I think @kavitharaju wrote, turned it into TS and fixed up the whitespace handling. A JS version is available in our NPM package. BTW, is there a JS version (or even Python) that goes back the other way, i.e. USJ to USX?
Actually just looking at the Python, it might be as simple as just removing the strip from https://github.com/usfm-bible/tcdocs/blob/c2188e076796e98f925074261b2b2cdc042f8a59/python/scripts/usx2usj.py#L33 and https://github.com/usfm-bible/tcdocs/blob/c2188e076796e98f925074261b2b2cdc042f8a59/python/scripts/usx2usj.py#L50
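To illustrate the difference, here is a minimal sketch, not the actual usx2usj.py. The `convert` function and its `keep_whitespace` flag are hypothetical names for this example, and the output shape is simplified. It assumes the change is only to append the text unstripped while still skipping whitespace-only text.

```python
import xml.etree.ElementTree as ET


def convert(element, keep_whitespace=True):
    """Convert an XML element to a simplified USJ-like dict (hypothetical helper)."""
    out_obj = {"type": element.tag, "content": []}

    # Leading text of the element, before any child element.
    if element.text and element.text.strip() != "":
        text = element.text if keep_whitespace else element.text.strip()
        out_obj["content"].append(text)

    for child in element:
        out_obj["content"].append(convert(child, keep_whitespace))
        # Text that follows the child inside the same parent.
        if child.tail and child.tail.strip() != "":
            tail = child.tail if keep_whitespace else child.tail.strip()
            out_obj["content"].append(tail)

    return out_obj


para = ET.fromstring(
    '<para style="p">In the <char style="nd">Lord</char> we trust</para>'
)
stripped = convert(para, keep_whitespace=False)
kept = convert(para, keep_whitespace=True)

print(stripped["content"][0] + "|")  # trailing space before the <char> is lost
print(kept["content"][0] + "|")      # trailing space is retained
```

With stripping, the spaces on either side of the char element disappear, so the words would run together when the content is joined back up. This sketch still drops whitespace-only text nodes. Whether those should be kept is a separate question that it does not try to answer.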
@mhosken Could you give direction to @irahopkinson about the Python tools for USFM<>USX<>USJ transformation?
Have sent PR #65 with the suggested change in whitespace handling.
@irahopkinson, the scripts in the https://github.com/mvh-solutions/nice-usfm-json repository are no longer maintained. We used that space for collaborating within the working group during the early stages of forming the USJ specs. The recent work is in this repo: https://github.com/usfm-bible/tcdocs
@kavitharaju thanks, I figured it was something like that already and have mostly been looking in the https://github.com/usfm-bible/tcdocs repo since then. I now need to update my code for the "USJ-0.0.1" schema. I notice the updated origin.json files in #65 have their version at 0.2.1. Is there a relationship between the version in the USJ files and the version in the schema?
@kavitharaju and @mhosken also, in the schema, now that you have split type into type and marker, should marker also be required? https://github.com/usfm-bible/tcdocs/blob/6b86da6ce7fe4972ef01d663c62e185670b3ade8/grammar/usj.js#L62
@irahopkinson Thank you for pointing out the version mismatch in the schema. We keep bumping the version in the origin.json files as we make fixes or other changes to USJ now. The schema, since it is very high-level, has not been updated much, and we missed bumping the version there. But I guess that schema still holds for the USJ samples, except maybe for small details like the "required-marker".
Yes, marker is also required now as per the update we made splitting the type field.
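For anyone updating their code, a rough way to spot objects that lack the now-required field might look like the sketch below. It is not the official schema validation. `REQUIRED_KEYS` and `find_missing` are hypothetical names, and it assumes every object in the tree, including nested content objects, must carry both type and marker, which may not match the schema exactly.

```python
# Assumption based on this thread: both keys are required on each object.
REQUIRED_KEYS = ("type", "marker")


def find_missing(node, path="root"):
    """Return (path, key) pairs for objects missing a required key."""
    problems = []
    if isinstance(node, dict):
        for key in REQUIRED_KEYS:
            if key not in node:
                problems.append((path, key))
        # Recurse into nested content; plain strings are skipped.
        for i, child in enumerate(node.get("content", [])):
            problems.extend(find_missing(child, f"{path}.content[{i}]"))
    return problems


# Made-up sample for illustration; the inner object lacks "marker".
sample = {
    "type": "para",
    "marker": "p",
    "content": ["In the ", {"type": "char", "content": ["Lord"]}, " we trust"],
}
problems = find_missing(sample)
print(problems)
```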
AFAIU whitespace inside elements is meaningful in USX, so the corresponding USJ should retain that whitespace.