Open birchamp opened 2 years ago
Below is a re-write of the conditions stated above:
Currently, gE scripture cards can show USFM files from Book Package Repos.
The user enters the URL for the repo and gE displays the currently selected book in the scripture card.
@birchamp In short, is the goal to make the following work?
Yes @theNerd247 that's correct
@birchamp A tCore project will also have to be opened as read-only in GWE - otherwise there could be a big mess if someone tries to edit and merge.
@birchamp @theNerd247 We should probably meet with @richmahn to talk through the best way to determine if it is a tCore project. Note in the example given there are two USFM files in the root folder. The one to use is en_kjv_eph_book.usfm, which follows the same naming convention as the repo name. But we probably should use the manifest file to determine that it is a tCore project.
I agree. The manifest seems like the best place since we’re already grabbing that and parsing it for meta data (for translation notes, etc.)
@PhotoNomad0 @birchamp Are there naming standards for scripture resources? The reason this bug exists is:
manifest.yaml
This is what works: https://git.door43.org/api/v1/repos/birch/en_kjv_eph_book/contents/manifest.json?ref=master
GE looks for this: https://git.door43.org/api/v1/repos/birch/en_kjv/contents/manifest.yaml?ref=master
By "works" I mean manually querying the git.door43.org API
The parser for the url is found at: single-scripture-rcl. I'm not sure if fixing the parser is the correct approach here or if the repo given above should have its name changed. @birchamp I looked through the git log and noticed that the repo you gave was ported from some legacy system.
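To make the mismatch concrete, here is a minimal sketch of building the contents-API URL from a repo URL while keeping the full repo name. It is not the single-scripture-rcl parser; the function name `contents_api_url` and the `/<owner>/<repo>` path assumption are illustrative only. The endpoint shape is copied from the two URLs quoted above.

```python
from urllib.parse import quote, urlparse

# Endpoint root taken from the URLs quoted in this thread.
API_ROOT = "https://git.door43.org/api/v1"


def contents_api_url(repo_url: str, path: str, ref: str = "master") -> str:
    """Build a contents-API URL for `path` in the repo that `repo_url` points at.

    Assumes the URL path starts with /<owner>/<repo>; anything after that
    (for example /src/branch/master) is ignored.
    """
    parts = [p for p in urlparse(repo_url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected /<owner>/<repo> in {repo_url!r}")
    owner, repo = parts[0], parts[1]
    # The repo name is used whole, so en_kjv_eph_book is not cut down to en_kjv.
    return (
        f"{API_ROOT}/repos/{quote(owner)}/{quote(repo)}"
        f"/contents/{quote(path)}?ref={quote(ref)}"
    )
```

Called with the sample repo URL and `manifest.json`, this yields the URL listed above as "what works".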
Do we have documentation of where tC encodes its info and where gE decodes it?
@theNerd247 @birchamp I checked into the specs for the tC manifest.json and discovered that there is none recorded. The plan was to transition from the manifest.json in tC to the resource container format in http://resource-container.readthedocs.io/en/v0.2/manifest.html . But that never happened. We started off with the manifest.json of translationStudio (see https://ts-info.readthedocs.io/en/latest/manifest.html?highlight=manifest). But the format changes were not documented - just codified.
That being said, there really isn't much point in looking in the manifest.json anyway - there is no reference to the contained usfm that matches the tCore repo name (e.g. en_ult_sng_book.usfm in https://git.door43.org/Grant_Ailie/en_ult_sng_book).
Here is the naming spec for repos on Door43: https://git.door43.org/unfoldingWord/registry
Suggestions for url validations going forward:
- If the user enters a URL that points directly to a USFM file, then we check if the USFM file exists. If so, we will use it. (This behavior is the same as tCore.)
- When the user enters a URL that points to a repo and that repo does not contain a manifest.yaml, then we check if the repo contains a USFM file that matches the repo name. If so, we will use the URL that points to the USFM file itself.
Then at GWE run-time when we see that the user selected a USFM URL:
- If it is not loadable, we show the usual "content is not found" message.
- Since USFM is the wild west, we should check the header to make sure there is at least an \id field with a valid book code.
- Then in GWE we only show the content when they have navigated to the book that matches the id in the USFM file.
Hey @PhotoNomad0! Thanks for finding this. Let me dig through the code for single-scripture-rcl to see what it would take to make these changes.
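The validation suggestions above could be sketched roughly as follows. This is an illustration only, not code from gE or single-scripture-rcl: the function names, the `root_files` set (standing in for a listing of the repo's root directory), and the tiny `KNOWN_BOOK_CODES` set are all assumptions made for the example.

```python
import re
from typing import Optional, Set

# Hypothetical subset of book codes, for illustration only.
KNOWN_BOOK_CODES = {"gen", "sng", "mat", "eph", "tit", "rev"}

# Matches a USFM header line such as "\id EPH ..." and captures the 3-character code.
ID_LINE = re.compile(r"^\\id\s+(\w{3})\b", re.MULTILINE)


def pick_usfm_file(
    repo_name: str, root_files: Set[str], requested_file: Optional[str] = None
) -> Optional[str]:
    """Return the USFM file to load, or None if the suggestions do not apply."""
    if requested_file is not None:
        # The URL pointed directly at a USFM file: use it only if it exists.
        return requested_file if requested_file in root_files else None
    if "manifest.yaml" in root_files:
        # Repo has a manifest.yaml, so the existing loading path handles it.
        return None
    # No manifest.yaml: fall back to the USFM file named after the repo.
    candidate = f"{repo_name}.usfm"
    return candidate if candidate in root_files else None


def book_id_from_usfm(usfm_text: str) -> Optional[str]:
    """Return the lower-cased book code from the \\id line, or None if missing or unknown."""
    match = ID_LINE.search(usfm_text)
    if match is None:
        return None
    code = match.group(1).lower()
    return code if code in KNOWN_BOOK_CODES else None
```

A caller would show the "content is not found" message whenever either function returns `None`, and otherwise display the text only while the returned book code matches the book selected in GWE.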
I talked to Benjamin from the content team today about the need for importing from repos created in tCore. He's not aware of any immediate need for it but has started a thread with the content team on Zulip.
@birchamp Could you give more insight into why this feature is needed? I'd like to understand that first before starting the work on backwards compatibility (per @PhotoNomad0's comment above)
After more dialog with the content team, I am discovering that the "real" issue is the synchronization of work between tC and gateway-edit in git. Are the mechanisms used by tCore potentially outdated? I realize this issue has a longer history that I don't know, and it seems to be either a small "hack" to resolve a pressing problem or a small step towards migrating users away from tCore.
I'm wondering if it would be worth having a discussion about which features our users are still using in tCore and if we might better spend our time porting those features over to gateway edit. Some of these features include:
n >= 3 versions of a given text
@birchamp Could you provide some direction regarding this?
@theNerd247 It was in a content meeting that they said that when someone makes changes in tC, it would be good to be able to see them in gE. But I'm wondering whether this will be needed at all once gE gets scripture merging. I looked through the chat on Zulip and I think the same thing applies: we need to get editing and merging working and then see if they still want the feature.
@theNerd247 I'm deprioritizing this for now.
@birchamp In the TOT meeting with content team (refer to the recording), it was mentioned several times it would help if they could view other bibles in the GWE as they are working and not have to open a different tool.
Notes:
tCore book repos:
en_kjv_eph_book
en_kjv_eph_book.usfm
whereas - scripture-resource-rcl is hard coded to load scripture repos that:
en_ult)
@PhotoNomad0 the metadataType property of an API repo or catalog entry object will tell you what it is: rc, tc, sb or ts
metadataVersion also may help
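As a rough illustration of that check, assuming the repo or catalog entry has already been fetched and decoded into a dict: the key name `metadataType` is taken from the wording of the comment above, and the exact key in the JSON response may differ.

```python
from typing import Any, Mapping


def is_tcore_project(repo: Mapping[str, Any]) -> bool:
    """True when the decoded repo/catalog entry is marked as a tC project.

    Assumption: the type is exposed under the key "metadataType" with one of
    the values rc, tc, sb or ts, as described in the comment above.
    """
    return repo.get("metadataType") == "tc"
```

With a check like this, gE would not need to inspect the manifest itself to decide whether to take the tCore path.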
DCS determines metadataType by unmarshalling the manifest or metadata file into various Go structs. When it gets one that matches, it knows the metadata type and version and gives that to you from the API.
Make sure to also query the repos available with metadataType= (multiple values are ORed) for what your app supports.
https://git.door43.org/api/v1/repos/search?metadataType=rc&metadataType=tc etc.
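A small sketch of building that search URL with a repeated `metadataType` parameter; the helper name `search_url` is made up for the example, and the endpoint is the one quoted above.

```python
from typing import Iterable
from urllib.parse import urlencode

# Search endpoint as quoted in the comment above.
SEARCH_ENDPOINT = "https://git.door43.org/api/v1/repos/search"


def search_url(metadata_types: Iterable[str]) -> str:
    """Build a search URL that repeats metadataType once per supported type."""
    # A list of pairs lets urlencode emit the same key several times.
    query = urlencode([("metadataType", t) for t in metadata_types])
    return f"{SEARCH_ENDPOINT}?{query}"
```

Passing `["rc", "tc"]` reproduces the example URL above, minus the trailing "etc.".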
Currently, gE scripture cards can show USFM files from Book Package Repos. The user enters the URL for the repo and gE displays the currently selected book in the scripture card.
Additionally, users should be allowed to enter the address of a tC project that has been uploaded to DCS in order to view the scripture from that project in gE, if the project contains the matching book to what is selected in the reference bar. If the book of the Bible in the project does not match the selected book in gE, then the scripture card will show the standard message when data is unavailable. Sample repo: https://git.door43.org/birch/en_kjv_eph_book/src/branch/master
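One way the matching rule could look, as a sketch only: it assumes tC repo names follow the `<lang>_<resource>_<book>_book` shape seen in the two examples in this thread (en_kjv_eph_book, en_ult_sng_book). The function names are hypothetical, and a real implementation might read the book from the USFM `\id` line instead.

```python
import re
from typing import Optional

# Assumed naming shape, inferred from en_kjv_eph_book and en_ult_sng_book.
TC_REPO_NAME = re.compile(
    r"^(?P<lang>[^_]+)_(?P<resource>[^_]+)_(?P<book>[^_]+)_book$"
)


def book_from_repo_name(repo_name: str) -> Optional[str]:
    """Return the lower-cased book code embedded in a tC repo name, if any."""
    match = TC_REPO_NAME.match(repo_name)
    return match.group("book").lower() if match else None


def card_shows_scripture(repo_name: str, selected_book: str) -> bool:
    """True when the project's book matches the book in the reference bar."""
    book = book_from_repo_name(repo_name)
    return book is not None and book == selected_book.lower()
```

When `card_shows_scripture` returns `False`, the scripture card would fall back to the standard data-unavailable message.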
DoD:
Entering the top-level DCS URL for a tC project in the scripture picker will allow the user to view the scripture from that project in the scripture card.
Details:
@PhotoNomad0 notes that we already have a way to know if a repo is a tC project. The USFM file with alignment data is stored in the root directory of the project repo.