unfoldingWord / gateway-edit

Book Package harmonized view.
https://gatewayedit.com
MIT License

Users can choose to view scripture from tC projects uploaded to DCS #342

Open birchamp opened 2 years ago

birchamp commented 2 years ago

Currently, gE scripture cards can show USFM files from Book Package Repos. The user enters the URL for the repo and gE displays the currently selected book in the scripture card.
Additionally, users should be allowed to enter the address of a tC project that has been uploaded to DCS in order to view the scripture from that project in gE, if the project contains the matching book to what is selected in the reference bar. If the book of the Bible in the project does not match the selected book in gE, then the scripture card will show the standard message when data is unavailable. Sample repo: https://git.door43.org/birch/en_kjv_eph_book/src/branch/master

DoD:

Entering the top-level DCS URL for a tC project in the scripture picker will allow the user to view the scripture from that project in the scripture card.

Details:

@PhotoNomad0 notes that we already have a way to know if a repo is a tC project. The USFM file with alignment data is stored in the root directory of the project repo.

theNerd247 commented 1 year ago

Below is a re-write of the conditions stated above:

Currently, gE scripture cards can show USFM files from Book Package Repos.

The user enters the URL for the repo and gE displays the currently selected book in the scripture card.

theNerd247 commented 1 year ago

@birchamp In short, is the goal to make the following work?

[Screenshot 2023-09-07 at 1 53 21 PM]

birchamp commented 1 year ago

Yes @theNerd247 that's correct

PhotoNomad0 commented 1 year ago

@birchamp A tCore project will also have to be opened as read-only in GWE - otherwise there could be a big mess if someone tries to edit and merge.

PhotoNomad0 commented 1 year ago

@birchamp @theNerd247 We should probably meet with @richmahn to talk through the best way to determine if it is a tCore project. Note in the example given there are two USFM files in the root folder. The one to use is en_kjv_eph_book.usfm, which follows the same naming convention as the repo name. But we probably should use the manifest file to determine that it is a tCore project.

theNerd247 commented 1 year ago

I agree. The manifest seems like the best place since we’re already grabbing that and parsing it for meta data (for translation notes, etc.)

theNerd247 commented 1 year ago

@PhotoNomad0 @birchamp Are there naming standards for scripture resources? The reason this bug exists is:

  1. the URL parser that gateway-edit is using is not parsing the repo name correctly
  2. the manifest file being fetched is a manifest.yaml
    This is what works:    https://git.door43.org/api/v1/repos/birch/en_kjv_eph_book/contents/manifest.json?ref=master
    GE looks for this:     https://git.door43.org/api/v1/repos/birch/en_kjv         /contents/manifest.yaml?ref=master

By "works" I mean manually querying the git.door43.org API
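A minimal sketch of the URL handling described above, in Python for brevity. It is not the single-scripture-rcl parser. The helper names are hypothetical, and the fallback to the `master` branch is an assumption. It splits a repo URL into owner, repo, and branch without truncating the repo name, then builds the contents API URL in the form quoted above.

```python
from urllib.parse import urlparse

# API root taken from the URLs quoted in this comment.
API_BASE = "https://git.door43.org/api/v1"


def parse_repo_url(url):
    """Split a DCS repo URL into (owner, repo, branch). Hypothetical helper."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repo URL: {url}")
    owner, repo = parts[0], parts[1]  # keep the full repo name, e.g. en_kjv_eph_book
    branch = "master"  # assumption: fall back to master when no branch is given
    # Browser URLs look like /owner/repo/src/branch/<name>
    if len(parts) >= 5 and parts[2] == "src" and parts[3] == "branch":
        branch = parts[4]
    return owner, repo, branch


def manifest_url(url, filename="manifest.json"):
    """Build the contents API URL for a manifest file in the repo."""
    owner, repo, branch = parse_repo_url(url)
    return f"{API_BASE}/repos/{owner}/{repo}/contents/{filename}?ref={branch}"


print(manifest_url("https://git.door43.org/birch/en_kjv_eph_book/src/branch/master"))
```

Passing `filename="manifest.yaml"` gives the Book Package variant, so a caller could try one and then the other.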

theNerd247 commented 1 year ago

The parser for the url is found at: single-scripture-rcl. I'm not sure if fixing the parser is the correct approach here or if the repo given above should have its name changed. @birchamp I looked through the git log and noticed that the repo you gave was ported from some legacy system.

  1. Are there still legacy artifacts left behind that need to be cleaned up?
  2. Are there other repos that can serve as example tC projects uploaded to DCS?

Is there a documented place where tC encodes its info and where gE decodes it?

PhotoNomad0 commented 1 year ago

@theNerd247 @birchamp I checked into the specs for the tC manifest.json and discovered that there is none recorded. The plan was to transition from the manifest.json in tC to the resource container format in http://resource-container.readthedocs.io/en/v0.2/manifest.html . But that never happened. We started off with the manifest.json of translationStudio (see https://ts-info.readthedocs.io/en/latest/manifest.html?highlight=manifest). But the format changes were not documented - just codified.

That being said, there really isn't much point in looking in the manifest.json anyway - there is no reference to the contained usfm that matches the tCore repo name (e.g. en_ult_sng_book.usfm in https://git.door43.org/Grant_Ailie/en_ult_sng_book)

Here is the naming spec for repos on Door43: https://git.door43.org/unfoldingWord/registry

Suggestions for url validations going forward:

  1. If the user enters a url that points directly to a usfm file, then we check if the usfm file exists. If so, we will use it (this behavior is the same as tCore).
  2. When the user enters a url that points to a repo and that repo does not contain a manifest.yaml, then we check if the repo contains a usfm file that matches the repo name. If so, we will use the URL that points to the usfm file itself.

Then at GWE run-time when we see that the user selected a usfm url:

  1. If it is not loadable, we show the usual "content is not found" message.
  2. Since USFM is the wild west, we should check the header to make sure there is at least an \id field with a valid book code.
  3. In GWE we only show the content when the user has navigated to the book that matches the id in the usfm file.

theNerd247 commented 1 year ago

Hey @PhotoNomad0! Thanks for finding this. Let me dig through the code for single-scripture-rcl to see what it would take to make these changes.
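The checks @PhotoNomad0 suggested could be sketched roughly as follows, in Python for brevity. This is not gE code. All function names are hypothetical, `BOOK_CODES` is a small illustrative subset rather than the full USFM list, and comparing the selected book case-insensitively is an assumption.

```python
import re

# Illustrative subset only; a real check would use the full USFM book code list.
BOOK_CODES = {"GEN", "EXO", "PSA", "SNG", "MAT", "EPH", "REV"}

# Matches an \id marker at the start of a line, followed by a 3-character code.
ID_RE = re.compile(r"^\\id\s+([A-Z0-9]{3})\b", re.MULTILINE)


def expected_usfm_name(repo_name):
    """A tCore repo stores its aligned USFM under the repo's own name."""
    return f"{repo_name}.usfm"


def usfm_book_code(usfm_text):
    """Return the book code from the \\id header, or None if missing or unknown."""
    match = ID_RE.search(usfm_text)
    if match is None:
        return None
    code = match.group(1)
    return code if code in BOOK_CODES else None


def should_show(usfm_text, selected_book):
    """Show content only when the USFM \\id matches the book selected in the app."""
    code = usfm_book_code(usfm_text)
    return code is not None and code == selected_book.upper()


sample = "\\id EPH Unlocked Literal Bible\n\\c 1\n\\v 1 ..."
print(expected_usfm_name("en_kjv_eph_book"), should_show(sample, "eph"))
```

When `should_show` returns False the caller would fall back to the usual "content is not found" message.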

theNerd247 commented 1 year ago

I talked to Benjamin from the content team today about the need for importing from repos created in tCore. He's not aware of any immediate need for it but has started a thread with the content team on Zulip.

@birchamp Could you give more insight into why this feature is needed? I'd like to understand that first before starting the work on backwards compatibility (per @PhotoNomad0's comment above).

theNerd247 commented 1 year ago

After more dialog with the content team, I am discovering that the "real" issue is the synchronization of work between tC and gateway-edit in git. The mechanisms used by tCore are potentially outdated? I realize this issue has a longer history than I know about, and it seems to be either a small "hack" to resolve a pressing problem or a small step towards migrating users away from tCore.

I'm wondering if it would be worth having a discussion about which features our users are still using in tCore and if we might better spend our time porting those features over to gateway edit. Some of these features include:

@birchamp Could you provide some direction regarding this?

birchamp commented 1 year ago

@theNerd247 In a content meeting they said that when someone makes changes in tC it would be good to be able to see them in gE. But I'm wondering whether this will be needed at all once gE gets scripture merging. I looked through the chat on Zulip and I think the same thing applies: we need to get editing and merging working and then see if they still want the feature.

birchamp commented 1 year ago

@theNerd247 I'm deprioritizing this for now.

PhotoNomad0 commented 10 months ago

@birchamp In the TOT meeting with content team (refer to the recording), it was mentioned several times it would help if they could view other bibles in the GWE as they are working and not have to open a different tool.

PhotoNomad0 commented 10 months ago

Notes:

richmahn commented 10 months ago

@PhotoNomad0 the metadataType property of an API repo or catalog entry object will tell you what it is: rc, tc, sb or ts

richmahn commented 10 months ago

metadataVersion also may help
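A minimal sketch of how a client might act on that property, in Python for brevity. The key name `metadataType` is taken from the comment above and may be spelled differently in the actual JSON, so it is a parameter here. `SUPPORTED_TYPES` and both function names are hypothetical.

```python
# Assumption: the types this app knows how to display.
SUPPORTED_TYPES = {"rc", "tc"}


def repo_kind(repo, key="metadataType"):
    """Return the lower-cased metadata type of a repo dict, or None if absent."""
    value = repo.get(key)
    if isinstance(value, str) and value:
        return value.lower()
    return None


def open_read_only(repo):
    """tCore projects are opened read-only, as proposed earlier in this thread."""
    return repo_kind(repo) == "tc"


repo = {"name": "en_kjv_eph_book", "metadataType": "tc"}
print(repo_kind(repo) in SUPPORTED_TYPES, open_read_only(repo))
```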

richmahn commented 10 months ago

> @birchamp @theNerd247 We should probably meet with @richmahn to talk through the best way to determine if it is a tCore project. Note in the example given there are two USFM files in the root folder. The one to use is en_kjv_eph_book.usfm, which follows the same naming convention as the repo name. But we probably should use the manifest file to determine that it is a tCore project.

DCS determines metadataType by unmarshalling the manifest or metadata file into various Go structs. When it gets one that matches, it knows the metadata type and version and gives that to you from the API.

richmahn commented 10 months ago

Make sure to also query the repos available with metadataType= (multiple are ordered) for what your app supports.

https://git.door43.org/api/v1/repos/search?metadataType=rc&metadataType=tc etc.
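A short sketch of building that query for a list of supported types, in Python for brevity. The function name is hypothetical, and only the URL shape shown above is taken from this thread.

```python
from urllib.parse import urlencode

# Search endpoint as written in the comment above.
SEARCH_URL = "https://git.door43.org/api/v1/repos/search"


def search_url(metadata_types):
    """Repeat the metadataType parameter once per supported type."""
    query = urlencode([("metadataType", t) for t in metadata_types])
    return f"{SEARCH_URL}?{query}"


print(search_url(["rc", "tc"]))
```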