bible-technology / scripture-burrito

Scripture Burrito Schema & Docs 🌯
http://docs.burrito.bible/
MIT License

Use Cases: Distinguish Project Repository from Resource Repository? #210

Closed jonathanrobie closed 3 years ago

jonathanrobie commented 4 years ago

This is a general issue for how we write and think about use cases and requirements.

To me, the lack of a distinction between projects and resources is problematic. A resource needs to be able to identify the latest authoritative copy of a project, which may be in various systems. This project may change at a different rate than the resource.

In our existing systems, the project repository is distinct from the resource repository.

For Paratext, I think it is helpful to distinguish the client from the central project repository. The Paratext client will always use distributed source code control to communicate with the project repository. SB is useful for two kinds of exchange:

  1. Exchanging projects with other project repositories or editors (import / export a project) - as a project
  2. Uploading the latest revision of a project to the resource repository (DBL) - as a resource

To know where to find the latest authoritative project text for a resource, we need to know which project server owns it. This is equally true for another editing client, a parallel project repository, or a resource repository. The expectations for what another system does with a project may also be significantly different from the expectations for what another system does with a published resource.

So I propose that we introduce terms like 'project repository' and 'resource repository' and consider expanding our metadata accordingly.

smorrison commented 4 years ago

Reading through the SB spec and examples, I think we can model this now.

For example, when the project is being handled internally in PT, the document can describe both the project and its revision (the SHA hash of the current commit?). When the project is turned into a resource (i.e. "compiled" to USX for distribution), the current Paratext project (id, revision) is captured and a recipe spec ("convert to USX") is applied to turn the ingredients into USX.

DBL would then receive a burrito whose contents are USX. It would retain the Paratext (id, revision) pair, but apply a DBL label with our own (id, revision) pair.

When DBL processes the resource further for a licensed publisher, it would add further recipes to the list, but not remove the current one.

The full history of processing would be captured in the list of recipes, and the ultimate source of the project would be captured by the original identification section.
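As a rough illustration of the flow described above, here is a minimal Python sketch. The dictionary layout, the function names (`new_project`, `apply_recipe`, `add_label`), and all identifier values are hypothetical and are not taken from the SB schema; the sketch only shows the idea that identifiers and recipes accumulate and are never removed.

```python
from copy import deepcopy


def new_project(system, project_id, revision):
    """Create hypothetical metadata for a project held in one system."""
    return {
        "identification": {system: {"id": project_id, "revision": revision}},
        "recipes": [],
    }


def apply_recipe(metadata, recipe):
    """Return a copy of the metadata with one more recipe appended.

    Earlier recipes are kept, so the list records the processing history.
    """
    derived = deepcopy(metadata)
    derived["recipes"].append(recipe)
    return derived


def add_label(metadata, system, label_id, revision):
    """Return a copy with an extra (id, revision) label for another system.

    Labels from other systems, including the original one, are retained.
    """
    derived = deepcopy(metadata)
    derived["identification"][system] = {"id": label_id, "revision": revision}
    return derived


# Project as handled inside Paratext (all values are made up).
project = new_project("paratext", "pt-example", "abc123")

# "Compile" the project to USX for distribution.
resource = apply_recipe(project, "convert to USX")

# DBL keeps the Paratext pair and adds its own label.
dbl = add_label(resource, "dbl", "dbl-example", "1")

# Further processing for a licensed publisher adds another recipe.
published = apply_recipe(dbl, "prepare for licensed publisher")
```

In this sketch, `published["identification"]["paratext"]` still points at the original project and revision, while `published["recipes"]` lists every processing step in order.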

FoolRunning commented 4 years ago

I think we need to talk about this as a group. I don't understand why the distinction matters.

jag3773 commented 3 years ago

Related to #214 #213 #222 #75

jtauber commented 3 years ago

Just adding some points in response to Sean's comment (in light of more recent changes):

"the identification section can identify very precisely the owning server and how they refer to the SB."

As we're currently talking about it, it only identifies how a party refers to the SB. There is no notion of an "owning server".

"using recipeSpecs and recipes, we can model how a resource is derived from a source project."

We've abandoned recipeSpecs as formal models of derivation and are now just considering having human-readable documentation about the derivation.

jonathanrobie commented 3 years ago

Let me think in terms of the Autographa use case. Someone creates an SB using Autographa and uploads it to both the Paratext ecosystem and the unfoldingWord ecosystem. It has a unique identifier created by Autographa.

This resource can be edited in both ecosystems. Someone can forget to check it in, and you could imagine two different teams getting out of sync while editing the same resource, creating branches without realizing they are doing that. Is that OK? Is there anything we can do about it?
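To make the "out of sync" case concrete, here is a small Python sketch of one way such a situation could be detected. It assumes each ecosystem keeps an ordered list of revision identifiers going back to the burrito Autographa created; nothing in this thread says SB records such a history, and `compare_histories` and the revision names are hypothetical.

```python
def compare_histories(a, b):
    """Compare two revision histories, each ordered oldest to newest.

    Returns one of: "in sync", "first is behind", "second is behind",
    or "diverged" (both sides have revisions the other lacks).
    """
    # Count how many leading revisions the two histories share.
    common = 0
    for rev_a, rev_b in zip(a, b):
        if rev_a != rev_b:
            break
        common += 1

    if common == len(a) == len(b):
        return "in sync"
    if common == len(a):
        return "first is behind"
    if common == len(b):
        return "second is behind"
    return "diverged"


# Hypothetical histories for the same Autographa-created burrito.
paratext_side = ["r1", "r2", "r3-paratext"]
unfoldingword_side = ["r1", "r2", "r3-unfoldingword"]
status = compare_histories(paratext_side, unfoldingword_side)
```

Here `status` is `"diverged"`, because both teams added a different revision after `r2`. A check like this only reports the branch; it does not decide which copy is authoritative.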

jonathanrobie commented 3 years ago

In the Paratext world, if we know that the source burrito for a derived DBL resource has a Paratext identifier, we probably have everything we need, don't we?