bible-technology / scripture-burrito

Scripture Burrito Schema & Docs 🌯
http://docs.burrito.bible/
MIT License

Recipe Specs #213

Closed jonathanrobie closed 3 years ago

jonathanrobie commented 4 years ago

This use case focuses on what Sean intends to do in DBL with respect to our current Recipe Spec.

For DBL, the purpose of a recipe spec is to provide meaningful logs that tell people which transformations have been applied by which processors, not to provide a way to reproduce those transformations. It is for reporting, not for interchange, so that a publisher can identify the source and what has been done to it. Paratext would not need to know about these to provide a text.

Order is significant and must be preserved.

The following would be used to filter a burrito: to remove publications that are not available under a given license, or to strip footnotes or other material from a publication when that material is not available under the license.

Under the current specification, this is what a recipe spec would look like:

{
  "variantId": "<sha hash provided by processor>",
  "processor": "https://thedigitalbiblelibrary.org/apply_license_agreement",
  "data": {"license_agreement": "https://thedigitalbiblelibrary.org/license_agreement/nnnn"}
}

For DBL, algorithmFormat and algorithm are not public data and would not be exchanged in burritos.

DBL would not expect Paratext or other scripture editing tools to provide recipe specs.
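
To illustrate the "log, not interchange" idea with order preserved, here is a minimal Python sketch. It assumes the entry shape shown above (`variantId`, `processor`, `data`); the helper names `append_step` and `describe`, and the `https://example.org/strip_footnotes` processor URL, are hypothetical and not part of the spec.

```python
def append_step(recipe_log, processor, data, variant_id):
    """Return a new log with one more entry appended; earlier entries keep their order."""
    entry = {"variantId": variant_id, "processor": processor, "data": data}
    return recipe_log + [entry]


def describe(recipe_log):
    """Produce a human-readable report of the steps, in the order they were applied."""
    return [f"{i}. {entry['processor']}" for i, entry in enumerate(recipe_log, start=1)]


log = []
log = append_step(
    log,
    "https://thedigitalbiblelibrary.org/apply_license_agreement",
    {"license_agreement": "https://thedigitalbiblelibrary.org/license_agreement/nnnn"},
    "<sha hash>",
)
# Hypothetical second processor, used only to show that order is kept.
log = append_step(log, "https://example.org/strip_footnotes", {}, "<sha hash>")

print("\n".join(describe(log)))
```

A list is used rather than a mapping keyed by processor because the same processor could be applied more than once and the sequence itself carries meaning.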

smorrison commented 4 years ago

Open questions for others in working group:

jonathanrobie commented 4 years ago

Would we be better off with an authority / name scheme, as in URNs?

smorrison commented 4 years ago

Alternative form for the snippet provided by Jonathan:

{
  "operation": "urn:sb:dbl:apply_license_agreement",
  "data":  {
    "license_agreement": "https://thedigitalbiblelibrary.org/license_agreement/nnnn" }
}
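
A rough Python sketch of how a consumer might read such an identifier. The layout `urn:sb:<authority>:<name>` is only inferred from the single example above and is an assumption, as is the function name `split_operation_urn`.

```python
def split_operation_urn(urn):
    """Split an operation URN into (authority, name).

    Assumes the layout urn:sb:<authority>:<name>, inferred from the one
    example in this thread; it is not a published scheme.
    """
    parts = urn.split(":")
    if len(parts) != 4 or parts[0] != "urn" or parts[1] != "sb":
        raise ValueError(f"not an sb operation URN: {urn!r}")
    authority, name = parts[2], parts[3]
    if not authority or not name:
        raise ValueError(f"empty authority or name in: {urn!r}")
    return authority, name


print(split_operation_urn("urn:sb:dbl:apply_license_agreement"))
```

Compared with the URL form, this separates "who defines the operation" from "what the operation is called" without implying that the identifier can be dereferenced.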

jonathanrobie commented 4 years ago

One more question: Should this be done via an extensibility mechanism? Will each processor have a potential need for logging data relevant to it in a format that makes sense to it?

If DBL decides to add to the information it reports, does it need to change the SB specification?

Is there an interoperability expectation among processors here?

jag3773 commented 4 years ago

Effectively, we'd like to do the following:

  1. Remove RecipeSpecs from the spec
  2. Modify Recipes (which only show up in variants) to take the following form:
"recipe": [
  {
    "idServer": "dbl",
    "operation": "apply_license_agreement",
    "data": {
      "license_agreement": "https://thedigitalbiblelibrary.org/license_agreement/nnnn"
    }
  }
]
  3. Possibly consider the confusion from "variant: source" and relationships expressing the actual source.
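
As a sketch of what checking the proposed form could look like, the following Python assumes that every step must carry all three keys (`idServer`, `operation`, `data`) and that `data` is an object. The proposal above does not say which keys are required, so treat that as an assumption; `check_recipe` is a hypothetical helper.

```python
REQUIRED_KEYS = ("idServer", "operation", "data")  # assumed required; not stated in the proposal


def check_recipe(recipe):
    """Return a list of problems found in a recipe; an empty list means none were found."""
    if not isinstance(recipe, list):
        return ["recipe must be a list so that step order is preserved"]
    problems = []
    for index, step in enumerate(recipe):
        if not isinstance(step, dict):
            problems.append(f"step {index}: expected an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in step:
                problems.append(f"step {index}: missing {key!r}")
        if "data" in step and not isinstance(step["data"], dict):
            problems.append(f"step {index}: 'data' must be an object")
    return problems


recipe = [
    {
        "idServer": "dbl",
        "operation": "apply_license_agreement",
        "data": {
            "license_agreement": "https://thedigitalbiblelibrary.org/license_agreement/nnnn"
        },
    }
]
print(check_recipe(recipe))
```

Because the contents of `data` are left to each processor, the check stops at "is it an object" and does not look inside.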

jtauber commented 3 years ago

I'm not actually sure how to map an existing recipe like the following into the form you suggest above:

  "recipe": {
    "spec": {
      "path": "recipe_specs/bitsOfBurrito.recipe_spec.json"
    },
    "content": [
      {
        "type": "section",
        "nameId": "ot",
        "content": [
          {
            "type": "element",
            "nameId": "intot",
            "ingredient": "release/text/USX_1/OTINT.usx"
          },
          {
            "type": "element",
            "nameId": "book-gen",
            "ingredient": "release/text/USX_1/GEN.usx"
          },
          {
            "type": "element",
            "nameId": "book-exo",
            "ingredient": "release/text/USX_1/EXO.usx"
          },
          {
            "type": "element",
            "nameId": "book-lev",
            "ingredient": "release/text/USX_1/LEV.usx"
          }
        ]
      },
      {
        "type": "section",
        "nameId": "nt",
        "content": [
          {
            "type": "element",
            "nameId": "intnt",
            "ingredient": "release/text/USX_1/NTINT.usx"
          },
          {
            "type": "element",
            "nameId": "intmat",
            "ingredient": "release/text/USX_1/INTMAT.usx"
          },
          {
            "type": "element",
            "nameId": "book-mat",
            "ingredient": "release/text/USX_1/MAT.usx"
          }
        ]
      }
    ]
  }
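
To make the mismatch concrete: as far as I can tell, the recipe above describes an ordered content structure (sections containing elements that point at ingredients) rather than a list of operations. A small Python sketch that walks an abbreviated copy of it and lists the ingredients in order; `ingredient_paths` is a hypothetical helper name.

```python
def ingredient_paths(node):
    """Yield ingredient paths in document order from a nested recipe node."""
    if node.get("type") == "element":
        yield node["ingredient"]
    # Sections (and the recipe root) hold their children under "content".
    for child in node.get("content", []):
        yield from ingredient_paths(child)


# Abbreviated copy of the recipe above, keeping the same keys and nesting.
recipe = {
    "content": [
        {"type": "section", "nameId": "ot", "content": [
            {"type": "element", "nameId": "intot", "ingredient": "release/text/USX_1/OTINT.usx"},
            {"type": "element", "nameId": "book-gen", "ingredient": "release/text/USX_1/GEN.usx"},
        ]},
        {"type": "section", "nameId": "nt", "content": [
            {"type": "element", "nameId": "intnt", "ingredient": "release/text/USX_1/NTINT.usx"},
        ]},
    ]
}

print(list(ingredient_paths(recipe)))
```

Nothing in this structure names an `idServer` or an `operation`, which is why a direct mapping to the proposed form is not obvious.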

jag3773 commented 3 years ago

Supersedes #181 and #157

jag3773 commented 3 years ago

The suggestion from our discussion today for point 3 above is to rewrite variant in https://github.com/bible-technology/scripture-burrito/blob/develop/schema/metadata.schema.json as a simple three-value enum called category.

If we want extra information about the derived category, that can be encoded in the recipe section, especially under data.

jtauber commented 3 years ago

Renamed variant to category and fixed as enum in 0f3bef23a897b69ff4241e88535dc45a05d2793a