Define recipeSpecs for source

FoolRunning commented 4 years ago

Now that I've seen what the recipe spec entails, I have a couple concerns:

The implementation of the recipe spec, by itself, looks like it could double the effort needed to implement all of SB (or more depending on what the client needs to do for UI to support it).
The main use-case for SB is for transferring data between applications, not for publishing or long-term storage (although the transfer could be to another application for the purpose of publishing). Thus, the recipe spec seems mostly (only?) useful for DBL.

So, I'd like to propose that recipe specs are dropped from SB, or pushed to an application-specific section of SB so other implementations of SB do not need to handle it.

P.S. I can't seem to add labels to issues, but this probably needs the Talk About This! label.

mvahowe commented 4 years ago

@FoolRunning I have no idea why you can't add tags, unless it's a permissions thing. We should talk about this!

I feel like I've spent most of the last month talking about how PT wants to do the sort of thing that is made possible by recipeSpecs. Also, without recipeSpecs, there are no variants.

jag3773 commented 4 years ago

Fixed the label issue @FoolRunning.

I don't think that the detail supported by the recipeSpecs is a requirement for all SB implementers to support. They only need to support whatever they intend to use, and we plan to have well-known recipes for common needs.

mvahowe commented 4 years ago

Right - recipeSpecs are optional and, if present, anyone not planning to run them can ignore them. Also, recipeSpecs do not (currently) exist in derived variants, which are what would typically be distributed. If you don't distribute your source, no-one can know for sure how you made the recipes in derived variants. (A Schrödinger's recipeSpec, if you like).

rdb commented 4 years ago

For the record, I also share the concerns raised by @FoolRunning. I think inventing our own programming language to implement recipe specs will have two undesirable effects:

It will be impossible for someone who's not experienced in programming with LISP to create a recipe spec.
It will be very difficult to write an implementation that supports SB recipe specs.

I don't really have a good view of all the use cases that we need to support with recipe specs, so I wouldn't really consider myself qualified to write an alternative. But if I were to try, I'd start with seeing if we can get away with a flat(-ish), declarative structure with a series of conditionals and built-in transformations.

mvahowe commented 4 years ago

I think what I'm proposing right now is pretty much "a flat(-ish), declarative structure with a series of conditionals and built-in transformations". I think that the only assignments in my examples are building arrays which could be done by mapping.

An alternative that I'd like to discuss is writing the recipeSpecs in Clojure[Script]. That's a language that exists, it has very clean semantics, and it is often used to create DSLs. ClojureScript compiles to JS, while Clojure plays nicely with Java/.NET, so there ought to be a way to run it on most systems.

On "It will be very difficult to write an implementation that supports SB recipe specs": we need to keep reminding ourselves that this is a hard problem, and that it's currently impossible to do this, i.e. there is zero portability of Scripture processing models anywhere in the world, to the best of my knowledge. So, in the worst case, we can't make anything worse.

Here's what the current PT solution to part of this problem looks like (from dblChanges.txt, which is defined on a per-entry basis). So global regexes to edit XML blind. DBL will definitely not be doing this after Q3 2021 so, if this functionality is needed, we need a better solution.

# "\\w (.*?)\|.*?\\w\*" > "\\w \1\\w*"
"<<\s*<"               >    "\u201c\u2009\u2018" # Use nested open double and single curly quotes
"<<"                   >    "\u201c"             # Use double open curly quotes
"<"                    >    "\u2018"             # Use single open curly quotes
">\s*>>"               >    "\u2019\u2009\u201d" # Use nested close double and single curly quotes
">>"                   >    "\u201d"             # Use double close curly quotes
">"                    >    "\u2019"             # Use single close curly quotes
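For readers who have not seen a changes file, here is a minimal Python sketch of how rules like these could be applied. It assumes the rules run in file order, each one over the whole text; that ordering is an assumption about the processing model, not something stated in this thread. The names `QUOTE_RULES` and `apply_rules` are hypothetical, and the commented-out `\w` rule is left out.

```python
import re

# Ordered (pattern, replacement) pairs transcribed from the quote rules above.
# Order matters: the nested forms must run before the shorter ones.
QUOTE_RULES = [
    (r"<<\s*<", "\u201c\u2009\u2018"),  # nested open double and single quotes
    (r"<<", "\u201c"),                  # double open curly quote
    (r"<", "\u2018"),                   # single open curly quote
    (r">\s*>>", "\u2019\u2009\u201d"),  # nested close single and double quotes
    (r">>", "\u201d"),                  # double close curly quote
    (r">", "\u2019"),                   # single close curly quote
]


def apply_rules(text, rules=QUOTE_RULES):
    """Apply each regex rule in order to the whole text (assumed semantics)."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


if __name__ == "__main__":
    print(apply_rules("<<Hello>>"))
    print(apply_rules("<p>"))
```

Under these assumptions `apply_rules("<<Hello>>")` gives the text in curly double quotes, while `apply_rules("<p>")` turns the angle brackets of a tag into single curly quotes. The rules have no notion of markup, which is the sense in which such edits are "blind".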

A solution for the easy cases only doesn't really buy us anything:

If our recipeSpecs can't do everything needed for DBL, DBL will need to find its own solution, fork the SB spec if necessary, and then require everyone using DBL to work with that proprietary spec.
Only yesterday I was in a conversation about how PT users currently produce lectionaries - using VBA.

If our proposal doesn't do all of what someone needs, they won't use it for anything, because one solution is almost always easier to manage than multiple case-by-case solutions.

mvahowe commented 4 years ago

Further to how this is a hard problem, W3C has a declarative pipeline spec. Here's where they give in and allow it to run arbitrary shell commands. (Last time I looked, XProc was maintained by the creator of DocBook, and he found he needed shell commands for his own pipeline!)

https://www.w3.org/TR/xproc/#c.exec

jonathanrobie commented 4 years ago

I agree with @FoolRunning. I don't think any of the Paratext use cases need this. Once we send something to an archive like the DBL, we want to make sure that it round-trips. When we get it back, it should match what we sent. It's an archive. If we send something to another application, it can process it any old way it wants, but that's not our problem; we don't have to create a RecipeSpec for them. I don't want SB implementations to have this level of complexity.

I have been a member of the XQuery/XPath Working Group and the XSLT Working Group. These things took years. So did XProc: its first Working Draft was published on 28 September 2006, and the Recommendation came out on 11 May 2010.

Exchanging data among applications should not require this level of complexity.

On Thu, Mar 19, 2020 at 6:04 AM Mark Howe notifications@github.com wrote:

Further to how this is a hard problem, W3C has a declarative pipeline spec. Here's where they give in and allow it to run arbitrary shell commands. (Last time I looked, xproc was maintained by the creator of docbook, and he found he needed shell commands for his own pipeline!)

https://www.w3.org/TR/xproc/#c.exec

mvahowe commented 4 years ago

If and when anyone produces an alternative I'll be happy to consider it. My audio example would be a good place to start. (It's much simpler than, say, "birth narratives".)

mvahowe commented 4 years ago

(Also, I don't think this is the place to declare what DBL is. If you want something like Dropbox, use Dropbox...)

jonathanrobie commented 4 years ago

Here is the stated purpose of Scripture Burrito:

Scripture Burrito is a data interchange format for Bible-centric content.

Our goal is lossless portability of Scripture-related metadata and data between translation and publication users, applications and ecosystems.

Lossless portability does not require a processing model beyond "I send you something, you receive it, there's enough standard metadata for you to understand what you got, the data is in standard formats". Lossless portability does not require transformations.

mvahowe commented 4 years ago

That's fine, but I don't see DBL supporting variants that it cannot produce itself.

jonathanrobie commented 4 years ago

Scripture Burrito is not DBL. Scripture Burrito is a data-interchange format. The various applications that use it to exchange data will process this data in very different ways.

This Working Group is not the place to specify what DBL supports or does not support. That's between DBL and its users.

mvahowe commented 4 years ago

So, to be clear, PT doesn't need dblChanges.txt to work?

mvahowe commented 4 years ago

(My earlier response was to "an archive like the DBL".) If this is the place to make that kind of statement, I'm going to start referring to "a WordPad clone like Paratext".

jonathanrobie commented 4 years ago

I have no idea what that file is, but I think the basic design principle is this: Scripture Burrito is about exchanging data. If this is data that only Paratext uses and nobody else will, it probably should be handled via extensibility and not as a core part of the specification.

Here is what we signed up for:

https://github.com/bible-technology/scripture-burrito/blob/develop/docs/introduction/overview.rst

There's nothing about a processing model or transformations in that.

mvahowe commented 4 years ago

So "0.2 is not 1.0" except when it is?

jonathanrobie commented 4 years ago

Let's leave DBL and Paratext out of this. Here's a requirement we signed up for:

Scripture Burrito is intended to allow lossless roundtripping of projects between ecosystems. This depends to some extent on references to ecosystem servers that enable reconnection with different ecosystem-specific contexts.

mvahowe commented 4 years ago

You're quoting a statement that I wrote, describing a meeting that you didn't attend, that was never intended to be a binding contract. This is getting ridiculous so I'm going to stop.

mvahowe commented 4 years ago

And, also, you've repeatedly argued that lossless roundtripping is impossible, and yet here you are...

jonathanrobie commented 4 years ago

If the overview page, scope, concepts, and use cases do not describe what we are trying to accomplish, we should clarify the problem we are trying to solve before we worry too much about the details. That should be described clearly enough so that people who were not at the original meeting can understand it. In fact, I think some of that text was written after the initial meeting because I asked for this kind of clarity.

I think we are discussing whether we should expand the scope. How do you think the scope and use cases should be modified?

XProc defines a pipeline. I assume you aren't asking us to add processing pipelines to a data-interchange format. In general, I think a data-interchange format is more likely to be helpful when it makes few assumptions about the pipelines that applications will use to process the data.

mvahowe commented 4 years ago

I'm not asking for anything. We agreed to proceed with variants as a better alternative for publications, and I think recipeSpecs are a corollary of that. But I'm happy to close my PR and remove the recipeSpec section from source, and then each vendor will decide how to fill that hole (in terms of technology and policy).

FoolRunning commented 4 years ago

"So global regexes to edit XML blind." "So, to be clear, PT doesn't need dblChanges.txt to work?"

DBLChanges.txt is run on the USFM before uploading to DBL. After it gets to DBL, Paratext has no other use for it. So I don't think the abilities of DBLChanges need to be part of SB in any way.

mvahowe commented 4 years ago

Ok, so DBL won't be storing this and that's ok?

FoolRunning commented 4 years ago

"Ok, so DBL won't be storing this and that's ok?"

Yes, that should be perfectly fine.

jonathanrobie commented 4 years ago

When I get a burrito, I don't expect the wrapper to tell me how to make it or how to eat it, I expect the wrapper to tell me what is in it.

mvahowe commented 4 years ago

If I'm going to sell food to someone else, and the person who made it won't tell me how they made it, I may well decide not to sell it for them.

mvahowe commented 4 years ago

Otherwise this happens :-) https://www.bbc.com/news/world-us-canada-45948986

jonathanrobie commented 4 years ago

The benefit of a standard is interoperability among applications. We want to keep the cost of the standard low and the benefit high.

Do we have multiple implementations that need this for interoperability? If we do, we need to demonstrate that we have multiple implementations that do interoperate, which means we need to be sure that it is fully specified and can be used for interoperability. If this is something that will be done within one application, it is not used for data-interchange among applications, and is not needed in a data-interchange specification.

Based on my current understanding, Paratext will not use RecipeSpecs. We do not need this for interoperability with any application.

mvahowe commented 4 years ago

I'm not sure who you're debating with. I'm going to suggest that we make recipeSpecs out of scope and that DBL and everyone else does their own thing (technology and policies). That makes my life way easier.

jag3773 commented 4 years ago

Our discussion today left us thinking that a more generic system that allows for this same sort of functionality would be helpful. @mvahowe is planning to work in this direction and we can discuss it again next week.

mvahowe commented 4 years ago

The consensus of today's meeting was

Inventing a new programming language from scratch is a bit daunting but
Having potential interoperability of recipeSpecs could be cool and useful
Since my markup has ended up pretty much reinventing Lisp, and since several people are interested in ClojureScript, it would make sense to start with a ClojureScript DSL and then potentially work back to a JSON representation of that ClojureScript. This means we don't need to define all the fine detail, as the answer is "See the ClojureScript spec".

Not everyone will want to use this, and not everyone needs to use this, so the main metadata should allow for

an opaque, vendor-specific name for the processor
a hyperlink to an ingredient containing the recipeSpec, with whitelisted recipeSpec types plus x- functionality
maybe a way to link to a processor in npm or elsewhere.
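To make the three items above concrete, here is a rough Python sketch of what such a metadata entry might hold. Every field name, the `RecipeSpecRef` class, and the contents of `KNOWN_RECIPE_SPEC_TYPES` are hypothetical: the thread does not define a schema or list the whitelisted types, so treat this as an illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical whitelist; the thread does not enumerate the allowed types.
KNOWN_RECIPE_SPEC_TYPES = {"clojurescript", "json"}


@dataclass
class RecipeSpecRef:
    processor_name: str                  # opaque, vendor-specific name
    ingredient_path: str                 # link to the ingredient holding the recipeSpec
    spec_type: str                       # whitelisted type, or an "x-" extension
    processor_url: Optional[str] = None  # optional link, e.g. to a package registry


def is_valid_spec_type(spec_type: str) -> bool:
    """Accept whitelisted types plus non-empty "x-" prefixed extension types."""
    if spec_type in KNOWN_RECIPE_SPEC_TYPES:
        return True
    return spec_type.startswith("x-") and len(spec_type) > 2


if __name__ == "__main__":
    ref = RecipeSpecRef("acme-recipes", "recipes/audio.cljs", "clojurescript")
    print(ref, is_valid_spec_type(ref.spec_type))
```

An implementation that does not run recipeSpecs could read such an entry, ignore the processor, and pass the ingredient through unchanged.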

I'll make a new issue to decide on the details of the main metadata for this.

jonathanrobie commented 4 years ago

This addresses my biggest concerns.

Is it possible for a given Burrito to contain both (1) a recipeSpec and (2) the instantiated variant that the recipeSpec produces? I would prefer that each publication be specified with one or the other, but not both. If we do allow both, I would like us to specify which one is authoritative so we have a single source of truth.

mvahowe commented 4 years ago

Sources have optional recipeSpecs. Derived variants have precisely one instantiated recipe.
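As a sketch of that rule, a validator could check it roughly as follows. The function name, the `kind` strings, and the list arguments are hypothetical and do not come from the SB schema. The check that derived variants carry no recipeSpecs follows the earlier remark that recipeSpecs do not (currently) exist in derived variants.

```python
def check_variant(kind, recipe_specs, recipes):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if kind == "derived":
        # Derived variants have precisely one instantiated recipe.
        if len(recipes) != 1:
            problems.append("derived variant must have exactly one instantiated recipe")
        # recipeSpecs do not (currently) exist in derived variants.
        if recipe_specs:
            problems.append("derived variant should not carry recipeSpecs")
    elif kind == "source":
        # recipeSpecs are optional in sources, so any number is accepted here.
        pass
    else:
        problems.append(f"unknown variant kind: {kind!r}")
    return problems


if __name__ == "__main__":
    print(check_variant("source", [], []))
    print(check_variant("derived", [], []))
```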

mvahowe commented 4 years ago

There's a PR offering one solution to this at #175

bible-technology / scripture-burrito

Define recipeSpecs for source #157
