mvahowe commented 4 years ago

I asked Elbert Boot of DBL to describe the system he currently provides to produce lectionaries from Paratext. This is a real, current, use case, so we should think about whether "a few booleans" is going to be able to address it. It's also a description of an actual working solution, which I think trumps any number of "As a user I want a pony" stories.

AN ADVANCED LECTIONARY GENERATOR

This tool is a cooperation between MS Excel, MS Word and Paratext.

Main source: MS Excel

An Excel sheet contains a built-in perpetual calendar with the following elements:

A variable year (2020, 2021, 2022)
All holy days and national holy days are mentioned and calculated in Excel
All fixed readings are in a separate sheet and connected to a holy day
All variable readings are in a separate sheet and can be filled in within this sheet
Scripture readings can be filled in using the Scripture reference settings used in the Paratext project. A function converts the Scripture reading into machine/Paratext-readable content
Names of the holy days as well as all day names and dates are in the local language (with the help of a code in Excel)
Fixed readings can be split (for example to handle a three-year lectionary).
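
As an illustration of the reference-conversion step in the list above, here is a minimal Python sketch. It is not the Excel function itself, which is not shown in this thread. The localized book names, the `BOOK_CODES` table and the `Reading` type are all assumptions made up for the example; a real project would take its book names and separators from its own reference settings.

```python
import re
from typing import NamedTuple

# Hypothetical mapping from localized book names to three-letter book codes.
BOOK_CODES = {"Johannes": "JHN", "Psalmen": "PSA"}


class Reading(NamedTuple):
    book: str
    chapter: int
    first_verse: int
    last_verse: int


# Accepts "<book name> <chapter>:<verse>" with an optional "-<last verse>".
_REF = re.compile(r"^\s*(.+?)\s+(\d+):(\d+)(?:-(\d+))?\s*$")


def parse_reading(text: str) -> Reading:
    """Turn a human-entered reference into a structured, machine-readable one."""
    match = _REF.match(text)
    if match is None:
        raise ValueError(f"unrecognised reference: {text!r}")
    name, chapter, first, last = match.groups()
    if name not in BOOK_CODES:
        raise ValueError(f"unknown book name: {name!r}")
    first_verse = int(first)
    last_verse = int(last) if last is not None else first_verse
    return Reading(BOOK_CODES[name], int(chapter), first_verse, last_verse)
```

For example, `parse_reading("Johannes 3:16-18")` yields a `Reading` for book `JHN`, chapter 3, verses 16 to 18.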

Generator tool I: MS Word

In MS Word a template has been created with all needed elements organized per day. This template is fed by the MS Excel sheet. All USFM markers are put into the right position. Using the "Mailings" tab in MS Word, a full year can be produced. The output is a fully prepared MS Word document of 365 or 366 pages that contains all the information for each day.
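
The mail-merge step can be pictured as filling one template per calendar day. The following Python sketch shows the idea under stated assumptions: the template text, the choice of the USFM markers `\s` and `\p`, and the `readings` dictionary keyed by (month, day) are illustrative only, since the actual Word template and its markup are not shown in this thread.

```python
from datetime import date, timedelta
from string import Template

# Illustrative per-day template; the real template's markers are not known here.
DAY_TEMPLATE = Template("\\s $heading\n\\p $reading\n")


def days_in_year(year):
    """Yield every date of the given year (365 or 366 of them)."""
    current = date(year, 1, 1)
    while current.year == year:
        yield current
        current += timedelta(days=1)


def render_year(year, readings):
    """Return one rendered page per day; readings maps (month, day) to text."""
    pages = []
    for day in days_in_year(year):
        reading = readings.get((day.month, day.day), "")
        pages.append(
            DAY_TEMPLATE.substitute(heading=day.isoformat(), reading=reading)
        )
    return pages
```

Calling `render_year(2020, {})` produces 366 pages because 2020 is a leap year, matching the "365 or 366 pages" in the description.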

Generator tool II: Lectionary module in Paratext (XXA, XXB)

In Paratext the lectionary module is switched on and the results of tool I are pasted into one of the extra Bible books. Paratext keeps all the prepared non-Scripture reading data and, with the help of the machine-readable USFM markup, collects the Scripture readings from the USFM files. The results can be checked by switching on "output" in the lectionary module. The entire lectionary is then available from Paratext.
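
To make "the Scripture readings are collected from the USFM files" concrete, here is a deliberately simplified Python sketch of pulling a verse range out of USFM text. It is not how Paratext does it. It only understands the `\c` and `\v` markers, truncates a verse at the next backslash marker (so character-level markup is lost), and ignores verse bridges such as `\v 1-2`. The `SAMPLE` text and the function name are assumptions for the example.

```python
import re

SAMPLE = (
    "\\id JHN\n"
    "\\c 3\n"
    "\\p\n"
    "\\v 16 For God so loved the world\n"
    "\\v 17 For God did not send\n"
    "\\v 18 Whoever believes\n"
)

# Matches a chapter or verse marker, its number, and the text up to the next marker.
_MARKER = re.compile(r"\\(c|v) (\d+)\s*([^\\]*)")


def extract_verses(usfm: str, chapter: int, first: int, last: int) -> list:
    """Return the text of verses first..last of the given chapter."""
    verses = []
    current_chapter = None
    for marker, number, text in _MARKER.findall(usfm):
        if marker == "c":
            current_chapter = int(number)
        elif current_chapter == chapter and first <= int(number) <= last:
            verses.append(text.strip())
    return verses
```

With the sample above, `extract_verses(SAMPLE, 3, 16, 17)` returns the text of the first two verses.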

Considerations:

Possibilities of using MSWord/MSExcel combined with Paratext are limitless.
MS Word and MS Excel use VBA; maybe we can integrate Python here
If we base the lectionary generator (tool II) on USX-3, the lectionary generator can be connected to DBL.
All Excel content can be protected and hidden, so that users can edit only the variable information.
There are different calendars for the Western Church and the Eastern Church, based on a difference in the calculation of the date of Easter. These differences can be handled in the Excel sheets (and the underlying calculations).
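
The Western/Eastern difference comes down to two different Easter computations. As a sketch of what "underlying calculations" could look like outside Excel, the Python below uses two well-known published algorithms: the anonymous Gregorian algorithm for Western Easter and Meeus's Julian algorithm for Eastern Easter. This is an assumption about technique, not a description of what the spreadsheet actually does. The fixed 13-day Julian-to-Gregorian offset is only valid for the years 1900 to 2099.

```python
from datetime import date, timedelta


def western_easter(year: int) -> date:
    """Gregorian Easter Sunday (anonymous Gregorian algorithm)."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def eastern_easter(year: int) -> date:
    """Orthodox Easter Sunday as a Gregorian date (valid 1900-2099 only)."""
    a = year % 4
    b = year % 7
    c = year % 19
    d = (19 * c + 15) % 30
    e = (2 * a + 4 * b - d + 34) % 7
    month, day = divmod(d + e + 114, 31)
    # The result so far is a Julian-calendar month/day; shift by 13 days.
    return date(year, month, day + 1) + timedelta(days=13)
```

For 2020 these give 12 April (Western) and 19 April (Eastern). Movable holy days can then be expressed as offsets from the relevant Easter date.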

jonathanrobie commented 4 years ago

Excellent!

So what is the role of Scripture Burrito in this use case? Which components exchange data using Scripture Burrito, and what do they need to have in the burritos? At first glance, it looks like the current system would create a lectionary in Paratext and the lectionary would then be packaged as a burrito, then uploaded to DBL. It sounds like you would like him to be able to do this processing directly in DBL and create a burrito that contains it on the server. Is that about right?

mvahowe commented 4 years ago

This is surely what used to be called an incremental publishing scenario, which we currently expect to implement as derived variants, which are specified using recipeSpecs. So the way we define recipeSpecs needs to be able to handle at least this level of complexity.

jonathanrobie commented 4 years ago

I don't yet understand why.

If this were produced in Paratext, I would expect to provide the instantiated variant rather than a recipeSpec. Unless I have some kind of contract with the system that consumes it, I do not know when and how the recipeSpec will be used. If the variant is created outside of Paratext, Paratext probably doesn't know anything about how it is created and would not have the recipeSpec.

But you may be thinking differently. That's why I asked the questions I did. In this Working Group, it's probably helpful to focus on the role of Scripture Burrito, how the data is exchanged, and what expectations producers and consumers can have when exchanging this data.

So what is the role of Scripture Burrito in this use case? Which components exchange data using Scripture Burrito, and what do they need to have in the burritos? At first glance, it looks like the current system would create a lectionary in Paratext and the lectionary would then be packaged as a burrito, then uploaded to DBL. It sounds like you would like him to be able to do this processing directly in DBL and create a burrito that contains it on the server. Is that about right?

mvahowe commented 4 years ago

recipeSpecs are the way to turn a source into a derived variant. In this case, the derived variant would be the lectionary.

The deal with derived variants is that they are produced directly from the source. I don't see any way for someone receiving a derived variant to know whether or not this is the case, other than to trust the person sending it.

An additional layer of constraint is that sources can be updated and that derived variants can potentially be produced from the newer revision of that source. Consumers are going to expect revision 3 of derived_x to be an actual revision of derived_x rather than something else that happens to have the same name or, say, a lectionary produced a completely different way to a completely different specification. Again, I see no tractable way to verify this from instantiated recipes.

I don't think that servers should trust clients. The proposed SB trust model is between idServers.

So my working assumption is that idServers may well not accept derived variants directly. Instead, they may accept the source plus recipeSpecs for producing the derived variant. That's a policy decision. I will say that the reference implementation I'm building supports this scenario in its config file settings.

mvahowe commented 4 years ago

idServers might also choose to accept derived entries that are owned by another idServer (because they trust that idServer). That trust would presumably need to be earned in terms of the process of the other idServer. "A user clicked on some buttons" may not be sufficient to earn that trust.
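
The policy choices described in the last two comments could be expressed as a small configuration object. The Python sketch below is purely hypothetical: the class, field names and upload kinds are invented for illustration and are not taken from the reference implementation's config file or from the Scripture Burrito specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdServerPolicy:
    # Hypothetical settings; names are not from any real config file.
    accept_instantiated_variants: bool = False
    accept_source_with_recipe_specs: bool = True
    trusted_id_servers: frozenset = frozenset()


def accepts(policy: IdServerPolicy, kind: str, owner: str = None) -> bool:
    """Decide whether an upload of the given kind is acceptable."""
    if kind == "source+recipeSpecs":
        return policy.accept_source_with_recipe_specs
    if kind == "derivedVariant":
        # Derived entries owned by a trusted idServer may still be accepted.
        return (
            policy.accept_instantiated_variants
            or owner in policy.trusted_id_servers
        )
    return False
```

A strict server would leave `accept_instantiated_variants` off and list only the idServers whose process it trusts.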

mvahowe commented 4 years ago

Alternatively, people could ignore the whole variants thing and just make new entries for everything. Again, idServers will presumably set policy about the kind of entries they are willing to accept. For example, a hypothetical idServer claiming to be a Bible Library might not accept lectionaries that were not demonstrably a side effect of an ongoing translation project.

mvahowe commented 4 years ago

The other answer is simply "Do we think users should need to resort to Excel spreadsheet macros to do Bible publication?" It probably isn't realistic to expect everyone to use a standard set of tools, but it seems strange not to at least consider what a standard set of tools could do for us.

jonathanrobie commented 4 years ago

That's one way to do it. In your scenario, the server does not have to trust the client, but I'm also not sure what the role of the client is. If you want the server to be the source of truth, why not do that work on the server and provide an environment for viewing and validating? If you trust the client enough to let the client create the publications, why not go with what the user saw and approved? If the client does not create the publications, it does not provide them in the Scripture Burrito.

For derived document types created in Paratext, it would probably be the other way around. When a user sees and approves a derived document created in our environment, we have what we need for the variant. The user has already seen and approved the document we have, and there is no particular value in trusting a server to recreate the same thing. In general, the environment that creates a derived document probably needs a way for a user to know if it is correct or not.

mvahowe commented 4 years ago

In your scenario, the server does not have to trust the client, but I'm also not sure what the role of the client is. If you want the server to be the source of truth, why not do that work on the server and provide an environment for viewing and validating?

If we were talking about DBL which, apparently, we're not, I would have said that "doing the work on the server" is precisely how I see this working. I also don't think there's any reason not to do the work necessary for a preview on the client too, assuming we have portable recipeSpecs that can deal with real scenarios like the lectionary one above.

If you trust the client enough to let the client create the publications

I don't think we should trust clients, period. Servers should not rely on client-side checks, ever. In the case of source uploads, that's fixable. In the case of variants based on a source, I don't see any tractable way for a server to prove that a derived variant is actually derived from the source. Do you see a way to do that? If not, I think expecting the server to claim that this is true is a non-starter.

The user has already seen and approved the document we have, and there is no particular value in trusting a server to recreate the same thing.

There's value if you want that server to claim on your behalf that it's the same thing. Otherwise, I think the best the server can do is say "Here's a thing". It's a bit like me handing over a sealed box to a jeweller, claiming that it contains a ring made of 24 carat gold, and wanting the jeweller to claim, for me, that the box contains a ring made of 24 carat gold.

mvahowe commented 4 years ago

If we were talking about DBL which, apparently, we're not, I'd point out that we're moving into a scenario where text is uploaded by many different clients and, in that scenario, no client gets special treatment. So Paratext (if we were talking about that, which we're not) would have to jump through the same hoops as some client lashed together by a script kiddie last week.

jonathanrobie commented 4 years ago

On Fri, Mar 20, 2020 at 3:12 PM Mark Howe notifications@github.com wrote:

If you trust the client enough to let the client create the publications

I don't think we should trust clients, period. Servers should not rely on client-side checks, ever. In the case of source uploads, that's fixable. In the case of variants based on a source, I don't see any tractable way for a server to prove that a derived variant is actually derived from the source. Do you see a way to do that? If not, I think expecting the server to claim that this is true is a non-starter.

If the server does not trust the client, why is the server willing to run code on the client's behalf? Or is the recipeSpec provided by the server? If the server does not trust the client to do the proper client-side checks, how does the server understand the semantics of what the client intended to do better than the client did? How does the recipeSpec get written?

And above all, how does any of this relate to the use case in this thread?

It might be helpful to imagine a new program, LectionaryBuilder, that imports Scripture Burritos that contain translations and exports Scripture Burritos that contain lectionaries. Because we are focusing on Scripture Burrito, we can ignore user interface questions and internal processing questions - somehow, this program needs to replicate the functionality Elbert has in his spreadsheets, bring in translation texts, show him sample lectionaries so he can make sure they are right, etc.

For Scripture Burrito, though, the real questions are about what we put in a burrito.

What is the format for the files in a lectionary?
What is the right flavor? What metadata is needed?

You seem to prefer a solution that involves recipeSpecs. I don't yet understand how that solution would work. Regardless, I think that LectionaryBuilder should be allowed to provide an instantiated instance without a recipeSpec. In fact, it might be difficult to provide code equivalent to the internal code used to create the lectionary, which might be in several languages and come from different systems, as it currently does in Elbert's system.

The user has already seen and approved the document we have, and there is no particular value in trusting a server to recreate the same thing.

There's value if you want that server to claim on your behalf that it's the same thing. Otherwise, I think the best the server can do is say "Here's a thing". It's a bit like me handing over a sealed box to a jeweller, claiming that it contains a ring made of 24 carat gold, and wanting the jeweller to claim, for me, that the box contains a ring made of 24 carat gold.

I really don't understand the analogy. The lectionary is data. You can look at it and see if it is a lectionary. It's much easier to inspect data than it is to inspect code, and you can validate that data any way you want.

How would the server validate a recipeSpec? If I provide code to create a lectionary, how does a server validate that code?

Who creates the code for a recipeSpec and how? How does the server decide if this is code it wants to run?

mvahowe commented 4 years ago

If the server does not trust the client, why is the server willing to run code on the client's behalf?

That's precisely why I think we need a sandboxed recipeSpec model. (I can't imagine a hypothetical Bible Library ever distributing, let alone running, third-party JavaScript for precisely that reason.)

If the server does not trust the client to do the proper client-side checks, how does the server understand the semantics of what the client intended to do better than the client did?

It's impossible to answer that without talking about specifics but, eg, USX/USFM are known standards and it's perfectly feasible for a server to perform best practice checks on USX/USFM it receives from other places. The specific issue in this discussion is verifying that a lectionary that claims to be a variant of entryX/revisionY is actually a variant of entryX/revisionY, and that seems hard if not impossible to verify server-side.

How does the recipeSpec get written?

The same way anything else gets written. Someone could write it manually, or some tool could generate it. I imagine that, in most cases, a tool would generate it, and it's likely that most tools would choose to use a subset of the functionality. Before we get into how we can't expect a UI to generate executable code, it seems to have worked out ok for PDF, which is based on Postscript, which is a Turing-complete language (where most printer drivers use a small subset of the full functionality of Postscript.)

And above all, how does any of this relate to the use case in this thread?

I don't know how else I can explain it. If a lectionary claims to be a variant of entryX/revisionY, we need to know that this is true. This is especially important when lectionaries and other spin-off publications are part of a translation plan. Trusting clients on this sort of thing was never a great idea but, in a future world where servers accept content from any conforming client, it's a non-starter. And, also, the above, current system demonstrates that when we don't provide the functionality people want, people just do an end-run around our crippled systems and, at that point, any concerns about, say, who is allowed to do what go out the window. (You've talked about controlling who can produce which variants. How do you plan to do that for people who extract the SFM files and do whatever they want with them via Excel?)

I think that LectionaryBuilder should be allowed to provide an instantiated instance without a recipeSpec.

Everything I've written anywhere confirms that LectionaryBuilder can do this. But the fact that LectionaryBuilder can do this does not require any idServer to accept that output from LectionaryBuilder. idServers should be allowed to set their own policies on what they accept. I've explained how I imagine a hypothetical Bible Library answering that question.

In fact, it might be difficult to provide code equivalent to the internal code used to create the lectionary, which might be in several languages and come from different systems, as it currently does in Elbert's system.

(With considerable restraint.) Right, so maybe what we need is a shared language for specifying recipeSpecs? Your argument on this is completely circular.

It's much easier to inspect data than it is to inspect code, and you can validate that data any way you want.

Again, how would you set about demonstrating that an arbitrary publication like a lectionary only uses Scripture content derived from a specific project snapshot, and how would you set about demonstrating that the lectionary as a whole represents the same processing that was used for the previous revision of that lectionary? I think it's hard if not impossible. If you disagree, you need to describe what the possible/not-hard process looks like.

How would the server validate a recipeSpec? If I provide code to create a lectionary, how does a server validate that code?

The server can't tell us whether what it produces is a lectionary, and that's ok. If users of a hypothetical Bible Library want to create recipeSpecs that claim to produce lectionaries but actually write "Happy Birthday" in Hebrew, that's absolutely fine. "lectionary", in this context, is just a user-supplied label.

If we have a tightly-defined recipeSpec language, the server can validate that language and, if it's valid, it can run it. The output may be more or less useful, but it's not going to mail /etc/passwd to the world. That's the level at which the server cares about this code.

And, in terms of output, the server can affirm that this thing called lectionary which may or may not be useful is derived from entryX/revisionY because it just derived it.
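
A toy version of the idea in the last three paragraphs might look like the Python below. It is a sketch under heavy assumptions: the two operations (`heading`, `select`), the dictionary-based source, and the `derivedFrom` record are all invented for illustration and are not part of any agreed recipeSpec language or of the Scripture Burrito schema. The point is only that a closed vocabulary can be validated before running, and that the server can record provenance because it performed the derivation itself.

```python
# Hypothetical closed vocabulary: operation name -> required field.
ALLOWED_OPS = {"heading": "text", "select": "ref"}


def validate(recipe):
    """Reject any step that is outside the closed vocabulary."""
    for step in recipe:
        op = step.get("op")
        if op not in ALLOWED_OPS:
            raise ValueError(f"operation not allowed: {op!r}")
        if ALLOWED_OPS[op] not in step:
            raise ValueError(f"{op!r} step is missing {ALLOWED_OPS[op]!r}")


def derive(source, entry_id, revision, recipe):
    """Run a validated recipe against a source and record its provenance."""
    validate(recipe)
    content = []
    for step in recipe:
        if step["op"] == "heading":
            content.append(step["text"])
        else:  # "select": copy text straight from the source
            content.append(source[step["ref"]])
    return {
        "derivedFrom": {"entry": entry_id, "revision": revision},
        "content": content,
    }


SOURCE = {"JHN 3:16": "For God so loved the world"}
RECIPE = [
    {"op": "heading", "text": "Easter Sunday"},
    {"op": "select", "ref": "JHN 3:16"},
]
result = derive(SOURCE, "entryX", "revisionY", RECIPE)
```

A step such as `{"op": "shell", ...}` is rejected by `validate` before anything runs, which is the level at which the server cares about the recipe. Whether the output is a useful lectionary is not something this check addresses.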

jonathanrobie commented 4 years ago

On Sat, Mar 21, 2020 at 8:51 AM Mark Howe notifications@github.com wrote:

If the server does not trust the client, why is the server willing to run code on the client's behalf?

That's precisely why I think we need a sandboxed recipeSpec model. (I can't imagine a hypothetical Bible Library ever distributing, let alone running, third-party JavaScript for precisely that reason.)

OK, so we would probably need to specify this model for running recipeSpecs if we go this route.

I think we can keep the spec simpler and keep the process of producing the spec simpler if we leave this out in the first version. If we had compelling use cases that clearly showed this to be a hard requirement, I would feel differently. If we want to support lectionaries, we might want to start by asking what flavor we would use and what the data would look like.

jonathanrobie commented 4 years ago

On Sat, Mar 21, 2020 at 8:51 AM Mark Howe notifications@github.com wrote:

I don't know how else I can explain it. If a lectionary claims to be a variant of entryX/revisionY, we need to know that this is true. This is especially important when lectionaries and other spin-off publications are part of a translation plan. Trusting clients on this sort of thing was never a great idea but, in a future world where servers accept content from any conforming client, it's a non-starter.

But I don't see how recipeSpec is actually helping you do any of this. In fact, you seem to say the same later in the same response.

How would the server validate a recipeSpec? If I provide code to create a lectionary, how does a server validate that code?

The server can't tell us whether what it produces is a lectionary, and that's ok. If users of a hypothetical Bible Library want to create recipeSpecs that claim to produce lectionaries but actually write "Happy Birthday" in Hebrew, that's absolutely fine. "lectionary", in this context, is just a user-supplied label.

If we have a tightly-defined recipeSpec language, the server can validate that language and, if it's valid, it can run it. The output may be more or less useful, but it's not going to mail /etc/passwd to the world. That's the level at which the server cares about this code.

But if the server cannot do better than that, how did the recipeSpec help it validate anything? If an instantiated variant is provided, the server can easily look at the data and verify that it is a lectionary. It can't really do that with the code.

mvahowe commented 4 years ago

If we had compelling use cases that clearly showed this to be a hard requirement, I would feel differently.

A compelling use case for me is that a hypothetical Bible Library has worked this way with a hypothetical desktop Bible translation tool for most of a decade.

If we want to support lectionaries, we might want to start by asking what flavor we would use and what the data would look like.

That might be interesting but it's not really the point here. I was using lectionaries as an example of something that could be produced as a derived variant of an existing textTranslation source. In the current proposal, strict semantics apply to derived variants, and it's specifically the validation of those semantics that interest me here. I don't think we can have derived variants without a way to describe how to make them, and if we can't do that we can't do what a hypothetical Bible Library might be doing now, let alone anything involving lectionaries, birth narratives or any other form of something hypothetically called incremental publishing.

jonathanrobie commented 4 years ago

On Sat, Mar 21, 2020 at 8:51 AM Mark Howe notifications@github.com wrote:

I think that LectionaryBuilder should be allowed to provide an instantiated instance without a recipeSpec.

Everything I've written anywhere confirms that LectionaryBuilder can do this.

OK.

In fact, it might be difficult to provide code equivalent to the internal code used to create the lectionary, which might be in several languages and come from different systems, as it currently does in Elbert's system.

(With considerable restraint.) Right, so maybe what we need is a shared language for specifying recipeSpecs? Your argument on this is completely circular.

I don't think my argument is circular at all. Agreeing on a shared language does not automatically produce programs.

Elbert can do this right now, producing lectionaries that we can put into DBL. He just can't create clojurescript equivalent to his macros and other code used in various systems to produce these lectionaries. Would you refuse to accept his lectionaries if he doesn't find a way to do this?

In my world, I see a wide variety of systems creating texts using various programming languages in various environments. If I create a syntactic analysis using XQuery from syntax trees in BaseX, it would be extremely difficult to produce a clojureScript equivalent, and the server might not have access to some of the data it would need regardless.

If I were to write a LectionaryBuilder from scratch, I might not know clojureScript and I might be allergic to parentheses. Odds are, I will write it in a language I use for other things.

So I am trying to understand where I would find this community of data producers who intend to create everything using clojureScript that will run in the server's environment and use recipeSpecs in this way. Can we find two different systems that need to exchange data in a way that requires recipeSpecs? If so, it would be good to have people familiar with these two systems describe their use cases and requirements.

mvahowe commented 4 years ago

In a physical library, the library generally doesn't care whether the content of the book is interesting or coherent or even if the typesetting is readable. What librarians do generally care about is knowing what books they have, and knowing how to classify what they have, and being able to say that, eg, this hardback 2nd Edition is related to the paperback 1st Edition.

If I turn up with a pile of books to donate to the local library, the library gets to choose how many of those books to accept and, more importantly, it gets to decide how those books are classified, and then the library's users expect the classification system of the library to work.

If the content of the book is trash, that's a problem for the author. If the book is misfiled, that's a problem for the library.

mvahowe commented 4 years ago

(FWIW I'm going off clojurescript for this. Well, actually, I love everything I read about clojurescript but I don't think it solves any of the hard problems here. I think rolling our own markup is a better option.)

mvahowe commented 4 years ago

Can we find two different systems that need to exchange data in a way that requires recipeSpecs? If so, it would be good to have people familiar with these two systems describe their use cases and requirements.

Not if we're not going to talk about actual organisations and products. But you and I represent two products that do this right now, albeit in an underspecified way.

jonathanrobie commented 4 years ago

On Sat, Mar 21, 2020 at 9:43 AM Mark Howe notifications@github.com wrote:

If we had compelling use cases that clearly showed this to be a hard requirement, I would feel differently.

A compelling use case for me is that a hypothetical Bible Library has worked this way with a hypothetical desktop Bible translation tool for most of a decade.

If we want to support lectionaries, we might want to start by asking what flavor we would use and what the data would look like.

That might be interesting but it's not really the point here. I was using lectionaries as an example of something that could be produced as a derived variant of an existing textTranslation source. In the current proposal, strict semantics apply to derived variants, and it's specifically the validation of those semantics that interest me here. I don't think we can have derived variants without a way to describe how to make them, and if we can't do that we can't do what a hypothetical Bible Library might be doing now, let alone anything involving lectionaries, birth narratives or any other form of something hypothetically called incremental publishing.

As you know, Paratext is working on a feature that does this, building on Bible Modules. It probably won't work the way you are thinking. It almost certainly will not generate clojureScript; it will create instantiated variants instead. It will probably rely heavily on templates rather than code, doing things as declaratively as possible. This is not yet in a state where it makes sense to discuss it in detail, but it will exist.

Here's something you said in the first post of this thread:

Considerations:

Possibilities of using MSWord/MSExcel combined with Paratext are limitless.

MSWord/MsExcel use VBA; maybe we can integrate py here

If we base the lectionary generator (tool II) on USX-3, the lectionary generator can be connected to DBL.

All Excel content can be protected and hidden. So only all variable info can be edited by users.

There are different calendars for the Western Church and the Eastern Church, based on a difference in the calculation of the Easter date. These differences can be handled in the Excel sheets (and underlying calculations).
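To make the last consideration concrete, here is a minimal sketch of how the two calendars diverge. It is not Elbert's Excel logic, which is not shown in this thread. It assumes the widely published Meeus algorithms for Gregorian and Julian Easter, and the function names `western_easter` and `eastern_easter` are made up for illustration.

```python
from datetime import date, timedelta


def western_easter(year: int) -> date:
    """Easter Sunday in the Gregorian calendar (Meeus/Jones/Butcher algorithm)."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    ell = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * ell) // 451
    month, day = divmod(h + ell - 7 * m + 114, 31)
    return date(year, month, day + 1)


def eastern_easter(year: int) -> date:
    """Orthodox Easter Sunday, expressed as a Gregorian date.

    Uses the Meeus Julian algorithm, then adds the 13-day offset between
    the Julian and Gregorian calendars. That offset is an assumption that
    only holds for years 1900 to 2099.
    """
    a = year % 4
    b = year % 7
    c = year % 19
    d = (19 * c + 15) % 30
    e = (2 * a + 4 * b - d + 34) % 7
    month, day = divmod(d + e + 114, 31)
    julian_month_day = date(year, month, day + 1)  # month/day in the Julian calendar
    return julian_month_day + timedelta(days=13)


if __name__ == "__main__":
    for year in (2020, 2021, 2022):
        print(year, western_easter(year), eastern_easter(year))
```

Every movable holy day that is defined relative to Easter can then be derived by adding a fixed number of days to one of these two dates, which is presumably the kind of calculation the Excel sheets perform.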

One of the most important characteristics of this use case is the need to be able to do things that go beyond what Paratext or any other editing program can do when creating data. For instance, liturgical calendars. As the first point says, the possibilities of using these tools together with Paratext are limitless, and Paratext cannot specify them all or turn them into clojureScript. People will always need to go beyond the capabilities of any one tool in order to create data.

As long as I can put an instantiated variant in a Scripture Burrito, this is not a problem. But I do need to know what flavor to use and what metadata to provide for a variant.

mvahowe commented 4 years ago

You can put an instantiated variant in a Scripture Burrito. Whether any idServer will believe that it is an instantiated variant of a specific entry/revision of a source will depend on idServer policy. I'm still not sure that you are using the word "variant" the way we defined a few weeks ago. It doesn't just mean "something different".

mvahowe commented 4 years ago

As the first point says, the possibilities of using these tools together with Paratext are limitless, and Paratext cannot specify them all or turn them into clojureScript.

If "limitless" means "the set of problems that can be solved algorithmically in linear time", Paratext could potentially specify them all in any Turing complete language including clojureScript. Also, if Paratext were to add this kind of functionality, it would end up specifying this in a Turing complete language (probably C#, or maybe Python).

jonathanrobie commented 4 years ago

On Sat, Mar 21, 2020 at 10:20 AM Mark Howe notifications@github.com wrote:

As the first point says, the possibilities of using these tools together with Paratext are limitless, and Paratext cannot specify them all or turn them into clojureScript.

If "limitless" means "the set of problems that can be solved algorithmically in linear time", Paratext could potentially specify them all in any Turing complete language including clojureScript. Also, if Paratext were to add this kind of functionality, it would end up specifying this in a Turing complete language (probably C#, or maybe Python).

In this case, though, Paratext would have to convert Elbert's Excel macros into some other language to do this. Or else implement the calendar logic he uses and whatever other functionality he needs that we do not yet provide. And even if we did that for Elbert, we cannot possibly do that for every other kind of document someone might want to create.

jonathanrobie commented 4 years ago

On Sat, Mar 21, 2020 at 10:03 AM Mark Howe notifications@github.com wrote:

You can put an instantiated variant in a Scripture Burrito. Whether any idServer will believe that it is an instantiated variant of a specific entry/revision of a source will depend on idServer policy. I'm still not sure that you are using the word "variant" the way we defined a few weeks ago. It doesn't just mean "something different".

Are you suggesting that this should simply be a new publication, rather than a variant? Would it have its own, new, Burrito?

mvahowe commented 4 years ago

And even if we did that for Elbert, we cannot possibly do that for every other kind of document someone might want to create.

And that's surely precisely why a language is better than a few tick box options - it can potentially handle any kind of transformation. Simple ones can be produced using some sort of GUI. More complex ones can be coded - probably by Elbert in this case. Excel doesn't enable all functionality by point and click. In some cases you need to type arcane stuff to get what you want. Plenty of non-technical users do this. Experts do not consider the existence of Excel macros to be a design flaw.

Are you suggesting that this should simply be a new publication, rather than a variant? Would it have its own, new, Burrito?

We're mixing up terms to such an extent that I can't give a simple answer to that. A source is a burrito. A variant is a burrito. The question is what kind of burritos any given system produces and is willing to accept. The answer to that question is really about ecosystems, which appear to be out of scope, which is a pity since they are going to affect pretty much everything.

mvahowe commented 4 years ago

It's like banknotes. You can stare at the piece of paper for as long as you like, but you can only answer the question "What can I do with this" by adding the concept of "bank". What the piece of paper is worth depends on who produced it and how much you trust them. Without that you just have a piece of paper.

mvahowe commented 4 years ago

So, in the case of a lectionary, the answer to "What is this?" depends on what the creator claims it is, and on the willingness of any idServer to believe those claims.

If the lectionary is part of an incremental publishing plan, I think it could just about be considered to be a variant of textTranslation (which means it would itself be a textTranslation, and would have the same entry and revision id as the source from which it is derived). This makes sense because part of the aim of incremental publishing is to get indirect feedback on the source translation. So it matters a lot that the Scripture in the lectionary is actually the Scripture in the corresponding source.

We could also have a flavor to represent a lectionary. I imagine that being more like a recipeSpec, i.e. it provides information about which texts to use on which days, in a format that could be instantiated with many different translations.
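As a rough illustration of that last idea, here is a minimal sketch of a translation-independent lectionary description being instantiated against two translations. Nothing here is Scripture Burrito syntax: the `Reading` class, the `instantiate` function, the day names, the reading assignments and the placeholder verse texts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """A verse range within one chapter, identified by a USFM-style book code."""
    book: str
    chapter: int
    first_verse: int
    last_verse: int


# Hypothetical lectionary description: which texts on which days.
# It contains references only, no Scripture text.
LECTIONARY_SPEC = {
    "Easter Sunday": [Reading("JHN", 20, 1, 2)],
    "Pentecost": [Reading("ACT", 2, 1, 2)],
}


def instantiate(spec, translation):
    """Fill a lectionary description with text from one translation.

    `translation` maps (book, chapter, verse) to the verse text. A missing
    verse raises KeyError, so an incomplete source is detected early.
    """
    result = {}
    for day, readings in spec.items():
        result[day] = [
            " ".join(
                translation[(r.book, r.chapter, verse)]
                for verse in range(r.first_verse, r.last_verse + 1)
            )
            for r in readings
        ]
    return result


def toy_translation(label):
    """Build placeholder verse texts so the example needs no real Scripture."""
    refs = [("JHN", 20, 1), ("JHN", 20, 2), ("ACT", 2, 1), ("ACT", 2, 2)]
    return {ref: f"[{label} {ref[0]} {ref[1]}:{ref[2]}]" for ref in refs}


if __name__ == "__main__":
    for label in ("translationA", "translationB"):
        print(instantiate(LECTIONARY_SPEC, toy_translation(label)))
```

The point of the sketch is that the same description yields a different instantiated lectionary for each source, and the Scripture text in each result comes directly from that source.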

jonathanrobie commented 4 years ago

Elbert has a working system. What he seems to like about it is the limitless possibilities: he can generate things himself, doing things Paratext can't, and then import the result into Paratext to create a lectionary.

One of the Scripture Burrito requirements is to enable lossless round-tripping of data. Paratext creates Elbert's lectionary as data. It will continue to do this. The Scripture Burrito Working Group has repeatedly said that any client will be allowed to do this for variants. If creating a Scripture Burrito requires radical re-implementation of what we already do, it's not terribly helpful as a data exchange format. If DBL were to decide it does not want to accept Elbert's lectionaries, that does not help either Elbert or Paratext, but it's also not a concern for the Scripture Burrito Working Group; that would be discussed elsewhere.

So far, I haven't seen a requirement for recipeSpec in this use case. Elbert's system works right now without a recipeSpec. So far, I see no advantage to using a recipeSpec, and no way to create one for this use case unless Elbert writes it. After all, the defining characteristic of this use case is that the user can create things using functionality that Paratext does not have, and Paratext cannot create code equivalent to what it does not know anything about. I have pointed this out repeatedly, but you have not yet addressed that.

So far, I still haven't seen a convincing interoperability use case for recipeSpec - two applications that intend to use recipeSpec for interoperability in a data exchange scenario. We will need that to flesh out the feature, and we will need two working implementations for validating the design. In the W3C specs and other specifications I have worked on, each feature had to be justified by two independent interoperable implementations, and we would drop features that did not meet this test. And yes, features were dropped. I propose that we do the same in the Scripture Burrito Working Group.

I'm inclined to call YAGNI on this - You Ain't Gonna Need It. Including this feature will make it harder to finish our work.

jonathanrobie commented 4 years ago

... part of the aim of incremental publishing is to get indirect feedback on the source translation.

You keep using that word. I don't think you know what it means. When the feature has been fleshed out, we can talk about it.

mvahowe commented 4 years ago

I'm now working with Elbert on implementing his ideas in Scripture Burrito. I expect to demonstrate that writing scripts for SB is going to be more flexible than VBA.

You're welcome to call YAGNI but, if you look at how PT and DBL interact right now, you'll find that WAUI. Much of the USX consumed by YouVersion is generated from different USX uploaded by PT, using a primitive recipeSpec in the publications section, plus opaque semantics connected to licenses. We have to implement at least that level of recipeSpec.

I'm happy not to talk about any of this. Creating a de facto standard without all this conversation seems more and more appealing to me :-)

jonathanrobie commented 4 years ago

Much of the USX consumed by YouVersion is generated from different USX uploaded by PT, using a primitive recipeSpec in the publications section, plus opaque semantics connected to licenses. We have to implement at least that level of recipeSpec.

We do need support for both USFM and USX representations of the same resource. I just opened an issue suggesting a different way to do that, leaning on the way content negotiation is done in HTTP.

https://github.com/bible-technology/scripture-burrito/issues/170
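For readers unfamiliar with the HTTP mechanism being referred to, here is a minimal sketch of Accept-header style negotiation between two representations of one resource. It is not the proposal in the linked issue, whose details are not given in this thread. The media type strings `text/x-usfm` and `application/x-usx+xml` are made up for illustration, and the parser handles only plain media types, `*/*` and `q` values.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, most preferred first."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # HTTP default when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_representation(header, available):
    """Return the first available media type the client accepts, or None."""
    for media_type, q in parse_accept(header):
        if q == 0:
            continue  # q=0 means "not acceptable"
        if media_type == "*/*" and available:
            return available[0]
        if media_type in available:
            return media_type
    return None


if __name__ == "__main__":
    # Hypothetical media types for the two representations.
    available = ["text/x-usfm", "application/x-usx+xml"]
    print(choose_representation("application/x-usx+xml;q=0.9, text/x-usfm", available))
```

Note that this selects between existing representations of the same content. It does not describe how one is derived from the other.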

mvahowe commented 4 years ago

This is not about USFM/USX, it's about performing algorithmic modifications to change USX into USX, which DBL does right now. (I believe it was a PT developer who designed this system.)

jonathanrobie commented 4 years ago

This is not about USFM/USX, it's about performing algorithmic modifications to change USX into USX, which DBL does right now. (I believe it was a PT developer who designed this system.)

Every system has to be able to do transformations in whatever programming language it prefers, drawing in whatever systems and data it needs. But that's not data-interchange, that's transformation.

mvahowe commented 4 years ago

If we follow your advice, DBL will stop working. Right now, PT effectively provides recipeSpecs to DBL. You want to stop doing that because we'll never need what we're already doing. This is very strange.

jonathanrobie commented 4 years ago

If we follow your advice, DBL will stop working. Right now, PT effectively provides recipeSpecs to DBL. You want to stop doing that because we'll never need what we're already doing. This is very strange.

Perhaps this would be a good use case? Can we outline what this does now, in a separate use case, and understand what requirements it implies?

I really work best by looking concretely at use cases in detail to understand the real requirements.

mvahowe commented 4 years ago

I realised, too late, that by opening this I was feeding the "use cases are a substitute for hard work" delusion. Sorry.

bible-technology / scripture-burrito

recipeSpec use case: lectionaries #168
