getodk / web-forms

ODK Web Forms enables form filling and submission editing of ODK forms in a web browser. It's coming soon! ✨
https://getodk.org
Apache License 2.0
10 stars 9 forks source link

External secondary instances: I/O support for `jr:` URLs #201

Open eyelidlessness opened 2 months ago

eyelidlessness commented 2 months ago

This design issue is part of broader support for external secondary instances:

The intent in this issue is to decide on a direction for how we want to handle retrieval of form attachments generally, with specific goal of supporting external secondary instance functionality.

These are the pertinent aspects of the ODK XForms spec.

The issue will focus on the engine/client interface to support the retrieval of jr: URLs. As I tend to do for engine/client interface design, I will present a few options which we can choose from or iterate on. Implementation in the engine will be derived from there.

Note: it is expected that the design we choose here will also lay groundwork for supporting other jr: URL use cases—i.e. media attachments—so I've tried to be mindful of that (as we should in discussion).

Note: this issue does not currently address the jr://instance/last-saved virtual endpoint, but I believe nothing in any of the proposed options would block or impede that functionality, when we're ready to address it.

Option 0: client/host application handles resource resolution

This is a true null option: in the narrowest sense, we could claim this work is already done with the provision of a fetchResource configuration option. This option is technically sufficient to satisfy the engine's spec responsibilities.

How this would work:

At least until we support offline mode (or any other functionality that would imply runtime-level caching/persistence), this would leave resource resolution entirely to clients.

This is sort of the opposite of a "pit of success" option, with its primary appeal being limited engine-side work for this aspect of the targeted feature.

Beyond obvious non-"pit of success" drawbacks, I'll specifically note that it's the most likely option to result in disparities and drift between clients. It's also likely to promote disparities/drift between different functionality which intersect with it.[^1]

[^1]: This has been a particular pain point in Enketo. Support for jr: URLs is spread across three packages, and difficult to iterate on even after moving the projects to a monorepo.

Option 0.1: engine does not handle this aspect at all

Another null option variant.

This would effectively mean that clients must resolve jr: URLs before initializing a form. They'd probably supply the resources as data: or blob: URLs, substituted directly in the form definition provided by clients to the engine.

This option does not appeal to me, but I think it's worth mentioning so we can make a thoroughly informed decision.

Option 0.5: engine provides resolution handler(s) for common cases

An extension of option 0, this is similar in spirit to the submission API proposal (#188), and some of the discussion ongoing there. The idea would be that we recognize one or more typical resource mapping schemes, and expose default fetchResource implementations to address those (likely as some kind of factory function so clients can parameterize them for per-instance usage appropriately).

I would imagine starting with handlers for:

Option 1: engine provides one or more explicit mechanisms for form attachment resolution, tailored to feature-specific use cases

Instead of the engine calling a generalized fetchResource option with a jr: URL, the engine would instead accept a configuration mapping between specific jr: URLs to one of:

The mapping itself could be any of:

sadiqkhoja commented 2 months ago

I was thinking if host application could just provide secondary attachments along with Form XML so engine doesn't have make network calls and deal with all the network related errors. I am saying this with the assumptions:

This is closer to the Option 1 presented above, except required attachments are resolved at the host application levle before anything else happens.

eyelidlessness commented 2 months ago

I want to make sure to sum up a couple key conclusions from our discussion yesterday:

Design choice: Option 1, possibly supplemented by Option 0.5

We decided to go with Option 1. As a stretch goal, we may also include aspects of Option 0.5 as an additive aid to clients/host applications.

We refined the proposed Option 1 interface. Putting aside naming (included here so the type will be valid syntax/highlighted as such), this is the interface we anticipate clients/host applications to provide for all form attachments:

type FormAttachmentMapping = Record<`jr:${string}`, () => Promise<Response>>;

Open for bikesheddy discussion: is there any openness to making this a Map rather than a Record? @sadiqkhoja your call. I would generally like to move away from using plain objects for bag-of-stuff collections where the keys aren't at least partially known/fixed at design time. But I can also understand this may be less convenient at the package boundary.

As for the value side of the mapping, providing a thunk per resource:

We decided to represent each resource as a Response because it has semantics appropriate for likely usage scenarios, and suitable for error-reporting designs in #202 (without placing undue burden on clients to conform to a stricter Result-like representation as we've chosen there).

On engine invocation of form requests broadly (i.e. including media)

While this design is primarily focused on support for external secondary instances, it has obvious implications for other form attachments. We discussed this, which was also raised in the above comment. We determined it makes sense for the engine to invoke all such requests, largely for reasons discussed in the last section. Notably, we expect that the engine will perform media requests earlier in a form session (either as part of form load, or perhaps immediately following resolution of the initial form state).

Addendum to yesterday's discussion of this point

As an additional point not covered yesterday, but which I think helps to bolster this decision: insofar as the engine expects clients/host applications to provide resource data, there's another very good reason for the engine to invoke requests, and in particular to for the engine to get those requests back as a Response. Namely, streaming. If we expected:

And the engine does need to access at least some resource data for media attachments, as they may be associated with node values.


In the future, we might consider expanding this interface to provide multiple representations, such as something like Record<'jr:*', { request: () => Promise<Response>, url?: string }>. But I think it's best to stick to the simpler interface for now until we have some time to integrate it and understand from experience where there are real gaps.

brontolosone commented 2 weeks ago
  1. About

    [...] represent each resource as a Response [...]

Good, that leaves room for the implementer to supply the data any which way.

However, gor the engine it's less convenient as the annoying thing about responses is that you can read them (eg call .blob() or .text() on them) only once, so the engine will have to do some work - for instance in the case of secondary instances, read once and deserialize the xml while for the case of media URLs - pictures etc - it'll probably want to use createObjectURL() so that it'll have a reusable URL to use in eg <img src="the-object-url"> or what have you. Or, it could call the function-value in the below FormAttachmentMapping again under the assumption that the instantiator would then give it a freshly constructed, not-yet-read response. But then ideally that expectation would be reflected in the FormAttachmentMapping type. (and even if the type enforces it, people writing their own instantiator (which I hope people can do, I think it should be modular that way) directly against some JS wrapper might miss that detail, reuse responses, and won't understand that mistake months down the road when a survey needs to show image twice — so that's a bit fragile). Probably the most prudent and robust thing to do is to write the engine with the assumption that one can only call the functions of FormAttachmentMapping once. Of course it's free to .clone() the responses or do whatever needs doing.

Perhaps the most succinct typification of the expectations of the engine is... a blob after all? Initiator supplies blobs? Plain and simple?

  1. About

    type FormAttachmentMapping = Record<jr:${string}, () => Promise>;

Perhaps a name like FormAttachmentRetrievalMapping would be more descriptive, to pre-empt the question of "mapping to what? for what?".

But to the point. This leaves it up in the air how the instantiator would know what jr: URIs are actually used in the form. An instantiator could run a bit of xpath on the form xml, but it shouldn't have to, that kind of knowledge belongs inside the webforms engine. It could also get it as metadata from the place it gets the survey definition itself from — say Central. But that's just kicking the can down the road, because how does Central know? Run a bit of xpath? At some point we'd like this engine to be able to authoritatively answer "what attachments is this survey referencing?", and hopefully at that point we might use it as such in Central to construct the nudge for uploading form attachments. But at that point an instantiator should be able to get the same answers from the engine.

So I think a two-stage mechanism can be useful. One, the engine can load a survey definition and return a list of jr: URIs referenced. Two, the instantiator does its thing to construct the FormAttachmentMapping. Three, the instantiator instantiates (obviously), with that mapping. Furthermore I think the I think error handling around those responses should be very thin inside the engine itself — simply fail loudly — as it's the instantiator's responsibility to handle the case where it cannot meet the precondition. Lazyloading things is thus the instantiator's risk!

eyelidlessness commented 1 week ago

Thanks @brontolosone, lots of good stuff to dig into here!

Good, that leaves room for the implementer to supply the data any which way.

Yep! That is exactly the intent. We just want to set a baseline for read semantics, without being proscriptive at all about how the thing we read gets into that state. And the read semantics of Response work really well for that:

However, gor the engine it's less convenient as the annoying thing about responses is that you can read them (eg call .blob() or .text() on them) only once, so the engine will have to do some work - for instance in the case of secondary instances, read once and deserialize the xml while for the case of media URLs - pictures etc - it'll probably want to use createObjectURL() so that it'll have a reusable URL to use in eg <img src="the-object-url"> or what have you.

This is more or less what I anticipate the engine doing. At least for a first pass, there will be a little bit more nuance in terms of timing (i.e. we may block initializing state for external instances, and kick off other attachment requests immediately after). There are some other nuances to think about, around the nature of certain network calls (i.e. streaming). But otherwise this pretty much captures the behavior we intend to implement.

Or, it could call the function-value in the below FormAttachmentMapping again under the assumption that the instantiator would then give it a freshly constructed, not-yet-read response.

I appreciate you raising this concern! I think we should make it explicit that we WILL NOT do that. As I mentioned in part of our call a bit earlier, that guarantee is one of the specific reasons we chose Response for this: its semantics preclude repeatedly resolving the same body (or abstraction thereof). As you put it:

But then ideally that expectation would be reflected in the FormAttachmentMapping type.

The intent of not reflecting that expectation is to help communicate that we don't have that expectation.

Perhaps the most succinct typification of the expectations of the engine is... a blob after all? Initiator supplies blobs? Plain and simple?

Also discussed on our call a bit ago, part of the intent of taking a Response is in support of #202. Since Response has clear semantics for communicating error conditions, it's well positioned as a mechanism for us to collect those error conditions along with any others that occur during form init, and provide a uniform mechanism for conveying them back to the client.

Capturing another, related part of our discussion here for posterity...

The intent of the design is also to support clients which may want greater (or even total) control over this set of error conditions (as in, greater than the easy default we're setting up: simple functions wrapping fetch calls). Because Response and its component parts can be constructed out of basically anything, clients can:

Perhaps a name like FormAttachmentRetrievalMapping would be more descriptive, to pre-empt the question of "mapping to what? for what?".

I'm fine with that name if you like it better. Again, the name as proposed is mostly there so there's a concept we can reference unambiguously. I do think in general the "mapping to what" is answered by the type itself, and the "for what" is mostly answered by requisite familiarity with underlying details of the spec. But I am always in favor of naming clarity as a way to help reinforce assumptions!


But to the point. This leaves it up in the air how the instantiator would know what jr: URIs are actually used in the form. An instantiator could run a bit of xpath on the form xml, but it shouldn't have to, that kind of knowledge belongs inside the webforms engine. It could also get it as metadata from the place it gets the survey definition itself from — say Central. But that's just kicking the can down the road, because how does Central know? Run a bit of xpath? At some point we'd like this engine to be able to authoritatively answer "what attachments is this survey referencing?", and hopefully at that point we might use it as such in Central to construct the nudge for uploading form attachments. But at that point an instantiator should be able to get the same answers from the engine.

So I think a two-stage mechanism can be useful. One, the engine can load a survey definition and return a list of jr: URIs referenced. Two, the instantiator does its thing to construct the FormAttachmentMapping. Three, the instantiator instantiates (obviously), with that mapping.

Again capturing content from our earlier call:

However! As I think more on this, I think we probably should relax the type of the key. At least from memory, I'm pretty sure the already-produced information we expect clients to have will be a mapping between filename and I/O-resolvable-resource. I think it's reasonable to accept:

type FormAttachmentRetrievalMapping = Record<`${string}.${string}`, () => Promise<Response>>;
// ... or, preferable IMO:            Map<...>
// ... if base.ext assumes too much:  WhateverBagOfKVs<string, ...>

Furthermore I think the I think error handling around those responses should be very thin inside the engine itself — simply fail loudly — as it's the instantiator's responsibility to handle the case where it cannot meet the precondition. Lazyloading things is thus the instantiator's risk!

It's actually a bit more complicated than this. It's discussed a bit in #202, and we covered it a bit on our call, but I think it's really important to highlight here too: we are anticipating that at least for some usage scenarios, I/O failures are not inherently terminal. This is one of the biggest reasons the engine has an I/O responsibility (or this abstraction over it) at all:

This does actually raise the question of whether we're sufficiently serving the second point! It is certainly possible, with the proposed interface, for us to implement that interface in the engine so a client can totally block at least the stateful (post-parse) portion of form init. (It's possible because we already accept a similarly opaque I/O abstraction for the form definition itself.)

But it certainly wouldn't be very ergonomic to do so. I don't think Response is an ideal way to convey "this operation failed, and all other related operations should fail along with it."

At this point, I think that portion of discussion is probably better had in #202. But briefly, my instinct is that we don't yet know enough about partial/total failure from a client perspective to merit thinking too hard on it yet.

eyelidlessness commented 1 week ago

However! As I think more on this, I think we probably should relax the type of the key. At least from memory, I'm pretty sure the already-produced information we expect clients to have will be a mapping between filename and I/O-resolvable-resource. [...]

As soon as I posted this, I realized the likely mistake in my reasoning. While it's common to map base.ext <-> jr://whatever/path/to/base.ext, it isn't a requirement. I do think we should consider accepting filenames as the common use case, but I'm not sure whether that's sufficient. But I'm betting @lognaturel will have thoughts on that!

lognaturel commented 1 week ago

While it's common to map base.ext <-> jr://whatever/path/to/base.ext, it isn't a requirement.

I've read through the last couple of posts but I'm not able to follow all the details. Can you give me a sense of what other kinds of mappings might exist?

eyelidlessness commented 1 week ago

So let's say we have a form definition which references these form attachments:

Taking the OpenRosa Manifest Document example as one of our typical cases, it would be sufficient to treat the manifest's filename as a key. We'd need to compare against the URLs with "jr://" and the first path segment removed.

But here already, base.ext isn't sufficient, as there are potentially subpaths. I'm glad I looked, that's already clarifying!


If we have another form definition referencing:

The ODK Central REST API docs' section on listing form attachments also suggests a key of name would be sufficient.

But wait. The Central API example has "type": "image"! 😅[^1]

[^1]: This was just a lucky bit of silliness that happened to illustrate part of the point. I'd volunteer to make the example make a little bit more sense... but I'm not sure where the source for these docs lives!

Suppose we have a form definition which references:

Now it seems likely we'll want to allow data type as part of the key. But we probably can't require it, because the OpenRosa manifest doesn't provide it.


That covers our two most common use cases, but we can't assume either. The ODK XForms spec doesn't really say how these URLs map to anything, at all really. And there's ambiguity in both of the common use cases' examples. As I understand at least the ODK XForms side of things (and this is how it's handled in Enketo), a jr: URL can be mapped to literally any URL at all, without any common substring key between them. I think that's why I originally landed on treating the full jr: URL as a key.

Also, based on experience on the Central frontend, I do know there's some accommodation for uploading form attachments that don't match the filename as defined in the form. I don't know offhand how that mismatch is represented at the API level. But if it doesn't match the jr: URL suffix we'll definitely want to account for it.


I think the options are:

lognaturel commented 1 week ago

Filename (base.ext) key: handles the obvious cases

This feels like a good place to start to me. It should be possible to expand to one of the others from there if ever we did find it was needed, does that sound right? We could document the subset of the spec that is supported.

I'm not aware of any system that allows expressing a subpath on the client side. Collect downloads all media declared in the manifest to a single directory. The source could come from some URL including a subpath but there's no way to get it to save in a subdirectory on the client.

jr://image/, jr://audio/, etc all map to that same directory in Collect. I believe they were differentiated in the J2ME days so you could do things like specify that all images should be stored on auxiliary storage.

XLSForm has different columns for specifying audio, video, images and different constructs for specifying data files. It uses that usage context to generate the jr://image/, jr://audio, etc prefixes.

Central has no notion of subdirectories for media either.

do know there's some accommodation for uploading form attachments that don't match the filename as defined in the form

Central ignores the "actual" filename in that case and saves the file content in a slot with the filename referenced in the form.

The Central API example has "type": "image"

Whoops! PR at https://github.com/getodk/central-backend/pull/1247 This would be possible with XLSForm -- a user could put a .mp3 file in the image column and what is now in the API docs would be the outcome in Central. Collect would happily consume it too and would try to render it as an image. It might crash or hopefully show an error message if the file is truly an mp3. I think it would succeed if the file were a jpg with an mp3 extension, for example.

brontolosone commented 1 week ago

My understanding:

I like to think of the "full jr: URL" as the de-jure identity of an attachment. So not really an URL, URI or URN. Just an opaque ID. So let's take that as an axiom. It's how attachments are referenced in the form XML, so that's my gospel, my source of truth :pray: Again: axiom. It may be wrong! Perhaps we have a need for it being an URL/URI/URN but I haven't seen a comprehensive articulation of that (or I haven't looked hard enough), I stand to be enlightened ;-)

However. Outside of the form XML treatment of the "attachment identity" is a different story. For instance Central has a form_attachments table with columns (formId | blobId | name | type | formDefId | updatedAt | datasetId) and stuffs the "file name part" in the name column; essentially deriving a new de-facto identity from the de-jure identity. ODKCollect at one point in time audaciously stored the attachments with the derived filename in the Android filesystem. And then the form definition can have <value form="image">jr://images/b.jpg</value> where the images part in the URI looks redundant indeed. All of that muddles the question of what the identity of an attachment is, which is why we end up discussing it right here more than a decade later still :laughing:

So to take an extremist standpoint: if all components (Central, Webforms, Collect) have historically done, and will keep on doing, interpretations of that thing (variations of "oh we feel like it's actually just a filename and we're going to discard parts of the identity") rather than keeping the ID the ID, then why do we bother having these jr: URIs at all? Just use the "filename" everywhere then; it's the de facto ID! (except now it's no longer a filename, it's the ID :clown_face:

The orthodox (also somewhat extremist) standpoint would be: the de-facto usage is heretic, the URI is an opaque identifier, it doesn't contain any filename, it doesn't contain an instruction on where to store it if you happen to be a J2ME client or even if it seems to then that's a historical error because the pious shouldn't store instructions in an identifier, so we should practice restraint on our instincts to treat it as anything else but an opaque identifier. The one point at which we should allow ourselves to succumb to "filename urges" is in Central-like scenarios when we need to ask the user to supply the attachments that the form is missing; and the user reasons in files, browsers allow picking and uploading files, and probably the user was thinking in files when they made their .xlsx to start with. There, in the filename-y kingdom of sin, we would allow ourselves to talk about "filenames" — but once those uploads are done we wash our mouths with soap and will thenceforth only speak of attachment identifiers.

I think I'm more on the Orthodox than on the Reformist side here, because I want there to be a clear and atomic identifier somewhere. Also, if we can get away with treating the jr: thing as an opaque identifier rather than worrying about the semantics of inner structure, we should! I think we can. There are no current use cases for inner structure, are there? And per that dogma, we should thus go with:

  • Full jr: URL: handles the ambiguity of the ODK XForms spec

And then if a consumer of that API (could be us, will be us, but outside of the sanctuary of the webforms xforms engine) wants to do "interpretation" of those URIs, they may do that in the privacy of their own homes to match immaculate attachment IDs to earthly filenames.

lognaturel commented 1 week ago

they may do that in the privacy of their own homes to match immaculate attachment IDs to earthly filenames.

And this could potentially include the web forms Vue frontend? Host applications likely don't know the "attachment identifiers"/jr:// URIS so requiring those as keys feels like it could be a barrier to usage. Does that sound right?

brontolosone commented 1 week ago

they may do that in the privacy of their own homes to match immaculate attachment IDs to earthly filenames.

And this could potentially include the web forms Vue frontend?

I think we could let our Vue frontend be the place to perform that match, yes. It should reference this issue in a code comment ;-) I suspect that full reactionary-orthodoxy would say that actually, Central should have been returning jr:// identifiers at the attachment listing endpoint all along, and that would have allowed us to stay in ID-land and then we wouldn't have to do any matching.

Host applications likely don't know the "attachment identifiers"/jr:// URIS so requiring those as keys feels like it could be a barrier to usage. Does that sound right?

Just FTR, the implementer's task sequence I see roughly as follows:

  1. ask the webforms engine what attachments it sees referenced in the form xml
  2. it returns {"jr://file/blabla.png", …}
  3. I learn from Central what attachments are available for this form, it replies with earthly filenames: ["blabla.png", …]
  4. As a lazy developer, potentially unfamiliar with ODKXForm internals, I go "uhhhhh... Well OK, let me regex the 'filenames' out from those funny URL-y 'ids', piece of cake". Or I go "uhhhh... ok let me prepend "jr://file/" to all of those filenames I got from Central", also works.
  5. I supply the {"file-id": () => Promise<Response>} (or {"file-id": Blob}, as I think I'd still prefer) mapping to the webforms engine.

With that in mind, back to:

Host applications likely don't know the "attachment identifiers"/jr:// URIS so requiring those as keys feels like it could be a barrier to usage. Does that sound right?

It depends! Perhaps in the host application's further ecosystem (say they have their own "Central"-like backend) the original sin of "Let's go off on tangents by extracting "filenames" and other meaning from those identifiers" was never committed, and in that case they won't have to do any mapping. Such a host application will be able to talk to their backends to express "give me the blob for jr://file/asdasdasd.asda" and no one would need to be thinking or talking about "filenames".

For those that have sinful backends however, such as we do, we could make a utility shim or library that contains a function that does the mappings. We'd use it ourselves in our Vue frontend, and other host application developers can steal/use it (via a simple git submodule if need be, it doesn't need to be a full blown npm package) if they have the same sinful legacy as we do. That keeps the haram fiddly stuff contained there, rather than inside the webforms code (which would condone and enshrine it, and as such would proliferate the animist myth that (odk)xforms says something about "filenames").

So that'd be the "reactionary orthodox" view on the matter. It's not unreasonable I think?

There's still the "revolutionary reformist" view, which asks "Well… if we live and breathe filenames, then why are we going through those pagan seances of converting them back and forth into/from those jr:// URLs, can't we get rid off that useless liturgy?". But following up on that would break everyone's parsers and cause a schism, so I think the orthodox view is the more pragmatic and practical.