XLSForm / pyxform

A Python package to create XForms for ODK Collect.
BSD 2-Clause "Simplified" License
77 stars 134 forks source link

603: (secondary instances) fix or_other + translations + group or repeat #650

Closed lindsay-stevens closed 10 months ago

lindsay-stevens commented 10 months ago

Closes #649 Resolves issue mentioned here.

In _create_section_from_dict, the context is the nearest "section", which can be a survey, group, or repeat. When it's not the survey, the survey-level choices dict isn't directly available. So this is now tracked by the builder as a class-level object.

Why is this the best possible solution? Were any other approaches considered?

It fixes the error, while avoiding changing the API for SurveyElementBuilder, which is exposed at the top level __init__.py. An alternative to using a class-level variable would be to pass the survey dict around, but that would change the API.

What are the regression risks?

Maybe if library users use a SurveyElementBuilder instance in a weird way (like re-using an instance for multiple surveys, or directly calling private/underscored methods), it might produce odd results. But the main entry point for using this class is create_survey_element_from_dict, and the survey's choices dict is copied/written back in that function.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No, but I think fully deprecating or_other should happen at some point, since:

Similarly, this fix required code specific to "loops" which are a type of group, and it appears to be an undocumented feature - if it is deprecated then all code for it should be removed.

Before submitting this PR, please make sure you have:

lognaturel commented 10 months ago

Ah, very interesting.

it has significant drawbacks when used with translations (you get English "Other" or nothing "-")

~You don't always get -?~ Confirmed it's always in English.

Did you consider throwing an error in this case and saying or_other is not supported for multi-language forms? That could link to https://xlsform.org/en/#relevant where there's an example of using relevance to get the same behavior. That might be the more helpful option for users but let me know if you think I'm missing a reason that user would want to do this.

this fix required code specific to "loops" which are a type of group, and it appears to be an undocumented feature

Yikes! I'd always assumed that loop was an alias for repeat so I never looked into it. No, it's its own thing which injects a group for each member of a static list. Clever. I don't think it's every been documented and I can't find any evidence of its use. It definitely solves a real problem. Without it, you have to either build all the groups manually or use a repeat which can complicate analysis. This lead me to also notice include which we had been talking about. Turns out it already exists! So now we have two undocumented features to consider for deprecation or documentation.

lindsay-stevens commented 10 months ago

@lognaturel The default language gets Other, and any other language gets - for choices, and a non-itext Specify other. for the input label. That should reliably be the case, since there's XPath assertions for this behaviour in the or_other tests in test_translations.TestTranslationsChoices.

This PR is a bug fix because it's restoring previous functionality that was broken during the secondary instances PR - it's not an intentional deprecation. So throwing an error didn't seem right since a deprecation should have some kind of notice ahead of release. Similarly, adding a warning is sort of a new feature.

If you prefer to keep or_other for a while (say, the next 12 months at least), then adding a warning would probably be helpful. In that case, could you please let me know a preferred wording, and whether it should be per-row or once per form? If possible, I think once per form would be less spammy e.g.

This form uses or_other with translations, which is not recommended. It will generate in an untranslated input question label and choice label for "other". Learn more: https://xlsform.org/en/#specify-other

lognaturel commented 10 months ago

The default language gets Other, and any other language gets - for choices

I checked the user form on Enketo and got “Other” for every language, I’m pretty sure. I guess it doesn’t matter, it’s bad either way.

a bug fix because it's restoring previous functionality that was broken

True. My thinking is that it’s such a bad result that we really should have been erroring out before. That said, I’m ok with a once-per-form warning with your suggested text.

lindsay-stevens commented 10 months ago

@lognaturel I added a once-per-form warning when using or_other and translations together. The or_other usage is detected when at least one question uses a it. Translations are detected in the same way as the existing missing translations check (column names from survey or choices), and the or_other check won't fire if the only "language" is the internal default language.

To avoid additional processing overhead, the missing translations check code was refactored so that the translation detection result could be re-used for the or_other check (and other checks in future).

lognaturel commented 10 months ago

Thank you! I noticed #653 when reviewing but it's not directly related to this PR so we can discuss separately.