Closed tedrick closed 5 months ago
Thanks @tedrick for this detailed report.
To flesh out the problem further - generating a XForm with the duplicate names as-is does not pass ODK Validate. I created a copy of the provided XForm, and manually inserted the duplicate itext node as shown below.
<translation lang="Korean (ko)">
<text id="/data/q1:label">
<value>ko__Duplicate choices__ko</value>
</text>
<text id="/data/q1/1:label">
<value>패스</value>
</text>
<text id="/data/q1/0:label">
<value>실패</value>
</text>
<text id="/data/q1/-1:label">
<value>해당 없음</value>
</text>
<text id="/data/q1/-1:label">
<value>건너뛴</value>
</text>
</translation>
Using ODK Validate v1.16.0, this resulted in the following error message. Without reading too far into the underlying jr:itext
function implementation, I guess the issue is that this function expects to be able to receive / lookup one text node only when it queries a language for a given text id.
>> XForm is invalid. See above for the errors.
org.javarosa.xform.parse.XFormParseException: duplicate definition for text ID "/data/q1/-1:label" and form "null". Can only have one definition for each text form.
Problem found at nodeset: /html/head/model[@xforms-version=1.0.0]/itext/translation[@lang=Korean (ko)]/text
With element <text id="/data/q1/-1:label"><value>
There is a XLSForm setting allow_choice_duplicates
(docs), and it is used in the provided XLSForm. The existing pyxform tests around this feature in test_fields.py
only shows this working for non-itext choices. In the tests, the duplicate choice values are in the XForm body under select1/item/value
.
@lognaturel is duplicate choices with itext a use case you'd want to support? I think it's inadvisable form design, and preferable to instead either use distinct choice values and collapse them in analysis, or use one choice with a label that reflects the duplicates e.g. "Skipped or Not Applicable". I don't fully understand the cascading select use case mentioned in the docs but it seems odd to need this setting for that.
An approach where the choice names are used as-is seems like it'd need a javarosa + validate + pyxform changes to support (and maybe collect / enketo as well). An alternative could be that pyxform appends something to the itext id (e.g. a dash and then a number such that it is predictable) to make sure the ids are unique if they aren't already, and allow_choice_duplicates = yes.
@lindsay-stevens - yes, this is in the context of allow_choice_duplicates
being enabled.
We generally agree on the inadvisability of having duplicate choice values in a choice list, though as has been commented elsewhere, there are a few cases where it makes sense (#373 has a good discussion on this). If it's decided to mark as a known limitation, I believe we're fine with that - this was generated though internal testing around some issues related to choice lists and we don't have a customer use case at this point time.
is duplicate choices with itext a use case you'd want to support?
Yes but low priority. As @tedrick notes, this is intended to work for the reasons described in #373.
Should be resolved by #603 / #614 because the choice itext identifiers in that case use the choice position (enumerated choice index) for the key instead of the choice value. But a test should be added to verify that the XLSForm in the description works as expected, +/- allow_duplicates.
User report and confirmation that it's fixed with the new version: https://forum.getodk.org/t/pyxform-duplicating-choices-labels/44037
Software and hardware versions
pyxform v1.7.0, Python v3.7.4
Problem description
When a form has multiple languages, the labels are stored whithin the
<itext>
section and referred to by ID. The ID pattern uses thename
of the choice as part of the ID string. If choices have duplicate values, only one of the duplicate choice's label is stored as all duplicate choices would have the same ID.Steps to reproduce the problem
Example form: multilang_duplicatechoice.xlsx
The resulting XML only has 3 choice entries in the itext section XML:
Expected behavior
Ideally, each choice will display as written in the XLSX file.
Other information
Given that the existing ID pattern is attempting to create a unique ID per string (as 2 questions with the same string will have different IDs since the question properties are used to generate the ID), perhaps the ID could be switched to a GUID?