Closed by yanokwa 2 years ago
I think exactly those restrictions are why I generate the name as described:
I prefix with build_ so that the leading character cannot be a number.
I postfix with the timestamp because Aggregate chokes if multiple forms have the same ID, and people were having problems with that.
If you'd prefer risking the user's own input I'm happy to go back to that behaviour.
I think my reference to the form ID confused things!
If you have a form named "1 Form" and you export it, you get 1-Form-export. This XLSForm, when converted with pyxform, will fail because 1-Form-export is used as the name of an XML node, and node names cannot start with a number.
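The failure can be reproduced without pyxform. Below is a minimal sketch using only Python's standard XML parser, assuming the exported name ends up verbatim as an element name. The helper `is_parseable_element_name` is hypothetical and not part of pyxform or Build.

```python
import xml.etree.ElementTree as ET


def is_parseable_element_name(name: str) -> bool:
    """Return True if the standard parser accepts `name` as an element name.

    Hypothetical helper for illustration; it only checks that "<name/>"
    parses and that the resulting tag is exactly `name`.
    """
    try:
        return ET.fromstring(f"<{name}/>").tag == name
    except ET.ParseError:
        return False


print(is_parseable_element_name("1-Form-export"))   # False
print(is_parseable_element_name("_1-Form-export"))  # True
```

The leading digit is what the parser rejects; the same name with an underscore in front is accepted.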
Is it possible to add validation to form names?
I'm just going to prefix a _ if I see a leading number. Cool @yanokwa ?
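A rough sketch of that idea, in Python for illustration only (Build itself is not written in Python, and the function name `prefix_leading_digit` is an assumption, not existing code):

```python
import re


def prefix_leading_digit(name: str) -> str:
    """Prepend "_" when `name` starts with an ASCII digit; otherwise return it unchanged."""
    if re.match(r"[0-9]", name):
        return "_" + name
    return name


print(prefix_leading_digit("1-Form-export"))  # _1-Form-export
print(prefix_leading_digit("Form-export"))    # Form-export
```

This only covers leading digits; other characters that are invalid at the start of an XML name (a hyphen or a period, for example) would need the same treatment.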
Hmm. Why not use validation on the names?
@lognaturel had an alternate idea. Given that the form is valid, maybe we should file this as an issue against pyxform and it should handle numbers more gracefully.
I'm okay with either! This is a pretty simple change on this side, though. Just catch XLSForm files on their way out the door and subst the attachment meta name.
I'm happy with leading numbers being prefixed with _.
I've filed this upstream: https://github.com/XLSForm/pyxform/issues/130. I propose we hold off on a fix until the pyxform team has a chance to respond.
This is still a trivial one-line change on our side. I'm happy to do it or to do nothing; please advise! :)
I say do nothing for now. I'd rather fix this upstream.
Okay, removing from the milestone.
Just to document current behaviour, a form "Test040" will be exported by build2xlsform v1.6 as "Test040-export".
Leading with the (sanitised) form title seems OK now.
I could even live (better) without the "-export" postfix because 1. the file extension clarifies the form standard and 2. including the exported form in R packages throws an error and I have to manually remove the -export from each file. The latter is an edge case that probably not many users suffer from.
Update: PR #266 changes this to "Test040.xlsx". Update 2: PR #271 could include #267, which might resolve this issue.
@yanokwa is this issue addressed?
To document current behaviour, a form named "290test4" exports to "290test4.xlsx" (no errors) with an internal form title and form id of "290test4". Uploading that to ODK Central (1.3) throws no errors. Converting the XLSForm to an XForm (XML) also seems to work; no errors are raised:
<?xml version="1.0"?>
<h:html xmlns="http://www.w3.org/2002/xforms" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:h="http://www.w3.org/1999/xhtml" xmlns:jr="http://openrosa.org/javarosa" xmlns:odk="http://www.opendatakit.org/xforms" xmlns:orx="http://openrosa.org/xforms" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<h:head>
<h:title>290test4</h:title>
<model odk:xforms-version="1.0.0">
[...]
</itext>
<instance>
<data id="290test4" version="1643329150">
...
Does that mean that form titles beginning with numbers are now handled well on the XLSForm side, and this issue can be closed?
Ah yes indeed! The file name used to be used as the node name for the child of the instance. Now it's always data.
When pyxform converts XLS to XML, it uses the XLS file name as the root node name. Because XML restricts which characters a name may contain (names must begin with a letter, colon, or underscore; subsequent characters may also include digits, hyphens, and periods), we have to be careful about the name of the export.
One way to make this name compliant is to use the form ID in the export, but drop "build_" and the timestamp to make it more human-friendly. So for example, if your form id is build_Favorite-Color_1480937828, the exported file name should be Favorite-Color. (edit by @issa-tseng: this section is confusing as it refers to things Build does not actually do.)
With this strategy, it might still be possible to have an invalid root node name, so the export should also handle names that start with invalid characters.
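The proposal could look roughly like the sketch below. It assumes form IDs have the shape build_<title>_<timestamp> seen in the example, and it simplifies the XML name-start rule to ASCII letters and underscore (the colon is left out on purpose because namespace-aware parsers treat it as a prefix separator). The function name `export_name_from_form_id` and the pattern are illustrative, not anything Build or pyxform provides.

```python
import re

# Assumed form ID shape, taken from the example: build_<title>_<digits>
FORM_ID_PATTERN = re.compile(r"^build_(?P<title>.+)_(?P<timestamp>\d+)$")


def export_name_from_form_id(form_id: str) -> str:
    """Derive a human-friendly export name that is safe as an XML root node name."""
    match = FORM_ID_PATTERN.match(form_id)
    # Fall back to the raw ID when it does not follow the assumed shape.
    name = match.group("title") if match else form_id
    # Simplified name-start check: ASCII letter or underscore only.
    if not re.match(r"[A-Za-z_]", name):
        name = "_" + name
    return name


print(export_name_from_form_id("build_Favorite-Color_1480937828"))  # Favorite-Color
print(export_name_from_form_id("build_1-Form_1480937828"))          # _1-Form
```

Dropping the timestamp gives up the uniqueness it was added for, so two forms with the same title would export under the same name.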