Open gushil opened 4 weeks ago
What is the actual bug (how is the filename with # treated?), and what causes it (in Enketo's code)?
The issue occurs when editing a form.
Upload the file (the submission succeeds).
Close the form and load it again.
The problem is that a file with a # character in its name is not saved into instanceAttachments correctly.
We can see here that the part after # is cut off.
Where in Enketo is it cut off?
instanceAttachments keys should be the full file name, including the postfix. Because the key is cut off, the lookup fails when the form is loaded, so we cannot get the right file from instanceAttachments.
Thanks.
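The lookup failure described above can be sketched as follows. This is only an illustration: the object below is a hypothetical stand-in for instanceAttachments, and the stored value is made up; Enketo's real data structure may differ.

```javascript
// Hypothetical stand-in for the instanceAttachments map (assumption, not Enketo code).
const instanceAttachments = {
    // Key as it ends up stored once the part after "#" has been cut off.
    ECG: '/media/get/1/a/ECG',
};

// File name as recorded in the submitted instance.
const fileName = 'ECG#1.png';

// Looking up by the full file name misses the truncated key.
const resolved = instanceAttachments[fileName];

console.log(resolved); // undefined
```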
Because it is cut off, when the form loaded
So data-loaded-file-name is set incorrectly somewhere? Where? Is it here or somewhere else? Why is it cut off? Is value not correct? If so, why?
Hi @MartijnR
I was finally able to attach the file through curl, and found that the problem is caused by the enketo-transformer escapeURLPath method, which is called by the enketo-express media createMediaURL and escapeFileName methods in the getMediaMap method.
Should we modify escapeURLPath, or is it better to modify createMediaURL and escapeFileName so that they do not use escapeURLPath?
Thanks.
Great find! That made me suspect this issue may also be present in enketo/enketo, and I quickly confirmed that using kobotoolbox.org online. I filed the issue here: https://github.com/enketo/enketo/issues/1324
It would be great if you make your PR there and then we merge it here (or if we cannot wait for approval create PRs for both repos).
Should we modify escapeURLPath or is it better to modify either createMediaURL and escapeFileName to not use escapeURLPath
I'm not sure. What is the escapeURLPath method turning the URL with # into?
@MartijnR
With mediaPath = /media/get/1/a/ECG#1.png, transformer.escapeURLPath(mediaPath) turned mediaPath into /media/get/1/a/ECG
Thanks.
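The truncation reported above matches what a standard URL parser does with an unencoded #. The sketch below uses the built-in URL class, not enketo-transformer's escapeURLPath (whose implementation is not shown in this thread), and the http://localhost base is an arbitrary assumption needed to parse a relative path.

```javascript
// Illustration only: standard WHATWG URL parsing, not escapeURLPath itself.
const mediaPath = '/media/get/1/a/ECG#1.png';

// A base is required to parse a relative path; this one is arbitrary.
const parsed = new URL(mediaPath, 'http://localhost');

console.log(parsed.pathname); // "/media/get/1/a/ECG"
console.log(parsed.hash); // "#1.png"

// Percent-encoding the file name keeps "#" inside the path.
const encodedPath = '/media/get/1/a/' + encodeURIComponent('ECG#1.png');
const parsedEncoded = new URL(encodedPath, 'http://localhost');

console.log(parsedEncoded.pathname); // "/media/get/1/a/ECG%231.png"
console.log(parsedEncoded.hash); // ""
```

Anything after an unencoded # is treated as the fragment identifier, so any code that keeps only the path component loses 1.png.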
Probably best to figure out what the intention of the escapeURLPath function is (maybe initially intended for form media?), and decide based on that.
@gushil, are you sure we are passing a correct URL in the API call without a fragment identifier/hash component?
I'm suddenly wondering if the bug is not in Enketo but in OpenClinica and KoBoToolbox due to not encoding special URL characters in the filenames when making the API call to Enketo.....
Hi @MartijnR
I've tested the issue with enketo-oc and centro in my local setup, and the behaviour is similar to that of the version deployed on the server.
Thanks.
@gushil, so if I understand you correctly, you are already url-encoding the filename part of the URL you are sending to Enketo in the instance_attachments part of your curl request when reproducing this with Centro?
No. I just send the file name as it is, the same way it is sent when the file is uploaded manually through the Enketo UI. When I debugged the upload in the Enketo UI, no url-encoding happened.
Am I wrong to do that?
I really appreciate your suggestion.
Thanks.
The ‘#’ is a reserved character (for a fragment identifier) in a URI, so it needs to be encoded when it is part of a URL (e.g. in the API request as the value for an instance_attachment item). Enketo doesn’t send it as a URL during submission so doesn’t have to encode it. Did you look closely at the 2 screenshots I posted?
If this is not clear please send the complete curl snippet you are using and I will suggest a change.
Hi @MartijnR
Yes, I saw your screenshots. But I wonder: when we upload the file to the form, is it uploaded with the full path (like https://openclinica.com/media/get/1/a/ECG#1.png)?
Thanks.
but I wonder if we upload the file to the form, is it uploading with full path (like https://openclinica.com/media/get/1/a/ECG#1.png) ?
That URL is not crafted by Enketo. The URL is provided by OpenClinica when it makes the API call (using instance_attachments). Enketo just proxies that URL via the Enketo server (adding the media/get etc).
Enketo only submits the filename. So it's up to OpenClinica to provide a correct URL (which we're now realizing should have a url-encoded filename). OpenClinica decides to make the filename part of the URL. Enketo doesn't require that. It could be any URL (but it makes sense to make the filename part of that URL).
I recommend you first spend 1 minute testing this in your cURL snippet (replacing # with %23 in the instance_attachments URL), before continuing this discussion. Then at least we know we are on the right track! (I don't have a properly crafted cURL snippet handy myself that reproduces the issue.)
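The suggested replacement of # with %23 can also be done programmatically by the client that builds the API request. The sketch below is a minimal illustration: attachmentUrl is a hypothetical helper (not part of Enketo or OpenClinica), and http://localhost:8000 is just a local file server assumed for testing.

```javascript
// Hypothetical helper (assumption): builds the URL for one attachment.
function attachmentUrl(baseUrl, fileName) {
    // encodeURIComponent turns "#" into "%23", so it stays part of the path.
    return `${baseUrl.replace(/\/+$/, '')}/${encodeURIComponent(fileName)}`;
}

const fileName = 'ECG#1.png';
const url = attachmentUrl('http://localhost:8000', fileName);

console.log(url); // "http://localhost:8000/ECG%231.png"

// Form-encode the request body; the key keeps the original file name.
const body = new URLSearchParams();
body.append(`instance_attachments[${fileName}]`, url);

console.log(body.toString());
```

Because a form-encoded body is percent-decoded once by whatever parses it, URLSearchParams encodes the % in %23 again, so that the value recovered after decoding is still the single-encoded URL.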
Oh @gushil, I think I finally see what you mean! That code has completely changed, and that media/get URL is actually created by Enketo! Is the file stored in the redis database correctly if the URL provided in the API request is wrong (which it still is)?
Hi @MartijnR
With this form (centro/storage/forms/comment_repeat.xml)
<?xml version="1.0"?><h:html xmlns="http://www.w3.org/2002/xforms" xmlns:OpenClinica="http://openclinica.com/odm" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:h="http://www.w3.org/1999/xhtml" xmlns:jr="http://openrosa.org/javarosa" xmlns:oc="http://openclinica.org/xforms" xmlns:odk="http://www.opendatakit.org/xforms" xmlns:orx="http://openrosa.org/xforms" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><h:head><h:title>Comment repeat</h:title><model odk:xforms-version="1.0.0"><instance><data id="comment_repeat" version="version-1"><group_yt6qb39 jr:template=""><comments/><comments_001/></group_yt6qb39><group_yt6qb39><comments/><comments_001/></group_yt6qb39><upload_image/><meta><instanceID/></meta></data></instance><bind nodeset="/data/group_yt6qb39/comments" oc:itemgroup="RG1" type="string"/><bind nodeset="/data/group_yt6qb39/comments_001" oc:itemgroup="RG1" type="string"/><bind nodeset="/data/upload_image" oc:itemgroup="RG2" type="binary"/><bind jr:preload="uid" nodeset="/data/meta/instanceID" readonly="true()" type="string"/></model></h:head><h:body><group ref="/data/group_yt6qb39"><label></label><repeat nodeset="/data/group_yt6qb39"><input ref="/data/group_yt6qb39/comments"><label>Comments 1:</label></input><input ref="/data/group_yt6qb39/comments_001"><label>Comments 2:</label></input></repeat></group><upload mediatype="image/*" ref="/data/upload_image"><label>Upload image</label></upload></h:body></h:html>
and this image url:
I'm using this cURL snippet:
curl --user enketorules: -d "server_url=http://localhost:3000&form_id=comment_repeat&ecid=1&instance_id=a&instance=\
<data xmlns:OpenClinica=\"http://openclinica.com/odm\" xmlns:enk=\"http://enketo.org/xforms\" xmlns:jr=\"http://openrosa.org/javarosa\" xmlns:oc=\"http://openclinica.org/xforms\" xmlns:orx=\"http://openrosa.org/xforms\" id=\"edit_comment_repeat\" version=\"test\">\
<group_zt8fu31_002>\
<Patient_Signature_002_001>ECG.png</Patient_Signature_002_001>\
</group_zt8fu31_002>\
<meta>\
<instanceID>uuid:376611b3-1506-4c11-b21a-3a246f6574f2</instanceID>\
</meta>\
</data>&instance_attachments[ECG#1.png]=http://localhost:8000/ECG%231.png" http://localhost:8005/oc/api/v1/instance/edit/c
that returns
"url": "http://localhost:8005/edit/fs/c/i/05244e567c9854acdd6a22819a3fa573?ecid=1&instance_id=a"
When loading the form edit url http://localhost:8005/edit/fs/c/i/05244e567c9854acdd6a22819a3fa573?ecid=1&instance_id=a
I got this:
Closes #204
I have verified this PR works with
What else has been done to verify that this works as intended?
Why is this the best possible solution? Were any other approaches considered?
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
Do we need any specific form for testing your changes? If so, please attach one.