getodk / collect

ODK Collect is an Android app for filling out forms. It's been used to collect billions of data points in challenging environments around the world. Contribute and make the world a better place! ✨📋✨
https://docs.getodk.org/collect-intro
Other
713 stars 1.37k forks source link

Exception: java.net.URISyntaxException: Illegal character in authority #4759

Open grzesiek2010 opened 3 years ago

grzesiek2010 commented 3 years ago

Software and hardware versions

Collect v1.30.1+

Problem description

https://console.firebase.google.com/u/1/project/api-project-322300403941/crashlytics/app/android:org.odk.collect.android/issues/3abdb87d76f4ac9f125f4b3eef28c517?time=last-seven-days&versions=v2021.2.0%20(4242)&sessionEventKey=61039C9D00A000012CDB79543F45F7E7_1569048866029556996

Non-fatal Exception: java.net.URISyntaxException: Illegal character in authority at index 7: http://encuestas. se.gob.hn:8080/formList
       at java.net.URI$Parser.fail(URI.java:2893)
       at java.net.URI$Parser.parseAuthority(URI.java:3231)
       at java.net.URI$Parser.parseHierarchical(URI.java:3142)
       at java.net.URI$Parser.parse(URI.java:3098)
       at java.net.URI.<init>(URI.java:584)
       at java.net.URL.toURI(URL.java:973)
       at org.odk.collect.android.openrosa.OpenRosaXmlFetcher.fetch(OpenRosaXmlFetcher.java:94)
       at org.odk.collect.android.openrosa.OpenRosaXmlFetcher.getXML(OpenRosaXmlFetcher.java:54)
       at org.odk.collect.android.openrosa.OpenRosaFormSource.lambda$fetchFormList$0(OpenRosaFormSource.java:42)
       at org.odk.collect.android.openrosa.OpenRosaFormSource.lambda$fetchFormList$0$OpenRosaFormSource(OpenRosaFormSource.java)
       at org.odk.collect.android.openrosa.-$$Lambda$OpenRosaFormSource$oHfwyAcGQVxhQTD7_HASB3hwlMU.call(-.java:2)
       at org.odk.collect.android.openrosa.OpenRosaFormSource.mapException(OpenRosaFormSource.java:130)
       at org.odk.collect.android.openrosa.OpenRosaFormSource.fetchFormList(OpenRosaFormSource.java:42)
grzesiek2010 commented 3 years ago

This is funny, it's the most common issue in our reports for v2021.2.0 and it's because of having a white space in url (I don't know why it's so common). Since we already sanitize the url a bit in https://github.com/getodk/collect/blob/master/collect_app/src/main/java/org/odk/collect/android/openrosa/OpenRosaFormSource.java#L147 what about removing white spaces there as well?

seadowg commented 3 years ago

What does the user see in this case?

grzesiek2010 commented 3 years ago

If you keep having this problem, report it to the person who asked you to collect data. It doesn't help.

seadowg commented 3 years ago

Hmmm. It does feel like it'd be nice to fix in some way. I have two worries about sanitizing out spaces:

  1. Some users might expect spaces to be URL encoded (in a path for instance) which might be the right thing™️ to do here - although it doesn't solve the problem.
  2. Sanitizing would let "incorrect" URIs in QR codes work in Collect but not other clients that might end up supporting the QR code format, which feels wrong.

I'm leaning towards validating this when we save the URL instead. We've chatted a bunch about one day validating the server URL by actually hitting the server, so maybe this could be the beginning of that (just checking we have a properly formed URL whenever we save it).

Maybe we should consider this as a 2021.3 feature?

grzesiek2010 commented 3 years ago

Better validating would be great we should also catch that exception separately I think and display something that people will understand (that there is something wrong with the url) instead of If you keep having this problem, report it to the person who asked you to collect data.

grzesiek2010 commented 3 years ago

We should probably start with what is least risky:

seadowg commented 3 years ago

Yeah agreed. We actually have better error messaging in for 2021.3 I've just realized: https://github.com/getodk/collect/issues/4489. Thinking we just include this as a case in that issue and close this?

It looks like we also want to consider validation for https://github.com/getodk/collect/issues/4760, so maybe that can drive that out.