Open isabelle-dr opened 1 year ago
I created a PR a few days ago and the acceptance tests are failing. After analyzing the failure, I found it is due to the added time it takes to validate the URLs. Some of our datasets have thousands of URLs, and it takes approximately 3-4 seconds to validate each one (for 3000 URL entries this adds at least 5 min to the validation time). I don't think we can do any better on validation time. After consulting @davidgamez, we believe we might need to push back this issue until we have the custom validation profile (also mentioned in #1441), i.e. the URL accessibility check would become an optional notice/validation. We believe this is essential, as the validation is highly dependent on the user's network and can affect the user experience. Thoughts?
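For illustration, the timing concern above can be sketched with hypothetical numbers: even when the checks run concurrently, wall time still scales roughly as (URL count × per-URL latency) / worker count, so a dataset with thousands of URLs adds minutes no matter what. This is a minimal Python sketch, not the validator's actual (Java) implementation; `check_all` and `slow_check` are made-up names, and the 0.01 s sleep stands in for a real 3-4 s network round-trip.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def check_all(urls, check, max_workers=32):
    """Run `check` over every URL concurrently; returns {url: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(urls, pool.map(check, urls)))

def slow_check(url):
    time.sleep(0.01)  # stand-in for a ~3-4 s HTTP round-trip per URL
    return True

urls = [f"http://example.com/stop/{i}" for i in range(300)]
start = time.monotonic()
results = check_all(urls, slow_check)
elapsed = time.monotonic() - start
# With 32 workers: roughly (300 * 0.01 s) / 32 of wall time here.
# Scaled to 3000 real URLs at 3-4 s each, the added time is still
# on the order of minutes, matching the figure reported above.
```

The sketch is only meant to show that parallelism reduces, but cannot eliminate, the network-bound cost, which is why making the check optional seems preferable.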
I support delaying this issue until consumers can skip a validation notice. A few points in support: the ability to opt out is a must-have, and machines without network access will get failing notices that cannot be silenced.
Describe the problem
A user has asked whether this validator could check that the URLs provided in a GTFS dataset (e.g. agency_url, stop_url, etc.) work as intended.
The specification says:
Although there is no explicit requirement that a URL must not return a 404 error, this seems like a very useful addition to this validator and is in line with "fully qualified URL".
Describe the new validation rule
If one of the URL fields in the GTFS dataset returns a 404 error, generate a Warning.
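A minimal sketch of what such a rule might look like, in Python for brevity (the validator itself is not Python, and `ValidationNotice`, `fetch_status`, and `check_url` are hypothetical names). The fetcher is injectable so the rule can be exercised without network access, which also addresses the offline-machine concern raised above.

```python
from dataclasses import dataclass
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

@dataclass
class ValidationNotice:
    field_name: str
    url: str
    severity: str
    message: str

def fetch_status(url, timeout=4.0):
    """Return the HTTP status code for `url`, or None if unreachable."""
    try:
        req = Request(url, method="HEAD")  # HEAD avoids downloading the body
        with urlopen(req, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code   # e.g. 404: the server answered with an error status
    except URLError:
        return None       # DNS failure, timeout, or no network at all

def check_url(field_name, url, fetch=fetch_status):
    """Emit a WARNING notice when the URL field returns a 404."""
    status = fetch(url)
    if status == 404:
        return ValidationNotice(field_name, url, "WARNING",
                                f"{field_name} returned HTTP 404")
    return None  # reachable, or unreachable for network reasons: no notice
```

Note that an unreachable URL (no network) deliberately produces no notice here; only an explicit 404 response does, matching the proposed severity of WARNING.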
Sample GTFS datasets
No response
Severity
WARNING
Additional context
No response