ckan / ckanext-validation

CKAN extension for validating Data Packages using Table Schema.
MIT License
28 stars 33 forks

google analytics extension conflict #18

Closed s-chand closed 6 years ago

s-chand commented 6 years ago

Resource validation fails on a CKAN portal that has the Google Analytics extension set up.

Issue: the GA extension intercepts resource download requests for event tracking and redirects them before the file is served. This breaks ckanext-validation because it doesn't handle this temporary redirect.

Do you think this is an issue that can be reviewed on this side of the divide?

Datapusher works despite the GA redirects.

Steps to reproduce:

[Screenshot: 2017-12-04, 5:32:20 PM]

amercader commented 6 years ago

Good find, @s-chand! I was just struggling with this today with another extension that intercepts downloads. Perhaps this can be fixed by telling ckanext-validation to follow redirects, but I can't confirm off the top of my head. I'll try to investigate and follow up.

s-chand commented 6 years ago

Ok thanks

amercader commented 6 years ago

@s-chand I had a closer look at this. The root cause is that the call that the validation job does to retrieve the resource does not pass any authorization headers. If this is a public file hosted externally this is ok, but if it is hosted by CKAN using an external backend then it fails.
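The failure mode described above can be reproduced in isolation. The following is only a minimal sketch using the Python standard library, not the code of ckanext-validation or of any storage plugin: the handler, the `Authorization` check and the `hypothetical-api-key` value are assumptions standing in for a CKAN instance that protects its resource downloads.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

API_KEY = "hypothetical-api-key"  # assumption: stands in for a CKAN API key


class PrivateResourceHandler(BaseHTTPRequestHandler):
    """Serves a tiny CSV only when the expected Authorization header is sent."""

    def do_GET(self):
        if self.headers.get("Authorization") != API_KEY:
            # No credentials: refuse, as a protected download would.
            self.send_response(403)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = b"id,name\n1,example\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/csv")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_status(url, headers=None):
    """Return the HTTP status code of a GET request to url."""
    # An empty ProxyHandler makes sure the local request bypasses any proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with opener.open(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def run_demo():
    """Request the same resource without and with the Authorization header."""
    server = HTTPServer(("127.0.0.1", 0), PrivateResourceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/resource.csv" % server.server_port
    try:
        without_auth = fetch_status(url)
        with_auth = fetch_status(url, {"Authorization": API_KEY})
    finally:
        server.shutdown()
        server.server_close()
    return without_auth, with_auth
```

Calling `run_demo()` returns the two status codes: the request without the header is rejected, while the same request with the header gets the file. A validation job that fetches the resource without credentials is in the position of the first request.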

I could not reproduce it with the GA extension on its own, but I could by enabling ckanext-s3filestore. Are you using any custom storage plugin?

@roll The main issue here is that I need to pass some headers or authorization parameters to the request that goodtables-py will do to get the remote source. I don't think that's possible out of the box, but maybe I can achieve it with a custom preset?

roll commented 6 years ago

@amercader goodtables-py accepts a requests.Session parameter for this, but it's not possible to pass it through all the layers. I think the simplest solution would be to also introduce a headers argument for tabulator's remote loader:

https://github.com/frictionlessdata/tabulator-py#httphttpsftpftps

That way it will be possible to pass the headers from the JavaScript side as part of the validation configuration.
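For illustration, a rough sketch of the session-based approach mentioned above, calling tabulator directly rather than going through goodtables-py. It assumes that `requests` is installed and that tabulator's `Stream` accepts an `http_session` keyword, as its README describes for the remote loader. The function names, the `Authorization` header and the API key value are hypothetical, and this is not the headers argument proposed in this comment.

```python
import requests


def build_authorized_session(api_key):
    """Return a requests.Session that sends api_key with every request."""
    session = requests.Session()
    # Assumption: the server expects the key in the Authorization header.
    session.headers.update({"Authorization": api_key})
    return session


def read_rows(url, session):
    """Read the rows of a remote table through the given session.

    Assumption: tabulator is installed and Stream takes http_session.
    """
    from tabulator import Stream  # imported lazily; only needed here

    # headers=1 tells tabulator that row 1 holds the column names;
    # it is unrelated to the HTTP headers set on the session.
    with Stream(url, headers=1, http_session=session) as stream:
        return stream.read()


session = build_authorized_session("hypothetical-api-key")
```

`read_rows(resource_url, session)` would then fetch the file with the authorization header attached, where `resource_url` is whatever URL the validation job would otherwise request without credentials.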

s-chand commented 6 years ago

@amercader You're actually quite correct. I'm also using the s3filestore extension. I should probably disable that considering that it is no longer necessary for my current project requirements.

Thanks for catching this.