openspending / datapackage-pipelines-fiscal

Fiscal Data Package extensions to Datapackage Pipelines
MIT License

Malformed URL for finalized datapackage #12

Closed: brew closed this issue 6 years ago

brew commented 6 years ago

A malformed URL to the datapackage.json is created here: https://github.com/openspending/datapackage-pipelines-fiscal/blob/5388e8f1bbf7ed239b479ffac97d647b130a89ef/datapackage_pipelines_fiscal/flows/finalize_datapackage.py#L83

Bear in mind when fixing this that the domain might not be AWS, but an AWS-compatible service.

akariv commented 6 years ago

What is the malformed URL that was generated? It is assumed that the bucket name also serves as a domain name that's mapped to the bucket (regardless of provider). For openspending the name of the bucket is 'datastore.openspending.org'.

(e.g. http://datastore.openspending.org/cd511289b5773fff5e7efe328846eef3/my_test/final/datapackage.json)
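To make the assumption concrete, here is a minimal sketch of URL construction where the bucket name doubles as the host name. It is not the code from finalize_datapackage.py; the function name `bucket_domain_url` and its parameters are hypothetical and only illustrate the behaviour described above.

```python
def bucket_domain_url(bucket, key, scheme="http"):
    """Build a URL assuming the bucket name is also a resolvable domain.

    Hypothetical helper for illustration; not taken from dpp-fiscal.
    """
    # The bucket is used directly as the host part of the URL.
    return f"{scheme}://{bucket}/{key.lstrip('/')}"


if __name__ == "__main__":
    # Works when the bucket name is mapped to a real domain.
    print(bucket_domain_url(
        "datastore.openspending.org",
        "cd511289b5773fff5e7efe328846eef3/my_test/final/datapackage.json",
    ))
    # With a plain bucket name the "host" is not a domain, so the URL
    # is well-formed as a string but does not point anywhere useful.
    print(bucket_domain_url("my-bucket", "owner/pkg/final/datapackage.json"))
```

The second call shows the failure mode: a bucket called `my-bucket` (an invented name) yields `http://my-bucket/...`, which only resolves if something maps that host name to the bucket.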

brew commented 6 years ago

I see. In my local instance the bucket name isn't a domain, so it breaks. Can the assumption be documented?

akariv commented 6 years ago

Ah, OK. In this case we should probably add an 'S3_ENDPOINT' env var and construct the URL in a more compatible way (we also need to change https -> http while we're at it): http://{S3_ENDPOINT}/{BUCKET}/...
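A rough sketch of what that path-style construction could look like follows. The function name `build_datapackage_url`, the fallback endpoint, and the handling of an endpoint that already carries a scheme are all assumptions made for illustration; only the `S3_ENDPOINT` variable name and the `http://{S3_ENDPOINT}/{BUCKET}/...` shape come from the comment above.

```python
import os
from urllib.parse import quote


def build_datapackage_url(endpoint, bucket, key):
    """Return a path-style URL: <endpoint>/<bucket>/<key>.

    Hypothetical helper; not the actual dpp-fiscal implementation.
    """
    # Assumption: an endpoint without a scheme defaults to http,
    # following the https -> http change suggested in the thread.
    if "://" not in endpoint:
        endpoint = "http://" + endpoint
    # quote() keeps "/" intact by default, so the key's path is preserved.
    return "/".join([
        endpoint.rstrip("/"),
        bucket.strip("/"),
        quote(key.lstrip("/")),
    ])


if __name__ == "__main__":
    # Assumption: fall back to the public AWS host when the variable is unset.
    endpoint = os.environ.get("S3_ENDPOINT", "s3.amazonaws.com")
    print(build_datapackage_url(
        endpoint, "my-bucket", "owner/pkg/final/datapackage.json"
    ))
```

Because the bucket sits in the path rather than the host, this form does not depend on the bucket name being a domain, which is the case that broke in the local instance.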

brew commented 6 years ago

To get the local instance working with an S3-like datastore, I had to add the S3_ENDPOINT_URL env var to the os-conductor container (so it's available for dpp-fiscal/boto3), e.g. S3_ENDPOINT_URL: http://fakes3:4567. Otherwise boto will default to the AWS domain.
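For readers reproducing this locally, the sketch below shows one way an `S3_ENDPOINT_URL` variable could be passed on to boto3. How dpp-fiscal actually reads the variable is not shown in this thread, so the helper names `s3_client_kwargs` and `make_s3_client` are hypothetical; `endpoint_url` itself is a standard keyword argument of `boto3.client`.

```python
import os


def s3_client_kwargs(environ=os.environ):
    """Collect optional boto3 client arguments from the environment.

    Hypothetical helper; only the S3_ENDPOINT_URL name comes from the thread.
    """
    kwargs = {}
    endpoint_url = environ.get("S3_ENDPOINT_URL")
    if endpoint_url:
        # e.g. "http://fakes3:4567" for a local S3-compatible service
        kwargs["endpoint_url"] = endpoint_url
    # With no endpoint_url, boto3 falls back to the default AWS endpoint.
    return kwargs


def make_s3_client(environ=os.environ):
    """Create an S3 client that honours S3_ENDPOINT_URL when it is set."""
    import boto3  # imported lazily so s3_client_kwargs works without boto3
    return boto3.client("s3", **s3_client_kwargs(environ))
```

Leaving the variable unset keeps the default AWS behaviour, so the same code path would serve both the hosted and the local setup.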

brew commented 6 years ago

Related to this: there is a certificate error for https://datastore.openspending.org/blahblah. Previously, datastore URLs had the AWS address and bucket name, e.g. https://s3.amazonaws.com:443/datastore.openspending.org/338513c686c7da2f6f0553d8a0bfa3f8/leeds-council-spending-dec-2017/datapackage.json

Packages can't be edited with the current URL scheme: there is either a certificate error with https, or an insecure 'Mixed Content' page error with http.

See: https://gitter.im/openspending/chat?at=5ae9b2a453ceca3604a6e118

brew commented 6 years ago

Fixed by https://github.com/openspending/os-packager/commit/05e0fb9d384da4eadeb7ed5d12b8dad9f8614257