pelias / openaddresses

Pelias import pipeline for OpenAddresses.
MIT License

Downloads for north-america failing due to missing referer #482

Closed (collinjc closed this 3 years ago)

collinjc commented 3 years ago

Describe the bug

A few days ago, I opened a related bug (see References). After some digging, I believe the issue is caused by OpenAddresses checking the Referer header. When I run the same command the util uses:

    curl -s -L -X GET -o test.zip https://results.openaddresses.io/latest/run/ca/ab/calgary.zip

I observe the same behavior: the downloaded file is simply the 403 response from OpenAddresses. If I add a referer, however:

    curl -s -L -X GET -e https://results.openaddresses.io -o test.zip https://results.openaddresses.io/latest/run/ca/ab/calgary.zip

the file downloads successfully. Would it be possible to add the -e (referer) option to the curl command that Pelias uses?
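The referer workaround described above could be wrapped in a small shell helper. This is only a sketch: download_oa is a hypothetical name, not something Pelias ships, and the default Referer is simply the value that worked in the test above. Adding -f is an extra assumption on my part; it makes curl exit non-zero on an HTTP error such as 403 rather than saving the error page as test.zip.

```shell
# Hypothetical helper, not part of Pelias: fetch one OpenAddresses archive
# while sending a Referer header.
download_oa() {
  url="$1"   # archive URL
  out="$2"   # local output path
  # Default Referer is the value from the report above; pass $3 to override.
  referer="${3:-https://results.openaddresses.io}"
  # -s silences progress output, -L follows redirects
  # -f exits non-zero on HTTP errors (e.g. 403) instead of saving the error page
  # -e sets the Referer header
  curl -s -f -L -e "$referer" -o "$out" "$url"
}

# Example call (commented out so that sourcing this file makes no request):
# download_oa https://results.openaddresses.io/latest/run/ca/ab/calgary.zip calgary.zip
```

Because of -f, a caller can check the exit status instead of inspecting the downloaded file to find out whether it is really a zip.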

Steps to Reproduce

  1. Install a standard docker instance of pelias
  2. Move to the north-america project
  3. Run pelias download oa (after any prerequisites, if desired)
  4. Observe the warning and the errors that follow

Expected behavior

pelias download oa completes successfully.

References

https://github.com/pelias/docker/issues/257

iandees commented 3 years ago

Hi! OpenAddresses maintainer here.

I added referer checking to our CDN last week to defend against search engine crawlers that are causing more usage than normal, costing OpenAddresses a bunch of money and not serving our users. If you could use pelias-results.openaddresses.io as the referer instead of results, it would help us understand how Pelias users consume OpenAddresses data.
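One way to see how the CDN treats a given referer is to print only the HTTP status code of the response. The sketch below is a hypothetical diagnostic, not an OpenAddresses or Pelias tool; oa_status is an invented name, and the https:// scheme on the pelias-results referer is an assumption, since only the hostname is given above.

```shell
# Hypothetical diagnostic: print the final HTTP status code returned for a URL
# when a given Referer header is sent.
oa_status() {
  referer="$1"   # value for the Referer header
  url="$2"       # URL to request
  # -o /dev/null discards the response body
  # -w '%{http_code}\n' prints the status code of the last response
  curl -s -L -o /dev/null -w '%{http_code}\n' -e "$referer" "$url"
}

# Example call (commented out so that sourcing this file makes no request):
# oa_status https://pelias-results.openaddresses.io \
#   https://results.openaddresses.io/latest/run/ca/ab/calgary.zip
```

Note that this still transfers the full archive before discarding it, so it is best run sparingly.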

orangejulius commented 3 years ago

Thanks @collinjc for the report and workaround, @iandees for the help from the OA side, and @missinglink for the fast fix :)