open-contracting / kingfisher-collect

Downloads OCDS data and stores it on disk
https://kingfisher-collect.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Write scrapers for new Paraguay API #218

Closed duncandewhurst closed 4 years ago

duncandewhurst commented 4 years ago

This is a high priority because OCP need to check that the API is working so they can sign off on the project.

The documentation is here: http://beta.dncp.gov.py/datos/api/v3/doc/

It appears to use the same token/authentication model as the existing Paraguay API.

To get all the data, I think we need to write scrapers for only the following endpoints:

If I understood correctly, the list of ocids to pass as parameters can be obtained from the CSV files here: https://www.contrataciones.gov.py/datos/convocatorias

@yolile - please confirm I got all the above correct, thank you!

duncandewhurst commented 4 years ago

Write scrapers for new Paraguay API

yolile commented 4 years ago

You can use the service http://beta.dncp.gov.py/datos/api/v3/doc/search/processes to list all the existing processes (only 2018 and 2019 are loaded for now), and use the field compiledRelease.ocid to get the full releases and records with the two services that you already mentioned. Also, the token now always expires after 15 minutes, not after a certain number of requests.
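As a rough illustration of the two points above, here is a minimal Python sketch, not the scraper's actual code. It caches a token and refreshes it shortly before the 15-minute expiry, and it pulls compiledRelease.ocid values out of one page of search results. The `fetch_token` callable, the 30-second safety margin and the `{"records": [...]}` response layout are assumptions made for illustration; the real request and response shapes should be checked against the API documentation.

```python
import time
from typing import Callable, List, Optional

# From the comment above: tokens expire 15 minutes after being issued.
TOKEN_LIFETIME_SECONDS = 15 * 60


class TokenCache:
    """Cache an access token and request a new one shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], str],  # hypothetical callable that performs the auth request
        clock: Callable[[], float] = time.monotonic,
        safety_margin: float = 30.0,  # assumption: refresh 30 s early to avoid races
    ):
        self._fetch_token = fetch_token
        self._clock = clock
        self._safety_margin = safety_margin
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return a valid token, fetching a fresh one if the cached one is too old."""
        age = self._clock() - self._fetched_at
        if self._token is None or age >= TOKEN_LIFETIME_SECONDS - self._safety_margin:
            self._token = self._fetch_token()
            self._fetched_at = self._clock()
        return self._token


def extract_ocids(search_page: dict) -> List[str]:
    """Collect compiledRelease.ocid from one page of search results.

    Assumes a layout like {"records": [{"compiledRelease": {"ocid": ...}}, ...]},
    which is a guess and not taken from the API documentation.
    """
    ocids = []
    for item in search_page.get("records", []):
        ocid = item.get("compiledRelease", {}).get("ocid")
        if ocid:
            ocids.append(ocid)
    return ocids
```

A spider could call `cache.get()` before every request, so that long crawls keep working across token expiries, and pass each extracted ocid to the release and record services.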

On Thu, 28 Nov 2019 at 02:53, Duncan Dewhurst notifications@github.com wrote:

Write scrapers for new Paraguay API https://trello.com/c/QfFio5Jf/157-write-scrapers-for-new-paraguay-api


duncandewhurst commented 4 years ago

@yolile suggested that @romifz can write the scraper

jpmckinney commented 4 years ago

Noting that there are some issues mentioned in CRM 4864, though the scraper was updated in #227.

odscjames commented 4 years ago

Finding: The Proxy was working on IPv4 only - I added IPv6 support. It now works for both HTTP and HTTPS.

root@process1 ~ # curl -x http://staging.docs.opencontracting.uk0.bigv.io:8888 http://example.com
<!doctype html>
Example Domain
.............
root@process1 ~ # curl -x http://staging.docs.opencontracting.uk0.bigv.io:8888 https://www.google.com
Google
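The same check can be sketched from Python with only the standard library, which may be handy when verifying the proxy from inside a scraper environment. This is an illustrative sketch, not part of the project: the helper names `build_proxy_opener` and `check_proxy` are hypothetical, and the proxy URL is simply the one used in the curl commands above.

```python
import urllib.request

# Proxy address taken from the curl commands above.
PROXY_URL = "http://staging.docs.opencontracting.uk0.bigv.io:8888"


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that routes both HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def check_proxy(opener, urls=("http://example.com", "https://www.google.com"), timeout=10):
    """Fetch each URL through the opener and return a {url: HTTP status} mapping."""
    results = {}
    for url in urls:
        with opener.open(url, timeout=timeout) as response:
            results[url] = response.status
    return results
```

Calling `check_proxy(build_proxy_opener(PROXY_URL))` performs real network requests, so it only succeeds where the proxy is reachable.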