frictionlessdata / goodtables.io

Data validation as a service. Project retired; go to the current one at frictionsless/repository
https://goodtables.io
GNU Affero General Public License v3.0
69 stars, 16 forks

Restrict access to /api/job by API_ACCESS_KEY #168

Closed (roll closed this issue 7 years ago)

roll commented 7 years ago

Overview

This is an alternative to the #116 proposal for providing services to GODI. I think we could do this for the Beta and design #116 in the next iteration if needed.

Example

I've got /api/job working again in #167.

A client could query /api/job using this schema, which documents the payloads: https://github.com/frictionlessdata/goodtables.io/blob/datapackages-support/goodtablesio/schemas/validation-conf.yml

POST /api/job

{
    "source": [
        {"source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/valid.csv"},
        {"source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/invalid.csv"},
        {"source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/valid/datapackage.json", "preset": "datapackage"},
        {"source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/invalid/datapackage.json", "preset": "datapackage"}
    ],
    "settings": {
        "error_limit": 3
    }
}

Response:

2ccab76d-de98-4d78-a13a-696b3f6595ab
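
For illustration, here is a minimal Python sketch of how a client might assemble a request body shaped like the one above. The helper name `build_job_payload` is hypothetical and not part of goodtables.io. The sketch only builds and serializes the payload; sending it to POST /api/job is left to whichever HTTP client you prefer.

```python
import json


def build_job_payload(sources, error_limit=None):
    """Build a payload shaped like the POST /api/job example above.

    `sources` is a list of (url, preset) pairs; preset may be None.
    This helper is illustrative only and not part of goodtables.io.
    """
    items = []
    for url, preset in sources:
        item = {"source": url}
        if preset is not None:
            item["preset"] = preset
        items.append(item)
    payload = {"source": items}
    if error_limit is not None:
        payload["settings"] = {"error_limit": error_limit}
    return payload


BASE = "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data"
payload = build_job_payload(
    [
        (BASE + "/valid.csv", None),
        (BASE + "/datapackages/valid/datapackage.json", "datapackage"),
    ],
    error_limit=3,
)
# JSON text that would be sent as the request body.
body = json.dumps(payload, indent=4)
```

Keeping payload construction separate from the HTTP call makes it easy to validate the body against the schema before submitting it.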

GET /api/job/2ccab76d-de98-4d78-a13a-696b3f6595ab

{
  "conf": null,
  "created": "Sat, 04 Mar 2017 06:14:37 GMT",
  "status": "pending"
}
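
Since the job starts out as "pending", a client would presumably poll until the status changes. A rough Python sketch follows, assuming any status other than "pending" means the job has finished (the thread only shows "pending" and "failure"). `wait_for_job` and `fetch_job` are hypothetical names; `fetch_job` stands in for whatever code performs the GET request and decodes the JSON.

```python
import time


def wait_for_job(fetch_job, job_id, interval=1.0, max_attempts=30, sleep=time.sleep):
    """Poll until the job's status is no longer "pending".

    `fetch_job` is a caller-supplied function that performs
    GET /api/job/<job_id> and returns the decoded JSON as a dict.
    Assumption: any status other than "pending" is treated as final.
    """
    for attempt in range(max_attempts):
        job = fetch_job(job_id)
        if job.get("status") != "pending":
            return job
        # Do not sleep after the last attempt; we are about to give up.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} attempts")
```

Passing the fetch function in as an argument keeps the polling logic independent of the HTTP library and easy to exercise with canned responses.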

GET /api/job/2ccab76d-de98-4d78-a13a-696b3f6595ab

{
  "conf": null,
  "created": "Sat, 04 Mar 2017 06:14:37 GMT",
  "error": null,
  "finished": "Sat, 04 Mar 2017 06:14:39 GMT",
  "id": "2ccab76d-de98-4d78-a13a-696b3f6595ab",
  "integration_name": "api",
  "report": {
    "error-count": 5,
    "errors": [],
    "table-count": 6,
    "tables": [
      {
        "error-count": 0,
        "errors": [],
        "headers": [
          "id",
          "name"
        ],
        "row-count": 3,
        "source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/valid.csv",
        "time": 0.596,
        "valid": true
      },
      {
        "error-count": 3,
        "errors": [
          {
            "code": "blank-header",
            "column-number": 3,
            "message": "Header in column 3 is blank",
            "row": null,
            "row-number": null
          },
          {
            "code": "duplicate-header",
            "column-number": 4,
            "message": "Header in column 4 is duplicated to header in column(s) 2",
            "row": null,
            "row-number": null
          },
          {
            "code": "missing-value",
            "column-number": 3,
            "message": "Row 2 has a missing value in column 3",
            "row": [
              "1",
              "english"
            ],
            "row-number": 2
          }
        ],
        "headers": [
          "id",
          "name",
          "",
          "name"
        ],
        "row-count": 2,
        "source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/invalid.csv",
        "time": 0.59,
        "valid": false
      },
      {
        "datapackage": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/valid/datapackage.json",
        "error-count": 0,
        "errors": [],
        "headers": [
          "id",
          "name",
          "description",
          "amount"
        ],
        "row-count": 3,
        "source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/valid/data.csv",
        "time": 0.593,
        "valid": true
      },
      {
        "datapackage": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/valid/datapackage.json",
        "error-count": 0,
        "errors": [],
        "headers": [
          "parent",
          "comment"
        ],
        "row-count": 4,
        "source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/valid/data2.csv",
        "time": 0.587,
        "valid": true
      },
      {
        "datapackage": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/invalid/datapackage.json",
        "error-count": 1,
        "errors": [
          {
            "code": "blank-row",
            "column-number": null,
            "message": "Row 3 is completely blank",
            "row": [],
            "row-number": 3
          }
        ],
        "headers": [
          "id",
          "name",
          "description",
          "amount"
        ],
        "row-count": 4,
        "source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/invalid/data.csv",
        "time": 0.585,
        "valid": false
      },
      {
        "datapackage": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/invalid/datapackage.json",
        "error-count": 1,
        "errors": [
          {
            "code": "blank-row",
            "column-number": null,
            "message": "Row 4 is completely blank",
            "row": [],
            "row-number": 4
          }
        ],
        "headers": [
          "parent",
          "comment"
        ],
        "row-count": 5,
        "source": "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data/datapackages/invalid/data2.csv",
        "time": 0.595,
        "valid": false
      }
    ],
    "time": 1.444,
    "valid": false
  },
  "source_id": null,
  "status": "failure"
}
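
A client consuming a report like the one above might want just the failing tables and their error codes. The following Python sketch assumes the report keeps the shape shown ("tables", "valid", "errors", "code", "source"). `summarize_report` is a hypothetical helper, and the sample data is a trimmed-down stand-in for the full response, not real output.

```python
def summarize_report(report):
    """Return (source, error codes) pairs for each invalid table in a report."""
    summary = []
    for table in report.get("tables", []):
        if not table.get("valid", True):
            codes = [error["code"] for error in table.get("errors", [])]
            summary.append((table["source"], codes))
    return summary


BASE = "https://raw.githubusercontent.com/frictionlessdata/goodtables-py/master/data"

# Trimmed-down sample using only field names that appear in the response above.
sample_report = {
    "valid": False,
    "tables": [
        {"source": BASE + "/valid.csv", "valid": True, "errors": []},
        {
            "source": BASE + "/invalid.csv",
            "valid": False,
            "errors": [
                {"code": "blank-header"},
                {"code": "duplicate-header"},
                {"code": "missing-value"},
            ],
        },
    ],
}

for source, codes in summarize_report(sample_report):
    print(source, codes)
```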

Access

To restrict access, we could use a hard-coded API_ACCESS_KEY for now, as we do in other projects. Later we could expand this by giving users their own keys via the UI, which is a common approach for APIs and avoids implementing full auth(z).
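
As a rough Python sketch of what such a check could look like (not the project's actual implementation): the header name `API-Access-Key` and the function `is_authorized` are assumptions made for illustration, and the expected key would presumably come from an `API_ACCESS_KEY` setting or environment variable.

```python
import hmac


def is_authorized(headers, expected_key, header_name="API-Access-Key"):
    """Return True if the request headers carry the expected access key.

    `header_name` is a hypothetical choice; the issue does not specify
    how the key is transmitted. An empty expected key denies everyone,
    so a missing configuration does not open the API.
    """
    if not expected_key:
        return False
    supplied = headers.get(header_name, "")
    # compare_digest avoids leaking key contents through timing differences.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

A request handler for /api/job could call this before creating or returning a job and respond with an authorization error when it returns False.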

Tasks


@pwalsh @amercader @brew

amercader commented 7 years ago

See my comments on #166

roll commented 7 years ago

@amercader I'm assigning this to you instead of me so you can decide. I'm not sure I'll have time for it.

amercader commented 7 years ago

Restricted access to the API until we implement proper access via keys. If access by GODI is required, we can add a special case.