tevpg / bikeparkingdb

Database and web reporting for bike parking data
GNU Affero General Public License v3.0

data exchange (import) format #2

Open tevpg opened 6 months ago

tevpg commented 6 months ago

Likely base this on a JSON schema, which can contain multiple dates and sites. So a Google Sheet summary of multiple sites can go into one file, as can a single day of tagtracker data or 2wheel-valet data.

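Each file has a list of one or more days. Each day entry covers one site on one date and carries the opening and closing times, the summary bike counts, and optionally the individual visits.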
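For illustration, a file covering two sites on one date might look like the sketch below. Every value in it (org code, site codes, date, counts, tag IDs) is invented for the example; only the field names come from the schema in this issue.

```python
import json

# Hypothetical example data: all codes, dates, and counts are made up.
example_file = {
    "org_code": "EXAMPLE_ORG",
    "days": [
        {
            # A day with per-visit detail (e.g. from a tag-tracking tool).
            "date": "2024-06-01",
            "site_code": "SITE_A",
            "time_open": "09:00",
            "time_closed": "18:00",
            "bikes_regular": 2,
            "bikes_oversize": 1,
            "bikes_total": 3,
            "bikes_registered": 0,
            "visits": [
                {"time_in": "09:15", "time_out": "12:40", "bike_type": "R", "bike_id": "wa1"},
                {"time_in": "10:05", "time_out": "17:30", "bike_type": "O"},
                {"time_in": "13:20", "bike_type": "R"},  # no time_out recorded
            ],
        },
        {
            # A summary-only day (e.g. one row of a spreadsheet); "visits" is optional.
            "date": "2024-06-01",
            "site_code": "SITE_B",
            "time_open": "10:00",
            "time_closed": "16:00",
            "bikes_regular": 40,
            "bikes_oversize": 5,
            "bikes_total": 45,
            "bikes_registered": 3,
        },
    ],
}

print(json.dumps(example_file, indent=2))
```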

The file would have an org_code as well, to act as a double-check. The org is usually inferred from the upload location or similar system characteristics, but it can be checked against this, just in case.
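A minimal sketch of that double-check. The function name `check_org_code` is hypothetical, and the case-insensitive comparison is an assumption, not something decided in this issue.

```python
def check_org_code(data: dict, inferred_org: str) -> None:
    """Raise ValueError if the file's org_code disagrees with the inferred org."""
    declared = data.get("org_code")
    if declared is None:
        raise ValueError("file has no org_code")
    # Assumption: org codes are compared ignoring case and surrounding whitespace.
    if declared.strip().lower() != inferred_org.strip().lower():
        raise ValueError(
            f"org_code {declared!r} does not match inferred org {inferred_org!r}"
        )


# Usage: the inferred org would come from the upload location or similar.
check_org_code({"org_code": "EXAMPLE_ORG", "days": []}, "example_org")  # passes silently
```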

When a converter is used on a file that only has visits (e.g. 2wheel-valet), the converter can work out the summary totals, and can even guess at the open/close times.
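One way a converter might derive those values, as a rough sketch. `summarize_visits` is a hypothetical helper; guessing open/close as the earliest and latest times seen is just one possible heuristic. `bikes_registered` is left out because it cannot be derived from the visit fields.

```python
from collections import Counter


def summarize_visits(visits: list[dict]) -> dict:
    """Derive summary counts and guessed open/close times from visit records."""
    counts = Counter(v["bike_type"].upper() for v in visits)
    summary = {
        "bikes_regular": counts["R"],
        "bikes_oversize": counts["O"],
        "bikes_total": counts["R"] + counts["O"],
    }
    # Collect every check-in time plus any check-out times that were recorded.
    times = [v["time_in"] for v in visits]
    times += [v["time_out"] for v in visits if v.get("time_out")]
    if times:
        # Zero-padded 24h HH:MM strings sort chronologically, so min/max work.
        summary["time_open"] = min(times)
        summary["time_closed"] = max(times)
    return summary


visits = [
    {"time_in": "09:15", "time_out": "12:40", "bike_type": "R"},
    {"time_in": "10:05", "time_out": "17:30", "bike_type": "o"},
    {"time_in": "13:20", "bike_type": "R"},
]
print(summarize_visits(visits))
```

The guessed times are only a lower bound on the real opening hours, so a converter might want to flag them as estimates.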

Could extend this schema to do validations like:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Bike Parking Data Exchange File schema",
  "description": "This sets up the validation schema for github.com/tevpg/bikeparkingdb",
  "version": "1.0",
  "type": "object",
  "properties": {
    "org_code": {
      "type": "string",
      "description": "Code for the organization responsible for collecting the bike parking data."
    },
    "days": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "date": {
            "type": "string",
            "format": "date",
            "description": "Date in YYYY-MM-DD format.",
            "pattern": "^\\d{4}-\\d{2}-\\d{2}$"
          },
          "site_code": {
            "type": "string",
            "description": "Code for the specific bike parking site."
          },
          "time_open": {
            "type": "string",
            "description": "Opening time for the bike parking site in 24h HH:MM format.",
            "pattern": "^(0[0-9]|1[0-9]|2[0-3]):[0-5][0-9]$"
          },
          "time_closed": {
            "type": "string",
            "description": "Closing time for the bike parking site in 24h HH:MM format.",
            "pattern": "^(0[0-9]|1[0-9]|2[0-3]):[0-5][0-9]$"
          },
          "bikes_regular": {
            "type": "integer",
            "description": "Number of regular-sized bikes parked."
          },
          "bikes_oversize": {
            "type": "integer",
            "description": "Number of oversize bikes parked."
          },
          "bikes_total": {
            "type": "integer",
            "description": "Total number of bikes parked."
          },
          "bikes_registered": {
            "type": "integer",
            "description": "Number of registered bikes parked."
          },
          "visits": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "time_in": {
                  "type": "string",
                  "description": "Time the bike entered the parking site in 24h HH:MM format.",
                  "pattern": "^(0[0-9]|1[0-9]|2[0-3]):[0-5][0-9]$"
                },
                "time_out": {
                  "type": "string",
                  "description": "Time the bike left the parking site in 24h HH:MM format.",
                  "pattern": "^(0[0-9]|1[0-9]|2[0-3]):[0-5][0-9]$"
                },
                "bike_type": {
                  "type": "string",
                  "description": "Type of bike, 'R' for regular or 'O' for oversize.",
                  "pattern": "^[RrOo]$"
                },
                "bike_id": {
                  "type": "string",
                  "description": "Optional identifier for the bike (e.g. tag ID)."
                }
              },
              "required": ["time_in", "bike_type"]
            }
          }
        },
        "required": ["date", "site_code", "time_open", "time_closed", "bikes_regular", "bikes_oversize", "bikes_total", "bikes_registered"]
      }
    }
  },
  "required": ["org_code", "days"]
}
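Some checks relate one field to another, which a draft-07 schema cannot express on its own, so they would likely live in code next to the schema. A sketch follows; the three rules shown are assumptions chosen for illustration, not a list taken from this issue, and `check_day` is a hypothetical name.

```python
def check_day(day: dict) -> list[str]:
    """Return a list of cross-field problems found in one day entry."""
    problems = []
    # Assumed rule: the total is the sum of regular and oversize bikes.
    if day["bikes_total"] != day["bikes_regular"] + day["bikes_oversize"]:
        problems.append("bikes_total != bikes_regular + bikes_oversize")
    # Assumed rule: a site opens before it closes (HH:MM strings compare correctly).
    if day["time_open"] >= day["time_closed"]:
        problems.append("time_open is not before time_closed")
    # Assumed rule: a bike cannot leave before it arrived.
    for i, visit in enumerate(day.get("visits", [])):
        time_out = visit.get("time_out")
        if time_out is not None and time_out < visit["time_in"]:
            problems.append(f"visit {i}: time_out is before time_in")
    return problems


day = {
    "date": "2024-06-01", "site_code": "SITE_A",
    "time_open": "09:00", "time_closed": "18:00",
    "bikes_regular": 2, "bikes_oversize": 1, "bikes_total": 4, "bikes_registered": 0,
    "visits": [{"time_in": "10:00", "time_out": "09:30", "bike_type": "R"}],
}
for problem in check_day(day):
    print(problem)
```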
tevpg commented 6 months ago

I wonder if this is necessary. It was meant to be an intermediate file format:

various --> adapters  -->  xchg  --> loader  --> database
clients                    fmt

But I wonder if this makes more sense:

various  --> network  --> client   --> various --> database
clients      fetch        formats      loaders

Because:

It's nice to have that exchange format schema though, as a spec to give [whoever]. And of course, writing a loader for it would be straightforward.
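To give a sense of how small such a loader could be, here is a sketch using SQLite. The table and column layout (`day`, `visit`) is entirely hypothetical and is not the bikeparkingdb schema; the file is assumed to have already passed schema validation.

```python
import sqlite3


def load_exchange_file(conn: sqlite3.Connection, data: dict) -> int:
    """Insert every day (and its visits) from an exchange file; return days loaded."""
    # Hypothetical tables, created on first use.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS day (
               id INTEGER PRIMARY KEY,
               org_code TEXT, site_code TEXT, date TEXT,
               time_open TEXT, time_closed TEXT,
               bikes_regular INTEGER, bikes_oversize INTEGER,
               bikes_total INTEGER, bikes_registered INTEGER,
               UNIQUE (org_code, site_code, date))"""
    )
    conn.execute(
        """CREATE TABLE IF NOT EXISTS visit (
               day_id INTEGER REFERENCES day(id),
               time_in TEXT, time_out TEXT, bike_type TEXT, bike_id TEXT)"""
    )
    loaded = 0
    with conn:  # one transaction: commit on success, roll back on error
        for day in data["days"]:
            cur = conn.execute(
                "INSERT INTO day (org_code, site_code, date, time_open, time_closed,"
                " bikes_regular, bikes_oversize, bikes_total, bikes_registered)"
                " VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
                (data["org_code"], day["site_code"], day["date"],
                 day["time_open"], day["time_closed"],
                 day["bikes_regular"], day["bikes_oversize"],
                 day["bikes_total"], day["bikes_registered"]),
            )
            conn.executemany(
                "INSERT INTO visit VALUES (?, ?, ?, ?, ?)",
                [(cur.lastrowid, v["time_in"], v.get("time_out"),
                  v["bike_type"].upper(), v.get("bike_id"))
                 for v in day.get("visits", [])],
            )
            loaded += 1
    return loaded


conn = sqlite3.connect(":memory:")
sample = {
    "org_code": "EXAMPLE_ORG",
    "days": [{
        "date": "2024-06-01", "site_code": "SITE_A",
        "time_open": "09:00", "time_closed": "18:00",
        "bikes_regular": 1, "bikes_oversize": 1, "bikes_total": 2, "bikes_registered": 0,
        "visits": [
            {"time_in": "09:15", "time_out": "12:40", "bike_type": "R"},
            {"time_in": "10:05", "bike_type": "o"},
        ],
    }],
}
print(load_exchange_file(conn, sample), "day(s) loaded")
```

The `UNIQUE` constraint makes a second upload of the same org, site, and date fail rather than silently duplicate; whether re-uploads should replace existing rows instead is an open design choice.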