open-contracting / infrastructure

Documentation of the Open Contracting for Infrastructure Data Standards (OC4IDS) Toolkit
https://standard.open-contracting.org/infrastructure/

Work items #297

Open duncandewhurst opened 3 years ago

duncandewhurst commented 3 years ago

SIF SOURCE collects data on the work items that make up an infrastructure project and the attributes of each work item, in order to support cost comparisons.

Work items may differ from the items being procured in the contracting processes associated with the project.

SIF SOURCE defines sector-specific lists of work items and attributes. A project can have several work items of the same type, and users can set a title for each work item, along with its attributes. For example, a dam project might have one 'Dam' work item for the primary dam and another 'Dam' work item for the secondary dam.

Here's an example from SOURCE of the work items and attributes for the dam sector:

| Sector | Work Item | Category | Description | Quantity |
| --- | --- | --- | --- | --- |
| DAM | Dam | Type [Concrete (mass); Concrete (Arch); Rockfill; Concrete-face rockfill dam; Earth fill; Other] | length (m) | height (m) |
| DAM | Turbine | Type [Francis; Pelton; Kaplan; Other] | Capacity (MW) | Number |
| DAM | Spillway | Type [Without gates; With segment gates; With sluice gates; Other] | - | Number |
| DAM | Reservoir | Sediment management [Bypassing; Sluicing; Dredging; Flushing; Other] | surface (sqm) | volume (cbm) |
| DAM | Tunnel | Type of lining [Concrete; Steel; Other] | section (sqm) | length (m) |
| DAM | Penstock | Installation [Above ground; Underground] | section (sqm) | elevation (m) |
| DAM | Surge Chamber | Type of lining [Concrete; Steel; Other] | section (sqm) | height (m) |
| DAM | Operational housing | Structure [Concrete; Metallic; Timber; Other] | Capacity (inhabitants) | Gross floor area (sqm) |
| DAM | Other building | Structure [Concrete; Metallic; Timber; Other] | Purpose [Free text] | Gross floor area (sqm) |
| DAM | Access Road | Pavement [Asphalt; Concrete; Unpaved; Other] | width (m) | length (km) |
| DAM | Car Parking | Pavement [Asphalt; Concrete; Other] | Number of car parking spaces | total area (sqm) |
| DAM | Other | Category [Free text] | Unit [Free text] | Quantity |

The semantics of category, description and quantity differ depending on the work item, so I think it's best to consider these all as attributes.
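To illustrate why the three generic columns end up as attributes, here is a minimal Python sketch that turns a SOURCE-style row into named attributes. The `COLUMN_MEANINGS` mapping and the `to_attributes` helper are hypothetical, only two rows of the dam table are covered, the row values are made up for illustration, and the unit is kept as a plain string rather than a full unit object.

```python
# Hypothetical mapping: what SOURCE's generic columns mean for each work item
# type. Names and units are taken from the dam table; only two rows are shown.
COLUMN_MEANINGS = {
    "Reservoir": {
        "category": ("Sediment management", None),
        "description": ("surface", "sqm"),
        "quantity": ("volume", "cbm"),
    },
    "Car Parking": {
        "category": ("Pavement", None),
        "description": ("Number of car parking spaces", None),
        "quantity": ("total area", "sqm"),
    },
}


def to_attributes(work_item_type, row):
    """Turn a SOURCE row (category, description, quantity) into named attributes."""
    attributes = []
    for column, (name, unit) in COLUMN_MEANINGS[work_item_type].items():
        attribute = {"name": name, "value": row[column]}
        if unit is not None:
            attribute["unit"] = unit  # simplified: a string, not a unit object
        attributes.append(attribute)
    return attributes


# Illustrative rows; the values are invented for this sketch.
car_park = to_attributes(
    "Car Parking", {"category": "Asphalt", "description": 10, "quantity": 150}
)
reservoir = to_attributes(
    "Reservoir", {"category": "Flushing", "description": 20000, "quantity": 90000}
)
```

The same column yields a differently named attribute for each work item type, which is the reason a fixed `category` or `description` field would carry little meaning on its own.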

Reusing the modeling from https://github.com/open-contracting/standard/issues/751 would look like this:

{
  "projects": [
    {
      "workItems": [
        {
          "id": "1",
          "name": "Dam",
          "title": "Primary Dam",
          "attributes": [
            {
              "id": "1",
              "name": "Type",
              "value": "Concrete (mass)"
            },
            {
              "id": "2",
              "name": "length",
              "value": "500",
              "unit": {
                "name": "Meters",
                "id": "MTR",
                "scheme": "UNCEFACT"
              }
            },
            {
              "id": "3",
              "name": "height",
              "value": "30",
              "unit": {
                "name": "Meters",
                "id": "MTR",
                "scheme": "UNCEFACT"
              }
            }
          ]
        },
        {
          "id": "2",
          "name": "Dam",
          "title": "Secondary Dam",
          "attributes": [
            {
              "id": "1",
              "name": "Type",
              "value": "Earth fill"
            },
            {
              "id": "2",
              "name": "length",
              "value": "100",
              "unit": {
                "name": "Meters",
                "id": "MTR",
                "scheme": "UNCEFACT"
              }
            },
            {
              "id": "3",
              "name": "height",
              "value": "20",
              "unit": {
                "name": "Meters",
                "id": "MTR",
                "scheme": "UNCEFACT"
              }
            }
          ]
        },
        {
          "id": "3",
          "name": "Car Parking",
          "title": "Primary car park",
          "attributes": [
            {
              "id": "1",
              "name": "Pavement",
              "value": "Asphalt"
            },
            {
              "id": "2",
              "name": "Number of car parking spaces",
              "value": "10"
            },
            {
              "id": "3",
              "name": "Total area",
              "value": "150",
              "unit": {
                "name": "Square metre",
                "id": "MTK",
                "scheme": "UNCEFACT"
              }
            }
          ]
        }
      ]
    }
  ]
}

Discussion

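@jpmckinney - please could you share your thoughts on the overall approach and the following:

- Should we recommend that SOURCE nest workItems under a SIFSOURCE object to avoid clashes if we find different approaches to modeling work items in other jurisdictions?
- Is there an issue with pairing a string value field with the unit object?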

jpmckinney commented 3 years ago

I think instead of numeric IDs we should recommend human-readable IDs, e.g. a lowercase, abbreviated version of the name.

> The semantics of category, description and quantity differ depending on the work item, so I think it's best to consider these all as attributes.

Can you share some examples of the different semantics? I want to be sure we have no choice but to use attributes, whose downside is that it is fairly semantics-free from the schema's perspective.

> Should we recommend that SOURCE nest workItems under a SIFSOURCE object to avoid clashes if we find different approaches to modeling work items in other jurisdictions?

I don't recommend it. If we end up using the same model later, we have an unnecessary difference. If we use a different model, SIFSOURCE will be a special case either way: workItems will need to be processed differently whether nested or not.

> Is there an issue with pairing a string value field with the unit object?

I don't see an issue. What was your concern?

jpmckinney commented 3 years ago

Oh, we can also allow either strings or numbers in value.

duncandewhurst commented 3 years ago

> I think instead of numeric IDs we should recommend human-readable IDs, e.g. a lowercase, abbreviated version of the name.

How would that work for the example in which there are two dams?

>> The semantics of category, description and quantity differ depending on the work item, so I think it's best to consider these all as attributes.

> Can you share some examples of the different semantics? I want to be sure we have no choice but to use attributes, whose downside is that it is fairly semantics-free from the schema's perspective.

If the work item is a reservoir, then the category is the type of sediment management, but if the work item is a car park, then the category is the type of pavement.

If the work item is a turbine, then the description is its capacity in megawatts, but if the work item is an access road, then the description is its width in meters.

Users can also add an 'other' work item and enter a free-text category.

>> Should we recommend that SOURCE nest workItems under a SIFSOURCE object to avoid clashes if we find different approaches to modeling work items in other jurisdictions?

> I don't recommend it. If we end up using the same model later, we have an unnecessary difference. If we use a different model, SIFSOURCE will be a special case either way: workItems will need to be processed differently whether nested or not.

👍

>> Is there an issue with pairing a string value field with the unit object?

> I don't see an issue. What was your concern?

> Oh, we can also allow either strings or numbers in value.

Great, my concern was just that elsewhere unit is paired with a number field and I couldn't think of any other fields which use a string to represent a (strictly) numeric value. I also remembered concerns about allowing id fields to be either strings or numbers.

jpmckinney commented 3 years ago

>> I think instead of numeric IDs we should recommend human-readable IDs, e.g. a lowercase, abbreviated version of the name.

> How would that work for the example in which there are two dams?

Sorry - I meant only for the attributes.
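One practical effect of human-readable attribute IDs is that the same attribute can be looked up by ID in any work item, while the work items themselves keep distinct IDs. A minimal Python sketch, assuming attribute IDs such as `type` and `length` (illustrative choices) and a hypothetical `get_attribute` helper:

```python
def get_attribute(work_item, attribute_id):
    """Return the attribute with the given id, or None if it is absent."""
    for attribute in work_item.get("attributes", []):
        if attribute.get("id") == attribute_id:
            return attribute
    return None


# Two work items of the same type: work item IDs stay distinct ("1", "2"),
# while attribute IDs repeat because they describe the same kind of property.
primary_dam = {
    "id": "1",
    "name": "Dam",
    "title": "Primary Dam",
    "attributes": [
        {"id": "type", "name": "Type", "value": "Concrete (mass)"},
        {"id": "length", "name": "length", "value": 500},
    ],
}
secondary_dam = {
    "id": "2",
    "name": "Dam",
    "title": "Secondary Dam",
    "attributes": [
        {"id": "type", "name": "Type", "value": "Earth fill"},
        {"id": "length", "name": "length", "value": 100},
    ],
}

dam_types = [get_attribute(dam, "type")["value"] for dam in (primary_dam, secondary_dam)]
missing = get_attribute(primary_dam, "pavement")  # no such attribute on a dam
```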

>>> The semantics of category, description and quantity differ depending on the work item, so I think it's best to consider these all as attributes.

>> Can you share some examples of the different semantics? I want to be sure we have no choice but to use attributes, whose downside is that it is fairly semantics-free from the schema's perspective.

> If the work item is a reservoir, then the category is the type of sediment management, but if the work item is a car park, then the category is the type of pavement.

> If the work item is a turbine, then the description is its capacity in megawatts, but if the work item is an access road, then the description is its width in meters.

> Users can also add an 'other' work item and enter a free-text category.

Hmm, we might find that some attributes are common enough to promote as individual fields, but for now we can leave them under attributes until we aggregate more demand / collect more evidence.

>>> Should we recommend that SOURCE nest workItems under a SIFSOURCE object to avoid clashes if we find different approaches to modeling work items in other jurisdictions?

>> I don't recommend it. If we end up using the same model later, we have an unnecessary difference. If we use a different model, SIFSOURCE will be a special case either way: workItems will need to be processed differently whether nested or not.

> 👍

Can you update the examples you shared to not embed any fields under SIFSOURCE? For the same reasons we no longer use X- prefixes, we should not have publisher-specific nesting.

> I also remembered concerns about allowing id fields to be either strings or numbers.

IDs need to be compared (e.g. organization reference ID to party ID), in which case a simple == comparison will fail if the types are different. Numeric IDs don't lose anything by becoming strings.

A number that represents a length, etc., however, would lose something as a string: no arithmetic, no numeric comparison, possibility of invalid numbers, etc.
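The two points above can be shown with plain Python values. This is only an illustration of the type behaviour being described; the variable names and the `is_valid_number` helper are made up for the sketch.

```python
# IDs: a reference and its target must have the same type to match.
party_id = "1"       # ID stored as a string
reference_id = 1     # the same ID stored as a number
ids_match = party_id == reference_id                   # types differ
ids_match_as_strings = party_id == str(reference_id)   # both strings

# Measurements: strings break arithmetic and ordering.
lengths_as_strings = ["500", "100"]
lengths_as_numbers = [500, 100]

concatenated = lengths_as_strings[0] + lengths_as_strings[1]  # joins the text
total = sum(lengths_as_numbers)                               # adds the numbers

# String comparison is lexicographic, so "100" sorts before "20".
string_order = "100" < "20"
numeric_order = 100 < 20


def is_valid_number(text):
    """Report whether a string can be parsed as a number at all."""
    try:
        float(text)
    except ValueError:
        return False
    return True
```

Turning a numeric ID into a string costs nothing, because IDs are only ever compared for equality. A length stored as a string has to be parsed, and possibly rejected, before it can be used.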

duncandewhurst commented 3 years ago

> Can you update the examples you shared to not embed any fields under SIFSOURCE? For the same reasons we no longer use X- prefixes, we should not have publisher-specific nesting.

Done!

duncandewhurst commented 3 years ago

> A number that represents a length, etc., however, would lose something as a string: no arithmetic, no numeric comparison, possibility of invalid numbers, etc.

Yes, that is my concern with the approach in the item attributes extension, which uses a string value field.

The value of an attribute in the proposed model can be either a string or a number, so let's allow a string or a number.
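As a rough sketch of what "a string or a number" means for a consumer of the data, the check below accepts both types and rejects everything else. The schema itself is not shown in this thread, so the `is_valid_value` helper is an assumption made for illustration; in JSON Schema terms the equivalent would be a `type` of `["string", "number"]`.

```python
def is_valid_value(value):
    """Accept strings and numbers; reject booleans, None and containers."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return False
    return isinstance(value, (str, int, float))


checks = {
    "text": is_valid_value("Asphalt"),
    "integer": is_valid_value(150),
    "decimal": is_valid_value(2.5),
    "boolean": is_valid_value(True),
    "null": is_valid_value(None),
    "list": is_valid_value([150]),
}
```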

duncandewhurst commented 3 years ago

Here's an updated example, with the changes discussed above:

{
  "projects": [
    {
      "workItems": [
        {
          "id": "1",
          "name": "Dam",
          "title": "Primary Dam",
          "attributes": [
            {
              "id": "type",
              "name": "Type",
              "value": "Concrete (mass)"
            },
            {
              "id": "length",
              "name": "length",
              "value": 500,
              "unit": {
                "name": "Meters",
                "id": "MTR",
                "scheme": "UNCEFACT"
              }
            },
            {
              "id": "height",
              "name": "height",
              "value": 30,
              "unit": {
                "name": "Meters",
                "id": "MTR",
                "scheme": "UNCEFACT"
              }
            }
          ]
        },
        {
          "id": "2",
          "name": "Dam",
          "title": "Secondary Dam",
          "attributes": [
            {
              "id": "type",
              "name": "Type",
              "value": "Earth fill"
            },
            {
              "id": "length",
              "name": "length",
              "value": 100,
              "unit": {
                "name": "Meters",
                "id": "MTR",
                "scheme": "UNCEFACT"
              }
            },
            {
              "id": "height",
              "name": "height",
              "value": 20,
              "unit": {
                "name": "Meters",
                "id": "MTR",
                "scheme": "UNCEFACT"
              }
            }
          ]
        },
        {
          "id": "3",
          "name": "Car Parking",
          "title": "Primary car park",
          "attributes": [
            {
              "id": "pavement",
              "name": "Pavement",
              "value": "Asphalt"
            },
            {
              "id": "parkingSpaces",
              "name": "Number of car parking spaces",
              "value": 10
            },
            {
              "id": "area",
              "name": "Total area",
              "value": 150,
              "unit": {
                "name": "Square metre",
                "id": "MTK",
                "scheme": "UNCEFACT"
              }
            }
          ]
        }
      ]
    }
  ]
}
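With numeric values and shared attribute IDs, work items of the same type can be compared directly, which is the cost-comparison use case this issue started from. The sketch below works on a trimmed copy of the example above (unit objects and some attributes omitted); the `attribute_values` helper is hypothetical.

```python
import json

# Trimmed copy of the example: two dams and one car park.
DATA = json.loads("""
{
  "projects": [
    {
      "workItems": [
        {"id": "1", "name": "Dam", "title": "Primary Dam",
         "attributes": [{"id": "length", "name": "length", "value": 500},
                        {"id": "height", "name": "height", "value": 30}]},
        {"id": "2", "name": "Dam", "title": "Secondary Dam",
         "attributes": [{"id": "length", "name": "length", "value": 100},
                        {"id": "height", "name": "height", "value": 20}]},
        {"id": "3", "name": "Car Parking", "title": "Primary car park",
         "attributes": [{"id": "area", "name": "Total area", "value": 150}]}
      ]
    }
  ]
}
""")


def attribute_values(project, work_item_name, attribute_id):
    """Map each matching work item's title to the value of one attribute."""
    values = {}
    for work_item in project.get("workItems", []):
        if work_item.get("name") != work_item_name:
            continue
        for attribute in work_item.get("attributes", []):
            if attribute.get("id") == attribute_id:
                values[work_item["title"]] = attribute["value"]
    return values


project = DATA["projects"][0]
dam_lengths = attribute_values(project, "Dam", "length")
total_dam_length = sum(dam_lengths.values())
```

Because `value` is a JSON number, `sum` works without any parsing step; with the earlier string values, each one would first need converting and validating.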