DigitalCommons / owned-by-oxford-project

Owned by Oxford

Start supporting new fields from Airtable #12

Closed ColmMassey closed 2 years ago

ColmMassey commented 2 years ago

OBO team will add structure, activities and relationships fields to the OBO Organisations.

For structure they will use Economic Activities/Sectors (Modified). For activities they will use Organisational Structure.

They are exploring whether to keep the existing options available under the Type of Relationship field, which would be local, or to refine the field so that it could be used generically for CWB relationships.

ColmMassey commented 2 years ago

If we can generate these lists as an output from https://github.com/SolidarityEconomyAssociation/map-sse/issues/2, they can be loaded directly into Airtable and used as dropdown options for entries in the respective field.

wu-lee commented 2 years ago

Researching a way to do this... Airtable is annoying. They have really low limits on the number of items which can be inserted into (10) or exported from (100) a table via the API. This makes something which could be relatively simple, complicated.

I'll mention here again, we already suffer from this limit in the sausage-factory download, which would need some extra coding to deal with this API's foibles. Consequently, once the organisations table goes over this limit, the new organisations will not show up on the map, which is bad. There will be no error or warning when this happens, which makes it worse.
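As a rough illustration of working within the insert limit, here is a minimal Python sketch that splits a list of vocab terms into batches of at most 10, so that each batch could be sent as one insert request. The constant name, the `chunked` helper and the sample terms are all hypothetical; no actual Airtable call is made.

```python
from typing import Iterator, List, TypeVar

T = TypeVar("T")

# Per-request insert limit mentioned above (assumed to still apply).
MAX_RECORDS_PER_INSERT = 10


def chunked(items: List[T], size: int = MAX_RECORDS_PER_INSERT) -> Iterator[List[T]]:
    """Yield successive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical vocab terms; each batch would become one insert request.
terms = [f"term-{n}" for n in range(23)]
batches = list(chunked(terms))
```

With 23 terms this yields three batches, the last one holding the remainder.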
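The extra coding for the download might look something like the sketch below. It assumes the list response carries an `offset` token whenever more records remain, and that passing the token back returns the next page. `fetch_page` is a hypothetical callable standing in for one HTTP request; the fake version here only simulates a 250-record table so the loop can be exercised without network access.

```python
from typing import Callable, Dict, List, Optional


def fetch_all_records(fetch_page: Callable[[Optional[str]], Dict]) -> List[dict]:
    """Collect records from every page, not just the first one.

    `fetch_page(offset)` is a hypothetical callable that performs one
    list request and returns the decoded JSON body as a dict.
    """
    records: List[dict] = []
    offset: Optional[str] = None
    while True:
        page = fetch_page(offset)
        records.extend(page.get("records", []))
        offset = page.get("offset")
        if not offset:  # no token means this was the last page
            return records


# Simulated table of 250 records, served 100 at a time.
ALL_RECORDS = [{"id": f"rec{n}"} for n in range(250)]


def fake_fetch_page(offset: Optional[str]) -> Dict:
    start = int(offset or 0)
    page: Dict = {"records": ALL_RECORDS[start:start + 100]}
    if start + 100 < len(ALL_RECORDS):
        page["offset"] = str(start + 100)
    return page


everything = fetch_all_records(fake_fetch_page)
```

A download that stops after the first call would see only 100 of these records, which is the silent truncation described above.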

wu-lee commented 2 years ago

None of the options here really work that well for our case: https://community.airtable.com/t/use-an-external-database-or-service-as-a-datasource/44156

As such I expect the simplest thing, if we don't want to invest too much time coding around this problem, is to manually import the vocab terms as tables and instruct the ObO users never to change them.

This then leaves a problem that references to other tables in the exported JSON data look like the "People" field in this fragment from the sausage factory download:

    {
      "id": "rec571TGQMf82V8C9",
      "fields": {
        "People": [
          "recjQ68eaWXPmxBDm"
        ],
        "Organisation Name": "TnT National Funder",
        "Activities and events": [
          "recEDFvOwgWDWr7M0"
        ],
        "Type of Relationship": [
          "Community Activist or Innovator"
        ],
        "Engagement": "engaged",
        "TnT Funding": [
          "reccs7bnC2yWzDeev",
          "recGxWuDdiK8Gxwdp"
        ],
        "Surname (from People) 2": [
          "Peters"
        ],
        "First Name (from People)": [
          "TnT Gary "
        ],
        "Consent (from People)": [
          "reccBctXSmj5TlOda"
        ]
      },
      "createdTime": "2020-11-13T09:59:38.000Z"
    },

We would have to implement some sort of transformation back into vocab URIs, or maybe human-readable strings, from the arbitrary IDs invented by Airtable, in this case recjQ68eaWXPmxBDm.

These IDs may change on each import. If we did have some mapping defined in the sea-map config as I proposed, that would need updating on each change, or things may silently start going awry. This would be awkward, and it might be more reliable instead to code something to get this from Airtable on sea-map start-up, as we do for the RDF vocabs. But this again is more custom coding.
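To make the transformation concrete, here is a minimal Python sketch of the start-up approach: build a lookup from the linked table's records, then resolve the IDs in an organisation record. The `Name` field, the "Example term" label and both function names are hypothetical; only the record IDs come from the fragment above. Unknown IDs raise an error rather than being dropped, so a stale lookup is noticed.

```python
from typing import Dict, List


def build_id_lookup(linked_records: List[dict], label_field: str = "Name") -> Dict[str, str]:
    """Map each Airtable record ID to a human-readable label (or a URI)."""
    return {rec["id"]: rec["fields"][label_field] for rec in linked_records}


def resolve_links(record: dict, field: str, lookup: Dict[str, str]) -> List[str]:
    """Replace the record IDs stored in `field` with their labels."""
    resolved = []
    for rec_id in record["fields"].get(field, []):
        if rec_id not in lookup:
            # Fail loudly instead of silently going awry.
            raise KeyError(f"unknown Airtable record ID {rec_id!r} in field {field!r}")
        resolved.append(lookup[rec_id])
    return resolved


# Hypothetical contents of the linked table, as fetched at start-up.
linked_table = [{"id": "recjQ68eaWXPmxBDm", "fields": {"Name": "Example term"}}]
lookup = build_id_lookup(linked_table)

organisation = {"id": "rec571TGQMf82V8C9", "fields": {"People": ["recjQ68eaWXPmxBDm"]}}
labels = resolve_links(organisation, "People", lookup)
```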

wu-lee commented 2 years ago

Reflecting on this ID problem, I notice that the "Type of Relationship" field is represented in the data as a value, not an identifier... It is a (multiple) select field - not a link to another table, as for "People". There's a single select type, too.

This would make things a bit easier in that, having entered the selection list (manually?) we can then rely on the values coming back in the data matching up with any configured index in sea-map. Or we could choose to just show them as raw string values.

So I think we should do that: import the vocab lists manually once, add code in sea-map to allow configuration of an index to the URIs corresponding to these values, and then hope we never have any changes to the vocab. (I don't see a way to change a field's definition in the API, so it would have to be manually changed.)
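A configured index of this kind could be as simple as the Python sketch below. The URI is a placeholder under example.org, not a real vocab term, and the function name is hypothetical. Values with no entry in the index are returned separately so that a changed selection list can be reported instead of passing unnoticed.

```python
from typing import Dict, List, Tuple

# Hypothetical index from select-field values to vocab URIs.
RELATIONSHIP_INDEX: Dict[str, str] = {
    "Community Activist or Innovator": "https://example.org/vocab/relationship/activist",
}


def map_values_to_uris(values: List[str], index: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (uris, unknown): URIs for recognised values, plus any values
    that have no entry in the index."""
    uris: List[str] = []
    unknown: List[str] = []
    for value in values:
        key = value.strip()  # tolerate stray whitespace in the data
        if key in index:
            uris.append(index[key])
        else:
            unknown.append(value)
    return uris, unknown


uris, unknown = map_values_to_uris(
    ["Community Activist or Innovator", "Some renamed option"],
    RELATIONSHIP_INDEX,
)
```

Here `unknown` ends up holding "Some renamed option", which the map could log or show as a raw string.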

ColmMassey commented 2 years ago

> As such I expect the simplest thing, if we don't want to invest too much time coding around this problem, is to manually import the vocab terms as tables and instruct the ObO users never to change them.

I think we need to go with this. They need to know that this is from a standard vocabulary, so they can't mess with it, but request changes via us.

ColmMassey commented 2 years ago

The popup seems to be interpreting Organisation Structure as Legal Form Modified?

ColmMassey commented 2 years ago

I note that in the new Internationalisation spreadsheet, the EN description of Organisational Structure is "Alternative SSE Initiative or SSE Network Legal Form controlled vocabulary for SSE", which is wrong, but obviously not the cause of the dialog bug above. I can't find any occurrence of Legal Form Modified in the spreadsheet. It must be hard-coded in the dialog, or the dialog is looking somewhere else at an older, incorrect version of the vocabulary.

wu-lee commented 2 years ago

> The popup seems to be interpreting Organisation Structure as Legal Form Modified?

I think that's because that's what the vocab's title is defined to be in the version the dataset is referencing (the older V2a vocabs):

https://github.com/SolidarityEconomyAssociation/map-sse/blob/5e044bf388f40b9109ebff4a9b7ea3b3df649643/vocabs/standard/organisational-structure.skos#L13

The dataset should be migrated to the new vocabs, and the new vocabs fixed too.

ColmMassey commented 2 years ago

In the interim, can we correct that misnaming? Legal Form Modified is completely wrong and should be Organisational Structure.

ColmMassey commented 2 years ago

> The dataset should be migrated to the new vocabs, and the new vocabs fixed too.

How long should this take?

wu-lee commented 2 years ago

Looking at this now. The tricky thing is that on a map which (at least in the private version) supports so many different datasets which use varying vocabularies and versions thereof, it's hard to see how to define some common category to index them by - as that requires a common vocabulary (and a common version thereof).

Right now we have a mix, and some of those (coops-uk) won't be quick to migrate - relates to https://github.com/SolidarityEconomyAssociation/technology-and-infrastructure/issues/65:

    $ perl -nE 'say "$ARGV:\t$1" if /PREFIX essglobal: (.*)/' config/*/query.rq
    config/coops-uk/query.rq:           <https://w3id.solidarityeconomy.coop/essglobal/V2a/vocab/>
    config/covid-mutual-aid/query.rq:   <https://dev.lod.coop/essglobal/2.1/vocab/>
    config/dotcoop/query.rq:            <https://w3id.solidarityeconomy.coop/essglobal/V2a/vocab/>
    config/good-food-oxford/query.rq:   <https://dev.lod.coop/essglobal/2.1/vocab/>
    config/owned-by-oxford/query.rq:    <https://w3id.solidarityeconomy.coop/essglobal/V2a/vocab/>
    config/oxford/query.rq:             <https://w3id.solidarityeconomy.coop/essglobal/V2a/vocab/>

I can however migrate the ObO dataset, so that the public map is free of the problem.

[edit] The private ObO map avoids this by indexing by short postcode. The public one can avoid it when indexed by activity, by only having one dataset - ObO. This means the version of the vocab can be changed without breaking any of the other datasets' indexing.
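The check for whether a set of datasets can share one index can be sketched in Python as follows. The function groups query texts by the `essglobal` prefix they declare; more than one group means the datasets use different vocab versions. The function name and the in-memory `queries` dict are hypothetical stand-ins for reading the `query.rq` files, and the two sample entries reuse prefixes from the listing above.

```python
import re
from collections import defaultdict
from typing import Dict, List

PREFIX_RE = re.compile(r"PREFIX\s+essglobal:\s*<([^>]+)>")


def group_by_vocab(queries: Dict[str, str]) -> Dict[str, List[str]]:
    """Group dataset names by the essglobal vocab URI their query declares."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, text in queries.items():
        match = PREFIX_RE.search(text)
        if match:
            groups[match.group(1)].append(name)
    return dict(groups)


# Stand-ins for the contents of two config/*/query.rq files.
queries = {
    "coops-uk": "PREFIX essglobal: <https://w3id.solidarityeconomy.coop/essglobal/V2a/vocab/>",
    "good-food-oxford": "PREFIX essglobal: <https://dev.lod.coop/essglobal/2.1/vocab/>",
}
groups = group_by_vocab(queries)
mixed_versions = len(groups) > 1
```

A map with a single dataset, such as the public ObO one, always yields one group, so indexing by activity is safe there.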

wu-lee commented 2 years ago

Ok, now done. This required switching the version of ESSGLOBAL in the obo-public site, the owned-by-oxford (private) site, and the owned-by-oxford data generation.

ColmMassey commented 2 years ago

I think that is safe to close.