open-contracting-archive / extensions-data-collector

Superseded by open-contracting/extension_registry.py
https://github.com/open-contracting/extension_registry.py
BSD 3-Clause "New" or "Revised" License

Add Schemas, docs, codelists, readme to output #15

Closed odscjames closed 6 years ago

odscjames commented 6 years ago

Some of the new data structures have a bit of spare room (e.g. an extra dict), to make it easy to add extra keys later if/when we realise we need them. It's a machine-readable file with plenty of space, so why not?

Double check where the "en" language tags are. Most of the time they sit very close to the content that will be translated. This assumes that if we have a piece of content in one language, we will want matching content in another language.

eg.

The one thing I haven't added language tags to yet is codelists - I wanted to check something here.

Am I right in thinking that every codelist should have at least a "Code" column, and that every entry in that column should be unique? And that there can be any number of other columns, depending on what it's being used for?

And that if we have a codelist in English with 3 items (codes "A", "B" and "C"), then we should have a codelist in Spanish also with 3 items (codes "A", "B" and "C")? i.e. if the English one had 3 items and the Spanish one 4 items, that's wrong?

And the codes aren't translated, just the other fields (e.g. Title, Description, etc.)?

So marking this WIP but @kindly if you want to look over now and comment please do!

jpmckinney commented 6 years ago

There's a JSON Schema for codelists, which is used in this test.

In brief: Code, Title, and Description are required. Description is, exceptionally, not required for a couple of codelists. And a "minus" codelist like -partyRole.csv only has Code. Codes are unique. There can be other columns, but the schema needs to be updated if unprecedented columns are added.
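
To make those rules concrete, here is a minimal Python sketch of the checks. It is a stand-in for illustration, not the JSON Schema or the test mentioned above. The function name `check_codelist`, the `description_optional` flag, and detecting a "minus" codelist by a leading `-` in the filename are all assumptions.

```python
import csv
import io

REQUIRED_COLUMNS = ("Code", "Title", "Description")


def check_codelist(filename, text, description_optional=False):
    """Return a list of problems found in one codelist CSV (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []

    # Assumption: a "minus" codelist is recognised by a leading "-" in its
    # filename (e.g. -partyRole.csv) and only needs a Code column.
    if filename.startswith("-"):
        required = ("Code",)
    elif description_optional:
        # The few codelists for which Description is not required.
        required = ("Code", "Title")
    else:
        required = REQUIRED_COLUMNS

    problems = [f"missing column: {c}" for c in required if c not in columns]
    if "Code" not in columns:
        # Uniqueness cannot be checked without a Code column.
        return problems

    seen = set()
    for row in reader:
        code = row["Code"]
        if code in seen:
            problems.append(f"duplicate code: {code}")
        seen.add(code)
    return problems
```

Calling `check_codelist("x.csv", "Code,Title,Description\nA,Alpha,First\n")` returns an empty list. Extra columns are allowed here without complaint, whereas the real schema would need updating before unprecedented columns are accepted.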

Only Title, Description and Extension (only present in the consolidated extension of a profile) are translated.

The way translation is implemented, it would be impossible for a translated version of a codelist to have more/fewer/different codes.
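
As a rough illustration of why the codes cannot diverge: if a translated codelist is always derived row by row from the source codelist, and only the translatable columns are ever replaced, the set of codes is fixed by construction. The sketch below assumes a plain dict mapping source strings to translated strings. That `translations` dict and the `translate_codelist` name are hypothetical, not the project's actual mechanism.

```python
TRANSLATABLE_COLUMNS = ("Title", "Description", "Extension")


def translate_codelist(rows, translations):
    """Return a translated copy of rows; only translatable columns change."""
    translated = []
    for row in rows:
        new_row = dict(row)  # copy, so the source codelist is left untouched
        for column in TRANSLATABLE_COLUMNS:
            if column in new_row:
                # Fall back to the source text when no translation exists.
                new_row[column] = translations.get(new_row[column], new_row[column])
        translated.append(new_row)
    return translated
```

Because the output has exactly one row per source row and the Code column is never touched, a Spanish codelist built this way always has the same codes as the English one.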

odscjames commented 6 years ago

@jpmckinney Thanks.

Will push one more commit then mark this ready for @kindly to review.