Open CecSve opened 1 month ago
We can add tags to vocabularies (right now they are for concepts only). The tags also allow choosing a color, if that helps.
Thanks, that might be a good option. But should we then have a vocabulary for vocabulary status?
If we want to have a more controlled way to handle the status maybe it's not a good idea to use tags.
For the `deprecated` status, the API already supports deprecating a vocabulary. Deprecated vocabularies can be listed like this:
https://api.gbif.org/v1/vocabularies?deprecated=true
And we have these fields to give info about the deprecation:
replacedByKey;
deprecated;
deprecatedBy;
We have an example in dev, although it wasn't replaced by any other vocab (therefore the `replacedByKey` is not shown):
https://api.gbif-dev.org/v1/vocabularies?deprecated=true
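For anyone scripting against this, here is a rough sketch of building the `deprecated=true` listing URL and pulling the three deprecation fields out of one vocabulary record. The helper names and the sample record are made up for illustration; only the URL and the field names come from the comment above, and the record shape is an assumption.

```python
from urllib.parse import urlencode

API_BASE = "https://api.gbif.org/v1"  # base URL taken from the links above


def deprecated_vocabularies_url(base: str = API_BASE) -> str:
    """Build the listing URL with the deprecated=true filter."""
    return f"{base}/vocabularies?{urlencode({'deprecated': 'true'})}"


def deprecation_info(vocabulary: dict) -> dict:
    """Pick the three deprecation fields from one vocabulary record.

    A missing key (e.g. replacedByKey when there is no replacement)
    is reported as None instead of raising.
    """
    fields = ("replacedByKey", "deprecated", "deprecatedBy")
    return {field: vocabulary.get(field) for field in fields}


# Hypothetical record (assumed shape): deprecated, but with no replacement.
sample = {
    "name": "ExampleVocabulary",
    "deprecated": "2024-01-01T00:00:00.000+00:00",
    "deprecatedBy": "example-user",
}

print(deprecated_vocabularies_url())
print(deprecation_info(sample))
```

Using `.get` means a record without `replacedByKey` is handled the same way as one where it is present, which matches the dev example where the key is simply not shown.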
Also, the `in use` status might mean different things for different vocabularies. For example, some vocabularies are used in the pipelines data interpretation, others are used only in the registry (like the grscicoll ones), and others might not be used in any system at all.
Another important thing is that in the pipelines data interpretation we use the latest released version of a vocabulary. This means that changes that haven't been released yet aren't being used (even though the vocabulary will be in `in use` status). The API allows listing the vocabularies that have unreleased changes, although we can't see what the changes are:
https://api.gbif.org/v1/vocabularies?hasUnreleasedChanges=true
It also allows querying the latest release of a vocabulary:
https://api.gbif.org/v1/vocabularies/LifeStage/concepts/latestRelease
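As a small sketch, the two URLs above could be built like this. The function names are hypothetical; the paths and the `hasUnreleasedChanges` parameter are copied from the links in this comment.

```python
from urllib.parse import quote

API_BASE = "https://api.gbif.org/v1"  # base URL taken from the links above


def unreleased_changes_url(base: str = API_BASE) -> str:
    """URL that lists vocabularies with changes not yet released."""
    return f"{base}/vocabularies?hasUnreleasedChanges=true"


def latest_release_concepts_url(vocabulary_name: str, base: str = API_BASE) -> str:
    """URL for the concepts of the latest released version of one vocabulary."""
    # quote() escapes characters that are not safe inside a URL path segment.
    name = quote(vocabulary_name, safe="")
    return f"{base}/vocabularies/{name}/concepts/latestRelease"


print(unreleased_changes_url())
print(latest_release_concepts_url("LifeStage"))
```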
So instead of tags we can add these fields to the vocabulary to be more explicit about the status:
"usage": "pipelines data interpretation", // this can be an enumeration
"status": "in use" // we'll define some status so it's not a free-text field
`usage` should be set when the vocabulary is created, and `status` should be updated manually, although some values can be set automatically. For example, a vocabulary is not being used if it hasn't been released at least once.
We can also add these read-only fields if it helps:
"released": "true",
"hasUnreleasedChanges": "false"
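To make the proposal concrete, here is a minimal sketch of how these fields could be modelled, including the automatic rule that a vocabulary that was never released cannot be in use. This is only an illustration: the enum values, class name, and method name are assumptions, not the actual vocabulary server model.

```python
from dataclasses import dataclass
from enum import Enum


class Usage(Enum):
    # Hypothetical values, based on the usages mentioned in this thread.
    PIPELINES = "pipelines data interpretation"
    REGISTRY = "registry"
    NONE = "not used"


class Status(Enum):
    # Hypothetical values; the thread only names "in use" so far.
    IN_USE = "in use"
    NOT_IN_USE = "not in use"


@dataclass
class VocabularyStatus:
    usage: Usage                  # set when the vocabulary is created
    status: Status                # updated manually
    released: bool                # read-only, derived by the server
    has_unreleased_changes: bool  # read-only, derived by the server

    def effective_status(self) -> Status:
        """Apply the automatic rule: never released means not in use."""
        if not self.released:
            return Status.NOT_IN_USE
        return self.status


draft = VocabularyStatus(Usage.PIPELINES, Status.IN_USE,
                         released=False, has_unreleased_changes=True)
print(draft.effective_status().value)
```

Keeping the manual `status` separate from the derived value means a wrong manual entry on an unreleased vocabulary is overridden instead of being shown to users.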
An improvement would be to also show what the unreleased changes are for the vocabularies that have them. We talked briefly about this here: https://github.com/gbif/vocabulary/issues/132
We'll also have to show this properly in the UI.
What do you think @CecSve ?
EDIT: `status` could also be a boolean called `active`, since we already have a status for deprecated.
Thanks, that makes sense. So would the values be controlled and documented for all the fields (`usage`, `status`, `released`, `hasUnreleasedChanges`)? Why would `status` have boolean values, though? I would assume we would have at least three different values: `active`, `not active` and `deprecated`.
> Why would status have boolean values, though? I would assume we would have at least three different values: active, not active and deprecated.

Because we already have a field for deprecated, so there are only 2 possible statuses.
Ok, so `status` would be `in use`, `released` = `true`, and `hasUnreleasedChanges` = `true` for newly released vocabularies that are not yet used in production?
> Ok, so status would be in use, released = true, and hasUnreleasedChanges = true for newly released vocabularies that are not yet used in production?
Nope, I imagined it like this (renaming `status` to `active` so it's a boolean):
When it's first released but not used in production yet:
active: false
released: true
hasUnreleasedChanges: false
deprecated: null
When it's used in production:
active: true
released: true
hasUnreleasedChanges: false
deprecated: null
If at some point after the first release someone makes changes to the vocab but doesn't release them, the `hasUnreleasedChanges` flag changes:
active: true
released: true
hasUnreleasedChanges: true
deprecated: null
If we deprecate the vocab:
active: false
released: true
hasUnreleasedChanges: false
deprecated: 020-03-31T12:41:10.914+00:00 // we use the date as we do in the deleted field of other entities such as dataset
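The four cases above could be turned into the label and color that the original request asks for, roughly like this. It's just a sketch: the function name, the label wording, and the color mapping are assumptions, and `deprecated` is treated as either `None` or a timestamp string.

```python
from typing import Optional, Tuple


def display_status(active: bool, released: bool,
                   has_unreleased_changes: bool,
                   deprecated: Optional[str]) -> Tuple[str, str]:
    """Return a (label, color) pair for one vocabulary."""
    # A deprecation date wins over every other flag.
    if deprecated is not None:
        return ("deprecated", "red")
    # In production: active and released at least once.
    if active and released:
        label = "in use"
        if has_unreleased_changes:
            # Production still uses the latest release, not these edits.
            label += " (unreleased changes pending)"
        return (label, "green")
    # Released but not in production yet, or never released.
    return ("not active", "yellow")


print(display_status(False, True, False, None))
print(display_status(True, True, True, None))
```

Checking `deprecated` first keeps the result correct even if `active` was not reset by hand when the vocabulary was deprecated.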
Thank you. That makes sense!
I'm gonna leave this issue on pause: since we'd have to update some of the fields manually (the `active` one, for example), I think it's better to handle this in the documentation on our docs site.
We occasionally received feedback from people on vocabularies that are not yet used in pipelines. Could we add a feature (maybe a stop signal: green, yellow, red) for the different vocabularies to indicate whether each one is `in use`, `deprecated`, `not active`, or `testing`, so it is clearer to users whether the vocabularies are in use?