Closed mih closed 1 year ago
Merging #47 (56d400f) into main (314b5a8) will increase coverage by 0.16%. The diff coverage is 100.00%.
:exclamation: Current head 56d400f differs from pull request most recent head 2c718b9. Consider uploading reports for the commit 2c718b9 to get more accurate results.
@@ Coverage Diff @@
## main #47 +/- ##
==========================================
+ Coverage 99.08% 99.24% +0.16%
==========================================
Files 11 12 +1
Lines 218 265 +47
==========================================
+ Hits 216 263 +47
Misses 2 2
Impacted Files | Coverage Δ | |
---|---|---|
datalad_tabby/io/__init__.py | 99.15% <100.00%> (+0.18%) | :arrow_up: |
datalad_tabby/io/tests/test_overrides.py | 100.00% <100.00%> (ø) | |
Nice. Tested it locally, works as expected
Thanks!
I had trouble understanding how the overrides work from the docs description alone (and even tests), but with a few examples it became clear.
Assuming a doi field (single value):
doi https://doi.org/10.nnnnnn/example
the override can be a literal
{
"doi": true
}
# produces
{'doi': True}
a string:
{
"doi": "I am become {doi[0]}"
}
# produces
{'doi': 'I am become https://doi.org/10.nnnnnn/example'}
or a list:
{
"doi": ["Identifier", "doi", "{doi[0]}"]
}
# produces
{'doi': ['Identifier', 'doi', 'https://doi.org/10.nnnnnn/example']}
Note, however, that an object is treated as a JSON literal, so no string substitution takes place:
{
"doi": {
"type": "identifier",
"name": "doi",
"value": "{doi[0]}"
}
}
# produces
{'doi': {'name': 'doi', 'type': 'identifier', 'value': '{doi[0]}'}}
In the example above, we used the same key (doi), replacing its value. But we could add a new key (e.g. doi-modified) in the same way.
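To make the rules above concrete, here is a minimal Python sketch that mimics the behaviour shown in these examples. It is not the datalad_tabby implementation: the function name `apply_overrides` is hypothetical, and it assumes that substitution works like `str.format` with the record's fields (each a list of values) passed as keyword arguments.

```python
def apply_overrides(record, overrides):
    """Return a copy of `record` with `overrides` applied (illustrative only)."""
    result = dict(record)
    for key, spec in overrides.items():
        if isinstance(spec, str):
            # A string is a template, interpolated with the record's fields
            result[key] = spec.format(**record)
        elif isinstance(spec, list):
            # A list is interpolated item by item; non-string items pass through
            result[key] = [
                item.format(**record) if isinstance(item, str) else item
                for item in spec
            ]
        else:
            # Anything else (bool, number, object) is taken as a literal
            result[key] = spec
    return result


# Field values are lists, which is why the templates use {doi[0]}
record = {"doi": ["https://doi.org/10.nnnnnn/example"]}

print(apply_overrides(record, {"doi": True}))
print(apply_overrides(record, {"doi": "I am become {doi[0]}"}))
print(apply_overrides(record, {"doi": ["Identifier", "doi", "{doi[0]}"]}))
print(apply_overrides(record, {"doi": {"value": "{doi[0]}"}}))
```

The last call leaves `{doi[0]}` untouched inside the object, matching the "JSON literal" case.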
Thinking of vertical tables, I can imagine e.g. a file listing mice in an experiment, where strain is entered (for brevity) as a numerical JAX code:
id strain_jax ...
01 018280
...
This could be overridden with the following (three new fields, two derived and one fixed):
{
"RRID": "RRID:IMSR_JAX:{strain_jax[0]}",
"url": "https://www.jax.org/strain/{strain_jax[0]}",
"schema": "https://custom_schema.org/mouseExperiment"
}
# produces
[
{"id": "01", "strain_jax": "018280", "RRID": "RRID:IMSR_JAX:018280", "url": "https://www.jax.org/strain/018280", "schema": "https://custom_schema.org/mouseExperiment"},
# ...
]
The key purpose of this feature is to enrich metadata with additional properties (keys), and to replace values of existing properties (keys) with other values. The "amend" use case (add values to a particular key) is only supported via the proxy of replacing all values with a copy of the original values that has new items appended (because @mih thinks that this case is not common in typical metadata enrichment scenarios).
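The "amend" workaround could look like the following sketch. It assumes a hypothetical multi-value `keyword` field and `str.format`-style interpolation of list items, as suggested by the doi examples; it does not call datalad_tabby itself.

```python
# Hypothetical author-provided record with two values for one key
record = {"keyword": ["mouse", "behavior"]}

# The override repeats each original value and appends a new one
override = {"keyword": ["{keyword[0]}", "{keyword[1]}", "open-field"]}

# Interpolate every list item against the record
amended = {
    key: [template.format(**record) for template in templates]
    for key, templates in override.items()
}

print(amended)  # {'keyword': ['mouse', 'behavior', 'open-field']}
```

The override has to know how many original values exist, which is one reason this only works as a proxy for a true "amend".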
The current implementation allows for an unchanged TSV (author-provided) to be combined with an evolving override (curator-provided) to inject additional properties (e.g., @type, or @id) in the metadata record without altering the structure/content of the author-provided documents, thereby avoiding any friction or incompatibilities with the workflows or processes that yielded them in the first place. This makes it easier to feed back corrections to the original authors (or ask for them), and does not require any party to adjust to a different workflow.
Also: