Closed xrotwang closed 2 years ago
Merging #60 (0135a16) into master (03ec862) will not change coverage. The diff coverage is 100.00%.
@@ Coverage Diff @@
## master #60 +/- ##
==========================================
Coverage 100.00% 100.00%
==========================================
Files 16 22 +6
Lines 2540 3514 +974
==========================================
+ Hits 2540 3514 +974
Impacted Files | Coverage Δ
---|---
src/csvw/__init__.py | 100.00% <ø> (ø)
src/csvw/__main__.py | 100.00% <100.00%> (ø)
src/csvw/datatypes.py | 100.00% <100.00%> (ø)
src/csvw/db.py | 100.00% <100.00%> (ø)
src/csvw/dsv.py | 100.00% <100.00%> (ø)
src/csvw/dsv_dialects.py | 100.00% <100.00%> (ø)
src/csvw/jsonld.py | 100.00% <100.00%> (ø)
src/csvw/metadata.py | 100.00% <100.00%> (ø)
src/csvw/utils.py | 100.00% <100.00%> (ø)
tests/conftest.py | 100.00% <100.00%> (ø)
... and 10 more |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 03ec862...0135a16.
So while this is mostly backwards compatible, the sheer volume of additions would suggest calling this "csvw 3.0" - which is a bit funny because we only had 2.0.0 in the 2.x line, and only for a rather short time. Anyway, I think "3.0" would be the right version here - agreed?
Looking at the JSON created by `csvw2json`, this really vindicates the design of CLDF. E.g. this is the first example given in the Leipzig Glossing Rules:
{
"http://cldf.clld.org/v1.0/terms.rdf#id": "1",
"http://cldf.clld.org/v1.0/terms.rdf#languageReference": "indo1316",
"http://cldf.clld.org/v1.0/terms.rdf#primaryText": "Mereka di Jakarta sekarang.",
"http://cldf.clld.org/v1.0/terms.rdf#analyzedWord": [
"Mereka",
"di",
"Jakarta",
"sekarang."
],
"http://cldf.clld.org/v1.0/terms.rdf#gloss": [
"They",
"in",
"Jakarta",
"now"
],
"http://cldf.clld.org/v1.0/terms.rdf#translatedText": "They are in Jakarta now.",
"http://cldf.clld.org/v1.0/terms.rdf#metaLanguageReference": "stan1293",
"http://cldf.clld.org/v1.0/terms.rdf#source": [
"Sneddon1996[237]"
]
}
It can be parsed without knowing anything about CSV dialects or the local table of column names.
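As a minimal sketch of that point, the record above can be consumed with nothing but Python's standard `json` module. The names `CLDF`, `RECORD` and `interlinear` are made up for this illustration and are not part of csvw or CLDF tooling; the only thing the code relies on is the full property URLs that appear as keys in the JSON.

```python
import json

# Namespace of the CLDF terms used as keys in the JSON above.
CLDF = "http://cldf.clld.org/v1.0/terms.rdf#"

# The example record from above, copied verbatim.
RECORD = """
{
  "http://cldf.clld.org/v1.0/terms.rdf#id": "1",
  "http://cldf.clld.org/v1.0/terms.rdf#languageReference": "indo1316",
  "http://cldf.clld.org/v1.0/terms.rdf#primaryText": "Mereka di Jakarta sekarang.",
  "http://cldf.clld.org/v1.0/terms.rdf#analyzedWord": ["Mereka", "di", "Jakarta", "sekarang."],
  "http://cldf.clld.org/v1.0/terms.rdf#gloss": ["They", "in", "Jakarta", "now"],
  "http://cldf.clld.org/v1.0/terms.rdf#translatedText": "They are in Jakarta now.",
  "http://cldf.clld.org/v1.0/terms.rdf#metaLanguageReference": "stan1293",
  "http://cldf.clld.org/v1.0/terms.rdf#source": ["Sneddon1996[237]"]
}
"""


def interlinear(record):
    """Pair each analyzed word with its gloss (hypothetical helper)."""
    words = record[CLDF + "analyzedWord"]
    glosses = record[CLDF + "gloss"]
    return list(zip(words, glosses))


example = json.loads(RECORD)
pairs = interlinear(example)
translation = example[CLDF + "translatedText"]
```

No CSV dialect, delimiter or local column name is involved: the lists are already split, and the keys identify the properties globally.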
What's somewhat missing from the JSON is the sources, but the BibTeX file is linked:
{
"dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#Generic",
"@type": "http://www.w3.org/ns/dcat#Distribution",
"dc:source": "sources.bib",
...
and the references are easy to parse:
"http://cldf.clld.org/v1.0/terms.rdf#source": [
"Sneddon1996[237]"
]
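To illustrate how little is needed, here is a rough sketch that splits such a reference into a BibTeX key and an optional page specification. It assumes references follow the `key[pages]` shape seen in the example; the pattern `SOURCE_REF` and the function `parse_source` are hypothetical names for this illustration, not csvw or pycldf API.

```python
import re

# Assumed shape: a citation key, optionally followed by "[pages]".
SOURCE_REF = re.compile(r"^(?P<key>[^\[\]]+)(?:\[(?P<pages>[^\]]*)\])?$")


def parse_source(ref):
    """Split a reference like 'Sneddon1996[237]' into (key, pages)."""
    match = SOURCE_REF.match(ref)
    if match is None:
        raise ValueError("unexpected source reference: {!r}".format(ref))
    # pages is None when the reference carries no bracketed part.
    return match.group("key"), match.group("pages")


key, pages = parse_source("Sneddon1996[237]")
bare_key, bare_pages = parse_source("Sneddon1996")
```

The key can then be looked up in the linked `sources.bib` with any BibTeX parser.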
Looking forward to testing this actively. So far, from reading your comments, this looks very nice.
Testing this now. Thanks for all the work! One small thing: `setup.py` needs `requests-mock` as testing (?) dependency.
This PR is largely backwards compatible. The most notable change is a default Dialect for the objects in `csvw.metadata` with `commentPrefix=None`. The CSVW spec seems to be ambiguous, mentioning both `"#"` and `null` as the default in different places. What pushed me to go for `null` was the large number of `ToJson` conformance tests for number formatting, which all used number patterns like `#,##0.0#` as column headers. With the old default, none of these tests would pass.
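The effect of the default can be sketched with the standard library alone. This is not how csvw implements comment handling; `read_rows` and its `comment_prefix` parameter are hypothetical stand-ins, and the sample data uses a simplified, unquoted pattern (`#0.0#`) as its first header, which is an assumption made for illustration.

```python
import csv
import io


def read_rows(text, comment_prefix=None):
    """Parse CSV text, dropping lines that start with comment_prefix (if set)."""
    lines = [
        line
        for line in text.splitlines()
        if comment_prefix is None or not line.startswith(comment_prefix)
    ]
    return list(csv.reader(io.StringIO("\n".join(lines))))


# Hypothetical data: the header row begins with an unquoted number pattern.
DATA = "#0.0#,label\n1.5,a\n"

# With "#" as comment prefix, the header row is treated as a comment and lost.
with_hash = read_rows(DATA, comment_prefix="#")

# With no comment prefix, the header row survives.
with_none = read_rows(DATA, comment_prefix=None)
```

With `"#"` as prefix the first data row would be mistaken for the header, which is consistent with the conformance tests failing under the old default.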