Closed paleolimbot closed 2 years ago
Merging #5 (c03bbcd) into main (64cf5b5) will increase coverage by
15.87%
. The diff coverage is97.35%
.
@@ Coverage Diff @@
## main #5 +/- ##
===========================================
+ Coverage 78.19% 94.07% +15.87%
===========================================
Files 2 4 +2
Lines 133 624 +491
Branches 7 16 +9
===========================================
+ Hits 104 587 +483
- Misses 22 27 +5
- Partials 7 10 +3
Impacted Files | Coverage Δ | |
---|---|---|
src/arrowc/schema.c | 73.98% <75.00%> (-3.01%) |
:arrow_down: |
src/arrowc/schema_view.c | 98.86% <98.86%> (ø) |
|
src/arrowc/metadata.c | 100.00% <100.00%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 64cf5b5...c03bbcd. Read the comment docs.
API looks generally good IMHO. Seems to miss some metadata iteration utility, though. Also, I don't understand why the PR title mentions schema validation. I don't see a validation method besides syntax validation of the format?
I don't see a validation method
I added this in and renamed the methods so that we have parse == walk the format string extracting information (or erroring when this fails) and validate == check anything else. ArrowSchemaViewInit()
does both.
How are list, struct, etc. types handled here, since that's effectively the same issue?
ArrowSchemaViewInit()
now walks the children and dictionary if they exist and does the validation step, although maybe they shouldn't (the caller will probably have to do this anyway).
I implemented enough metadata utilities to have a struct ArrowStringView
for the extension_name
/metadata. I wasn't aware that you could have a storage type that is an extension type in the C Data interface, so I don't know how that would be handled.
This is a PR drafting the design for a data structure that holds information about a data type as communicated by a
struct ArrowSchema
. The structure, tentatively called astruct ArrowSchemaView
, can be stack or statically allocated. Usage could be something like:Some thoughts:
I'm not sure how to handle extension types...it may have to be separate from the
data_type
/storage_data_type
since in theory one could have an extension type with a date/time storage type (and it seems confusing to have three levels of data type). It seems within scope to have anextension_name
member (that is astruct ArrowStringView
).There is some obvious repetition in the implementation and tests...I was going for completeness rather than proper abstraction to make sure I was able to catch all the fields that were needed.
One could also abandon the stack-allocatability route and have heap allocated fields and require the caller to do something like
schema_view.release(&schema_view)
to clean up...I didn't think there were too many fields such that the struct is unwieldy and not having to manage that memory is a helpful interface perk.One could also provide field getter functions rather than advertising the use of struct fields directly. That seems like it might be easier to maintain backwards compatibility if adding some verbosity to the interface.