Closed VivekPanyam closed 2 years ago
Merging #527 (b444b99) into master (5c89647) will increase coverage by
0.01%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #527 +/- ##
==========================================
+ Coverage 86.56% 86.58% +0.01%
==========================================
Files 105 104 -1
Lines 6930 6940 +10
==========================================
+ Hits 5999 6009 +10
Misses 931 931
Impacted Files | Coverage Δ | |
---|---|---|
...e/neuropod/backends/python_bridge/python_bridge.hh | 100.00% <ø> (ø) |
|
source/python/neuropod/utils/dtype_utils.py | 100.00% <ø> (ø) |
|
...thon_bridge/_neuropod_native_bootstrap/executor.py | 92.64% <100.00%> (ø) |
|
...e/neuropod/backends/python_bridge/python_bridge.cc | 83.24% <100.00%> (-0.70%) |
:arrow_down: |
source/neuropod/bindings/python_bindings.cc | 95.45% <100.00%> (+0.82%) |
:arrow_up: |
source/python/neuropod/loader.py | 68.05% <100.00%> (+2.90%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 5c89647...b444b99. Read the comment docs.
Summary:
This PR modifies Neuropod string handling to assume that strings are
UTF-8
encoded. This adds support for things like emoji 😎 (and all the other characters supported by unicode), but does have a few downsides and introduces some complexity.I'm working on some docs that dig into the difficulties of string tensors and especially string tensors + unicode, but they're not super relevant for the functionality in this PR so I'll post that separately.
The major impact of this PR is the following change.
Breaking change: opaque sequences of bytes will no longer generally work as strings (because the python bindings now expect strings to be valid UTF-8 strings).
This can be mitigated by adding a
bytes
type to Neuropod that supports arbitrary byte sequences. Please comment on this PR or open an issue if this is an important use case for you and we can prioritize it.Test Plan:
CI + modified the string test to use emojis