dkpro / dkpro-cassis

UIMA CAS processing library written in Python
https://pypi.org/project/dkpro-cassis/
Apache License 2.0

#135 - Use uima style offsets #136

Closed: jcklie closed this 4 years ago

jcklie commented 4 years ago

To test it, run for example:

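from cassis import load_cas_from_xmi, load_typesystem

# Fully qualified name of the DKPro Core token type
TOKEN = "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token"

# Load the type system that describes the annotations in the CAS
with open("TypeSystem.xml", "rb") as f:
    typesystem = load_typesystem(f)

# Deserialize the CAS from XMI using that type system
with open("smileys.xmi", "rb") as f:
    cas = load_cas_from_xmi(f, typesystem=typesystem)

# Print the text covered by each token; tokens after the emoji are the interesting ones
for token in cas.select(TOKEN):
    print(token.get_covered_text())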

Or import a document with that text into INCEpTION.

Before:

Hello
😊,

y 
ame 
s 
ohn.

After:

Hello
😊
,
my
name
is
John
.
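
The shift in the "Before" output is what one would expect when UIMA offsets, which count UTF-16 code units, are used directly as Python string indices, which count code points: the emoji takes two UTF-16 code units but only one Python character, so every later token is off by one. The following is a minimal, standard-library-only sketch of that effect. The sample text and the token spans are assumptions reconstructed from the output above (the real `smileys.xmi` may differ), the helper names `utf16_to_codepoint_map` and `covered_text` are hypothetical, and this is not necessarily how cassis implements the conversion.

```python
# Assumed sample text, reconstructed from the output above.
TEXT = "Hello 😊, my name is John."

# Assumed token spans as UIMA (Java) would report them: offsets count UTF-16
# code units, so the emoji (outside the Basic Multilingual Plane) spans two.
UIMA_SPANS = [(0, 5), (6, 8), (8, 9), (10, 12), (13, 17), (18, 20), (21, 25), (25, 26)]


def utf16_to_codepoint_map(text):
    """Map every UTF-16 code unit boundary in `text` to a Python string index."""
    mapping = {}
    utf16_pos = 0
    for index, char in enumerate(text):
        mapping[utf16_pos] = index
        # Characters above U+FFFF need a surrogate pair, i.e. two code units.
        utf16_pos += 2 if ord(char) > 0xFFFF else 1
    mapping[utf16_pos] = len(text)
    return mapping


def covered_text(text, begin, end, mapping):
    """Slice `text` using UIMA-style offsets translated through `mapping`."""
    return text[mapping[begin]:mapping[end]]


mapping = utf16_to_codepoint_map(TEXT)

# Using the UIMA offsets directly as Python indices gives the shifted tokens.
naive = [TEXT[begin:end] for begin, end in UIMA_SPANS]

# Translating the offsets first gives the intended tokens.
fixed = [covered_text(TEXT, begin, end, mapping) for begin, end in UIMA_SPANS]

print(naive)
print(fixed)
```

Under these assumptions, `naive` corresponds to the "Before" listing and `fixed` to the "After" listing. A lookup table is used here for readability; a real implementation might prefer a more compact mapping for long documents.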

codecov[bot] commented 4 years ago

Codecov Report

Merging #136 into master will increase coverage by 0.10%. The diff coverage is 98.57%.


@@            Coverage Diff             @@
##           master     #136      +/-   ##
==========================================
+ Coverage   97.69%   97.79%   +0.10%     
==========================================
  Files           9        9              
  Lines        1603     1725     +122     
==========================================
+ Hits         1566     1687     +121     
- Misses         37       38       +1     
Impacted Files              Coverage Δ
cassis/cas.py               97.66% <97.43%> (-0.05%)
cassis/xmi.py               98.26% <100.00%> (+0.05%)
tests/fixtures.py           100.00% <100.00%> (ø)
tests/test_xmi.py           100.00% <100.00%> (ø)
tests/test_typesystem.py    100.00% <0.00%> (ø)
cassis/typesystem.py        96.56% <0.00%> (+0.02%)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update b7eaf79...92ae590.