LR-POR / cl-conllu

tool for working with conllu files in CL
Apache License 2.0

empty tokens #56

Open odanoburu opened 6 years ago

odanoburu commented 6 years ago

has support for empty tokens been added yet? I'm mirroring this issue from hs-conllu so we can fix this in both libraries. a CoNLL-U file that can be used for testing is this one.

(I'm asking and not testing myself because I'm getting this error when loading the library:

COMPILE-FILE-ERROR while
compiling #<CL-SOURCE-FILE "iterate" "iterate">
   [Condition of type UIOP/LISP-BUILD:COMPILE-FILE-ERROR]

)

arademaker commented 6 years ago

It is clear that we need to support empty nodes. See the docs: http://universaldependencies.org/format.html#words-tokens-and-empty-nodes

But your error is strange: we are not using the iterate library! Thanks for reporting the problem.
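
For reference on the empty nodes: the UD format page linked above gives the ID column three shapes, an integer for a word ("7"), a range for a multiword token ("7-8"), and a decimal for an empty node ("7.1"). Here is a minimal sketch of telling them apart. Both functions, all-digits-p and classify-conllu-id, are hypothetical helpers written for illustration; they are not part of cl-conllu and use only standard Common Lisp.

```lisp
;; Hypothetical helpers, NOT part of cl-conllu: classify the ID column of a
;; CoNLL-U token line using only standard Common Lisp.

(defun all-digits-p (string)
  "True when STRING is non-empty and contains only decimal digits."
  (and (plusp (length string))
       (every #'digit-char-p string)))

(defun classify-conllu-id (id)
  "Return :WORD for \"7\", :MULTIWORD for \"7-8\", :EMPTY-NODE for \"7.1\",
or NIL when ID matches none of these shapes."
  (let ((dash (position #\- id))
        (dot (position #\. id)))
    (cond
      ;; Range such as 7-8: digits on both sides of a single dash.
      ((and dash (not dot)
            (all-digits-p (subseq id 0 dash))
            (all-digits-p (subseq id (1+ dash))))
       :multiword)
      ;; Decimal such as 7.1: digits on both sides of a single dot.
      ((and dot (not dash)
            (all-digits-p (subseq id 0 dot))
            (all-digits-p (subseq id (1+ dot))))
       :empty-node)
      ;; Plain integer such as 7.
      ((all-digits-p id) :word)
      (t nil))))
```

A reader built along these lines could branch on the returned keyword before trying to parse the ID as an integer, so that a line whose ID is "7.1" is handled as an empty node instead of failing.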

odanoburu commented 6 years ago

"But your error is strange, we are not using the iterate library!"

not even as a transitive dependency? that's weird!

odanoburu commented 6 years ago

I've updated the code and it now works!

odanoburu commented 6 years ago

it doesn't parse here:

CL-USER> (defparameter *sents* (cl-conllu:read-conllu #P"/home/bruno/Documents/en-ud-train.conllu"))
; Evaluation aborted on #<SB-KERNEL:BOUNDING-INDICES-BAD-ERROR expected-type:
                                       (CONS (INTEGER 0 6) (INTEGER 7 6))
                                       datum: (7)>.

GPPassos commented 6 years ago

@odanoburu The sentence you linked gives a 404 error here.

odanoburu commented 6 years ago

https://github.com/UniversalDependencies/UD_English-EWT/blob/master/en_ewt-ud-train.conllu

I would give the line number, but it may change.