Constannnnnt / Distributed-CoreNLP

This infrastructure, built on Stanford CoreNLP, MapReduce and Spark with Java, aims to process document annotations at large scale.
https://github.com/Constannnnnt/Distributed-CoreNLP
MIT License

FuncTODO #14

Closed: Constannnnnt closed this issue 5 years ago

Constannnnnt commented 5 years ago

It seems that right now, if we request a complex func, say "coref", the pipeline also prints the results of every func that runs before it?

Say the input arg for the functionalities is cleanxml, ssplit, coref: it will execute every function in the pipeline and add all of their results to the RDD.

((0,tokenize),The-1 University-2 of-3 Waterloo-4 is-5 located-6 in-7 Canada-8 .-9 Goose-1 lives-2 in-3 this-4 University-5 .-6)
((0,cleanxml),The-1 University-2 of-3 Waterloo-4 is-5 located-6 in-7 Canada-8 .-9 Goose-1 lives-2 in-3 this-4 University-5 .-6)
((0,ssplit),The University of Waterloo is located in Canada.|Goose lives in this University.)
((0,pos),(The,DT) (University,NNP) (of,IN) (Waterloo,NNP) (is,VBZ) (located,JJ) (in,IN) (Canada,NNP) (.,.) (Goose,NN) (lives,VBZ) (in,IN) (this,DT) (University,NNP) (.,.))
((0,ner),(The,O) (University,ORGANIZATION) (of,ORGANIZATION) (Waterloo,ORGANIZATION) (is,O) (located,O) (in,O) (Canada,COUNTRY) (.,O) (Goose,O) (lives,O) (in,O) (this,O) (University,O) (.,O))
((0,coref),The University of Waterloo:this University; )
((1,tokenize),The-1 University-2 of-3 Waterloo-4 is-5 located-6 in-7 Canada-8 .-9 Goose-1 lives-2 here-3 .-4)
((1,cleanxml),The-1 University-2 of-3 Waterloo-4 is-5 located-6 in-7 Canada-8 .-9 Goose-1 lives-2 here-3 .-4)
((1,ssplit),The University of Waterloo is located in Canada.|Goose lives here.)
((1,pos),(The,DT) (University,NNP) (of,IN) (Waterloo,NNP) (is,VBZ) (located,JJ) (in,IN) (Canada,NNP) (.,.) (Goose,NN) (lives,NNS) (here,RB) (.,.))
((1,ner),(The,O) (University,ORGANIZATION) (of,ORGANIZATION) (Waterloo,ORGANIZATION) (is,O) (located,O) (in,O) (Canada,COUNTRY) (.,O) (Goose,O) (lives,O) (here,O) (.,O))
((1,coref),)

I don't see any need for this. I think we should just return the results for the three requested args, even though the functions they depend on still have to run.
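A minimal sketch of that idea in plain Java, without the Spark dependency, so it can be run on its own. The class `RequestedOutputFilter`, the `Result` type, and the methods `parseRequested` and `keepRequested` are hypothetical names made up for illustration; they are not taken from this repository. The sketch assumes each output row has the `((docId, annotator), text)` shape shown above and that the requested functions arrive as one comma-separated string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RequestedOutputFilter {

    /** One output row, mirroring the ((docId, annotator), text) shape (hypothetical type). */
    public static final class Result {
        public final int docId;
        public final String annotator;
        public final String output;

        public Result(int docId, String annotator, String output) {
            this.docId = docId;
            this.annotator = annotator;
            this.output = output;
        }

        @Override
        public String toString() {
            return "((" + docId + "," + annotator + ")," + output + ")";
        }
    }

    /** Parses a comma-separated argument such as "cleanxml, ssplit, coref". */
    public static Set<String> parseRequested(String arg) {
        Set<String> requested = new HashSet<>();
        for (String name : arg.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                requested.add(trimmed);
            }
        }
        return requested;
    }

    /** Keeps only the rows whose annotator was explicitly requested. */
    public static List<Result> keepRequested(List<Result> all, Set<String> requested) {
        List<Result> kept = new ArrayList<>();
        for (Result r : all) {
            if (requested.contains(r.annotator)) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The whole pipeline still runs (coref needs the earlier steps),
        // but only the requested rows are returned.
        List<Result> all = Arrays.asList(
                new Result(1, "tokenize", "The-1 University-2 ..."),
                new Result(1, "ssplit", "The University of Waterloo is located in Canada.|Goose lives here."),
                new Result(1, "pos", "(The,DT) (University,NNP) ..."),
                new Result(1, "coref", ""));
        Set<String> requested = parseRequested("cleanxml, ssplit, coref");
        for (Result r : keepRequested(all, requested)) {
            System.out.println(r);
        }
    }
}
```

If the results are held in a Spark `JavaPairRDD`, the same membership check could probably be passed to its `filter` method instead of looping over a list. That depends on how the keys are actually represented in the project.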