JuliaText / WordTokenizers.jl

High-performance tokenizers for natural language processing and related tasks

Write paper for JOSS #7

Closed by oxinabox 4 years ago

oxinabox commented 6 years ago

This is blocked by #5, since the paper says that that work is done.

codecov-io commented 6 years ago

Codecov Report

Merging #7 into master will not change coverage. The diff coverage is n/a.


@@           Coverage Diff           @@
##           master       #7   +/-   ##
=======================================
  Coverage   95.74%   95.74%           
=======================================
  Files           5        5           
  Lines          94       94           
=======================================
  Hits           90       90           
  Misses          4        4


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 882f439...6086657.

codecov-io commented 6 years ago

Codecov Report

Merging #7 into master will decrease coverage by 1.33%. The diff coverage is n/a.


@@            Coverage Diff             @@
##           master       #7      +/-   ##
==========================================
- Coverage   95.74%   94.41%   -1.34%     
==========================================
  Files           5       10       +5     
  Lines          94      555     +461     
==========================================
+ Hits           90      524     +434     
- Misses          4       31      +27

Impacted Files                      Coverage Δ
src/words/sedbased.jl               100% <0%> (ø) ⬆️
src/words/tweet_tokenizer.jl        91.15% <0%> (ø)
src/words/TokTok.jl                 98.43% <0%> (ø)
src/words/fast.jl                   98.14% <0%> (ø)
src/words/reversible_tokenize.jl    100% <0%> (ø)
src/words/nltk_word.jl              100% <0%> (ø)
src/words/simple.jl                 100% <0%> (+16.66%) ⬆️


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 882f439...53e5d1c.

Ayushk4 commented 5 years ago

Can I update this with information about TokenBuffer and its lexers for creating custom tokenizers? And also about the other tokenizers: the Reversible Tokenizer and the Tweet Tokenizer?
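
For context, a minimal custom tokenizer built on the TokenBuffer API would look roughly like the sketch below. It follows the pattern documented in the package's README at the time; `spaces` and `character` are lexers I believe the package exports, but treat the exact names as assumptions rather than a definitive API reference.

```julia
using WordTokenizers: TokenBuffer, isdone, spaces, character

# Sketch of a custom tokenizer on top of TokenBuffer.
# Each lexer returns true if it consumed input at the current position:
# `spaces` eats whitespace and flushes the pending token,
# `character` appends the current character to the pending token.
function my_tokenize(input)
    ts = TokenBuffer(input)
    while !isdone(ts)
        spaces(ts) || character(ts)
    end
    return ts.tokens
end

my_tokenize("foo bar baz")  # expected: ["foo", "bar", "baz"]
```

The Reversible and Tweet tokenizers mentioned above are, as far as I can tell, exported directly as `rev_tokenize`/`rev_detokenize` and `tweet_tokenize`.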

oxinabox commented 5 years ago

Yes, please do, and add yourself as a co-author, and @MikeInnes and @aquatiko too.

oxinabox commented 5 years ago

@MikeInnes @aquatiko @Ayushk4: as you are all listed as authors, can you review this and, if happy, indicate so via GitHub approval?

Then we can submit this.

oxinabox commented 4 years ago

Sorry, I forgot to action this. The last thing is that I think we should swap the author order and put @Ayushk4 as first author, as they have done much more on this recently than I have.

If I don't hear any objections, I will do this, then merge and submit.

oxinabox commented 4 years ago

https://github.com/openjournals/joss-reviews/issues/1939