Closed josephsdavid closed 2 years ago
Cool, thanks for this! I've added a few comments, though a more serious review will likely come from the active maintainers.
Thank you for your reviews! 😄
Merging #8 (3629791) into master (aa6031f) will increase coverage by 7.27%. The diff coverage is n/a.
@@ Coverage Diff @@
## master #8 +/- ##
==========================================
+ Coverage 74.54% 81.81% +7.27%
==========================================
Files 1 1
Lines 55 55
==========================================
+ Hits 41 45 +4
+ Misses 14 10 -4
Impacted Files | Coverage Δ | |
---|---|---|
src/MLJNaiveBayesInterface.jl | 81.81% <ø> (+7.27%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
@josephsdavid I've not yet reviewed, but here's a suggestion for the multinomial example (it needs #9):
```julia
using MLJ
import TextAnalysis

CountTransformer = @load CountTransformer pkg=MLJText
MultinomialNBClassifier = @load MultinomialNBClassifier pkg=NaiveBayes

tokenized_docs = TextAnalysis.tokenize.([
    "I am very mad. You never listen.",
    "You seem to be having trouble? Can I help you?",
    "Our boss is mad at me. I hope he dies.",
    "His boss wants to help me. She is nice.",
    "Thank you for your help. It is nice working with you.",
    "Never do that again! I am so mad. ",
])

sentiment = [
    "negative",
    "positive",
    "negative",
    "positive",
    "positive",
    "negative",
]

mach1 = machine(CountTransformer(), tokenized_docs) |> fit!

# matrix of counts:
X = transform(mach1, tokenized_docs)

# to ensure scitype(y) <: AbstractVector{<:OrderedFactor}:
y = coerce(sentiment, OrderedFactor)

classifier = MultinomialNBClassifier()
mach2 = machine(classifier, X, y)
fit!(mach2, rows=1:4)

# probabilistic predictions:
y_prob = predict(mach2, rows=5:6) # distributions
pdf.(y_prob, "positive")          # probabilities for "positive"
log_loss(y_prob, y[5:6])

# point predictions:
yhat = mode.(y_prob) # or `predict_mode(mach2, rows=5:6)`
```
> @josephsdavid I've not yet reviewed, but here's a suggestion for the multinomial example (it needs #9):

For now I just `MLJ.table`ed the data and it seems to work fine!
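For anyone following along, the workaround mentioned here might look roughly like the sketch below. It is only an illustration under stated assumptions: the `counts` matrix is made-up data standing in for the output of the count transformer, and it assumes the classifier currently expects a table with `Count` columns rather than a raw matrix (which is presumably what #9 is about).

```julia
using MLJ

# Hypothetical 2-document x 3-token count matrix (illustrative data only,
# standing in for the output of `transform(mach1, tokenized_docs)`):
counts = [1 0 2;
          0 3 1]

# Wrap the matrix as a table; columns get default names :x1, :x2, :x3
Xtable = MLJ.table(counts)

# Inspect column names and scientific types (integer columns are `Count`)
schema(Xtable)

# The wrapped table can be converted back to the original matrix
MLJ.matrix(Xtable) == counts
```

The resulting `Xtable` would then be passed to `machine(classifier, Xtable, y)` in place of the raw matrix.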
Attention @ablaom
Thanks @josephsdavid for your contribution. Great to have another out of the way.