Open flavio-schoute opened 2 years ago
I don't understand what I am doing wrong; I am just following the docs
The problem is that you are training a Learner that is not compatible with continuous data on continuous data (i.e. word count vectors). If you'd like to stick with the Naive Bayes family of algorithms, you can train a Gaussian Naive Bayes estimator instead, since it is compatible with continuous data.
https://docs.rubixml.com/1.0/classifiers/gaussian-naive-bayes.html
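As a rough sketch of that swap (not the only way to do it), the snippet below trains `GaussianNB` on a few hand-written numeric vectors standing in for word counts. The sample values, the extra fourth row, and the query vector are made up for illustration, and it assumes Rubix ML 1.x or later is installed through Composer.

```php
<?php

require_once 'vendor/autoload.php';

use Rubix\ML\Classifiers\GaussianNB;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;

// Hypothetical word count vectors: integers and floats are treated as continuous features.
$samples = [
    [2, 0, 1],
    [0, 3, 0],
    [1, 0, 2],
    [0, 2, 1],
];

$labels = ['not spam', 'spam', 'not spam', 'spam'];

$dataset = new Labeled($samples, $labels);

// Same class priors as in the question; Gaussian Naive Bayes accepts continuous features.
$estimator = new GaussianNB([
    'spam' => 0.3,
    'not spam' => 0.7,
]);

$estimator->train($dataset);

// Predict the class of one new, made-up word count vector.
$predictions = $estimator->predict(new Unlabeled([[0, 2, 0]]));

var_dump($predictions);
```

In the code from the question, the same change would amount to replacing `new NaiveBayes(...)` with `new GaussianNB(...)` and leaving the vectorizer in place.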
You can check to see which types an Estimator is compatible with in the API reference. In addition, we provide a cheat sheet in the User Guide.
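If you prefer to check compatibility from code rather than from the reference pages, something like the following should work. It is only a sketch: it assumes the `compatibility()` method on estimators and the `featureTypes()` method on datasets as found in Rubix ML 1.x and later, and the two numeric samples are invented for illustration.

```php
<?php

require_once 'vendor/autoload.php';

use Rubix\ML\Classifiers\GaussianNB;
use Rubix\ML\Classifiers\NaiveBayes;
use Rubix\ML\Datasets\Labeled;

// Hypothetical word count vectors, i.e. continuous features.
$dataset = new Labeled([[2, 0, 1], [0, 3, 0]], ['not spam', 'spam']);

// Turn a list of DataType objects into a list of unique type names.
$describe = function (array $types): array {
    return array_values(array_unique(array_map('strval', $types)));
};

$featureTypes = $describe($dataset->featureTypes());
$naiveBayesTypes = $describe((new NaiveBayes())->compatibility());
$gaussianTypes = $describe((new GaussianNB())->compatibility());

echo 'Dataset features: ' . implode(', ', $featureTypes) . PHP_EOL;
echo 'Naive Bayes accepts: ' . implode(', ', $naiveBayesTypes) . PHP_EOL;
echo 'Gaussian NB accepts: ' . implode(', ', $gaussianTypes) . PHP_EOL;
```

An estimator can be trained on a dataset only when every feature type in the dataset appears in the estimator's compatibility list.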
Hi,
I am working with NB, but I get this error: Fatal error: Uncaught Rubix\ML\Exceptions\InvalidArgumentException: Naive Bayes (priors: [spam: 0.3, not spam: 0.7], smoothing: 2.5) is incompatible with continuous data types. in
And I can't find anything about it on the internet.
This is my code:
```php
<?php

use Rubix\ML\Classifiers\KNearestNeighbors;
use Rubix\ML\Classifiers\NaiveBayes;
use Rubix\ML\CrossValidation\HoldOut;
use Rubix\ML\CrossValidation\Metrics\Accuracy;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Extractors\CSV;
use Rubix\ML\Tokenizers\NGram;
use Rubix\ML\Transformers\NumericStringConverter;
use Rubix\ML\Transformers\TextNormalizer;
use Rubix\ML\Transformers\TfIdfTransformer;
use Rubix\ML\Transformers\WordCountVectorizer;

require_once 'vendor/autoload.php';

$samples = [
    ['dit is spam'],
    ['www.kanker.nl'],
    ['ik heb een vraag over deze shit'],
];

$labels = ['not spam', 'spam', 'not spam'];

// Note: the line that builds $dataset is missing from the post as captured;
// the Labeled constructor call below is an assumption.
$dataset = (new Labeled($samples, $labels))
    // ->apply(new TfIdfTransformer())
    ->apply(new WordCountVectorizer(10000, 0.01, 0.9, new NGram(1, 2)));

$importedRecords = $dataset->count();

echo 'Important: ' . $importedRecords . PHP_EOL;

$estimator = new NaiveBayes([
    'spam' => 0.3,
    'not spam' => 0.7,
], 2.5);

[$training, $testing] = $dataset->randomize()->split(0.8);

// This call throws the InvalidArgumentException quoted above.
$estimator->train($dataset);

// $trained = $estimator->trained();
// var_dump($trained);

// $predication = $estimator->predict($testing);

// $probabilities = $estimator->proba($dataset);
// var_dump($probabilities);

// $metric = new Accuracy();
// $score = $metric->score($predication, $testing->labels());
// echo 'Score: ' . $score;

echo 'Final';
```