bavamont closed this issue 4 years ago
Good catch! Did you upgrade versions recently? We added the $maxDocumentFrequency parameter in 0.1.0-rc5. Thanks for the reminder, I am going to update the train script!
Let's try a setting of 5000 for maxDocumentFrequency. Let me know if you get better results with a different setting.
Also, if you'd like to, you can join our channel on Telegram: https://t.me/RubixML
This should be fixed in the latest update: https://github.com/RubixML/Sentiment/commit/d076d864eece25ba91c792cb6ee49917f5147216
Thanks again @bavamont!
Thank you @andrewdalpino! I’ll try it with 5000 for maxDocumentFrequency. Thanks again!
I am getting this error when I try to train using your train.php example (https://github.com/RubixML/Sentiment/blob/master/train.php): Fatal error: Uncaught TypeError: Argument 3 passed to Rubix\ML\Transformers\WordCountVectorizer::__construct() must be of the type int, object given....
In your example on Line 44 you have: new WordCountVectorizer(10000, 3, new NGram(1, 2)),
But the constructor for WordCountVectorizer expects this: public function __construct(int $maxVocabulary = PHP_INT_MAX, int $minDocumentFrequency = 1, int $maxDocumentFrequency = PHP_INT_MAX, ?Tokenizer $tokenizer = null)
What would be your recommended parameters for WordCountVectorizer for your example to work best?
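For anyone hitting the same TypeError, here is a minimal sketch of the call with the arguments in the order the quoted constructor expects. The value 5000 is the setting suggested in this thread, not a tuned recommendation, and the NGram namespace is an assumption based on the 0.1.0-era package layout, so check it against your installed version. The snippet needs the Rubix ML package installed via Composer.

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Rubix\ML\Transformers\WordCountVectorizer;
// Assumption: NGram lived under Other\Tokenizers in the 0.1.0 release candidates.
use Rubix\ML\Other\Tokenizers\NGram;

// Arguments follow the constructor signature quoted in this issue:
//   1. $maxVocabulary        = 10000 (unchanged from train.php)
//   2. $minDocumentFrequency = 3     (unchanged from train.php)
//   3. $maxDocumentFrequency = 5000  (new in 0.1.0-rc5; value suggested in this thread)
//   4. $tokenizer            = unigrams and bigrams (moved from 3rd to 4th position)
$vectorizer = new WordCountVectorizer(10000, 3, 5000, new NGram(1, 2));
```

The error in the report comes from the tokenizer landing in the third slot, which is now typed as int. Moving it to the fourth position and passing an integer cap in the third resolves the type mismatch. Whether 5000 is the best cap for this dataset is something to verify by comparing validation scores.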