Open franzos opened 5 months ago
@franzos If you have a better JSON model than the default, maybe you could open a pull request for it? :D
I'm not sure that it's better. I mostly trained it on data I got from form submissions over the years, of which 99% were spam. It now seems to catch about 90% of it; with some additional rules, like failing all messages with fewer than 10 characters, it's more like 95%.
There's a couple of things I'm working on:
I suppose with IP filtering, email testing, some basic rules (character repetition) and sieve, it might get to 99.5%. That's before LLMs, though. Interestingly enough, these aren't in use much at all yet... Once they are, I guess this will be useless.
Hi there, awesome crate! I trained it on some 18k messages and it works really well.
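The "basic rules" mentioned above (a minimum length and a character-repetition check) could be sketched roughly as below. This is only an illustration: the function names `fails_basic_rules` and `longest_run`, and the `max_run` threshold, are hypothetical and not part of the crate. The 10-character minimum is the only value taken from the thread.

```rust
/// Hypothetical pre-filter, meant to run before any classifier is consulted.
/// Returns true when the message should be rejected outright.
fn fails_basic_rules(message: &str, min_chars: usize, max_run: usize) -> bool {
    let trimmed = message.trim();

    // Rule 1: reject messages shorter than `min_chars` characters.
    if trimmed.chars().count() < min_chars {
        return true;
    }

    // Rule 2: reject messages where one character repeats more than
    // `max_run` times in a row (assumed threshold, tune to your data).
    longest_run(trimmed) > max_run
}

/// Length of the longest run of identical consecutive characters.
fn longest_run(text: &str) -> usize {
    let mut longest = 0;
    let mut current = 0;
    let mut previous: Option<char> = None;

    for c in text.chars() {
        if Some(c) == previous {
            current += 1;
        } else {
            current = 1;
            previous = Some(c);
        }
        if current > longest {
            longest = current;
        }
    }
    longest
}
```

Calling `fails_basic_rules(text, 10, 5)` before handing the text to the classifier would apply both rules; messages that pass would still go through the trained model.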
In this example I'm training a new model:
Classifier::new(file_path, true)
classifier.save(file_path)
I didn't look too closely, but wouldn't something like this be enough:
Classifier::new(file_path) <- default to new model
classifier.save() <- no need to supply path
If I have time, I'll provide a PR.
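For illustration, the shape of that suggestion (a constructor that falls back to a fresh model, and a `save()` that reuses the stored path) might look like the sketch below. `PathBackedModel` is a hypothetical stand-in, not the crate's `Classifier`; the tab-separated file format and the token counting are assumptions made only to keep the example self-contained.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a classifier that remembers where it lives on disk.
struct PathBackedModel {
    path: PathBuf,
    token_counts: HashMap<String, u32>,
}

impl PathBackedModel {
    /// Loads the model from `path` if the file exists, otherwise starts empty.
    fn new(path: impl AsRef<Path>) -> io::Result<Self> {
        let path = path.as_ref().to_path_buf();
        let mut token_counts = HashMap::new();

        if path.exists() {
            // Assumed format: one "token<TAB>count" pair per line.
            for line in fs::read_to_string(&path)?.lines() {
                if let Some((token, count)) = line.split_once('\t') {
                    if let Ok(count) = count.parse::<u32>() {
                        token_counts.insert(token.to_string(), count);
                    }
                }
            }
        }
        Ok(Self { path, token_counts })
    }

    /// Counts lowercase whitespace-separated tokens (placeholder for real training).
    fn train(&mut self, text: &str) {
        for token in text.split_whitespace() {
            *self.token_counts.entry(token.to_lowercase()).or_insert(0) += 1;
        }
    }

    /// Writes the model back to the path given at construction time.
    fn save(&self) -> io::Result<()> {
        let mut out = String::new();
        for (token, count) in &self.token_counts {
            out.push_str(&format!("{}\t{}\n", token, count));
        }
        fs::write(&self.path, out)
    }
}
```

With this shape the caller passes the path once; whether the real crate can adopt it depends on details of its API that are not shown in this thread.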