jeffheaton / encog-dotnet-core

http://www.heatonresearch.com/encog

Text classification to .NET #107

Open silvacaio opened 7 years ago

silvacaio commented 7 years ago

I am building software to classify texts about health. I have many texts labeled "positive" or "negative", and these are my training data. The user will enter new texts, and my software will evaluate each one and say whether it is "positive" or "negative". Today I use Weka with Naïve Bayes for this, but I would like a framework specific to .NET.

I found the code below, which covers a similar case, but I have not found the "BagOfWords" class for .NET.

https://github.com/encog/encog-java-examples/blob/master/src/main/java/org/encog/examples/ml/bayesian/BayesianSpam.java

Is it possible to use Encog for this? How?

Thanks

YuriyZaletskyy commented 7 years ago

So, try to make software that imitates the behavior of a doctor. What usually happens when a person visits a doctor? The person describes how they feel, and the doctor asks some additional questions and, after some additional analysis, gives a diagnosis. I recommend that you not just allow the user to enter any text, but also add a questionnaire; it will improve the performance of your app.

silvacaio commented 7 years ago

YuriyZaletskyy, I really like your suggestion and I will think about it.

About the text classification, do you know whether it is possible with Encog? I need this because it is a teacher's requirement for my school paper.

Thanks

YuriyZaletskyy commented 7 years ago

As you already mentioned, text classification is usually implemented with a technique known as bag of words. In the example you mentioned, Encog uses bag of words to decide whether a text belongs to the class spam or not. If your task sounds like "detect whether a word description belongs to disease A", then you can use Encog, and you have actually almost completed your school paper. Note that Encog uses the BayesianNetwork class for this task. If you need a neural network approach, consider recurrent neural networks (they are good for sequences). Encog also has some RNNs implemented. If you need something more sophisticated (LSTM), I recommend you consider deeplearning4j.
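To make the bag-of-words idea concrete, here is a minimal sketch of a naive Bayes text classifier with add-k smoothing. It is written in Python for brevity and does not use Encog or reproduce its API; the class name `NaiveBayesTextClassifier`, the `tokenize` helper, and the four training sentences are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Hypothetical multinomial naive Bayes over bag-of-words counts."""

    def __init__(self, k=1.0):
        self.k = k                    # add-k (Laplace) smoothing constant
        self.word_counts = {}         # label -> Counter of word frequencies
        self.doc_counts = Counter()   # label -> number of training texts
        self.vocabulary = set()       # every word seen during training

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def log_scores(self, text):
        """Return log P(label) + sum of log P(word | label) for each label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = tokenize(text)
        scores = {}
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            denominator = sum(counts.values()) + self.k * vocab_size
            for word in words:
                # Smoothing keeps unseen words from zeroing the whole product.
                score += math.log((counts[word] + self.k) / denominator)
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for this sketch.
classifier = NaiveBayesTextClassifier(k=1.0)
classifier.train("patient reports severe pain and fever", "negative")
classifier.train("symptoms got worse after treatment", "negative")
classifier.train("patient recovered and feels healthy", "positive")
classifier.train("treatment worked and pain is gone", "positive")

print(classifier.classify("fever and severe pain"))
print(classifier.classify("patient feels healthy"))
```

Working in log space avoids numeric underflow when texts are long, and the smoothing constant `k` plays the role that Laplace smoothing usually plays in spam-filter style examples. With real data you would also want a held-out test set before trusting the labels.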

silvacaio commented 7 years ago

YuriyZaletskyy, for now I need only this: "detect whether a word description belongs to disease A".

But the example is in Java, and when I ported it to C#, I could not find BagOfWords. Do you know whether it is implemented or not?

Thanks for the help, YuriyZaletskyy
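If a bag-of-words helper turns out to be missing from the .NET port, it is small enough to write by hand. The sketch below is a hypothetical stand-in, not a copy of Encog's Java class: the method names (`process`, `count`, `contains`, `unique_words`, `probability`) and the smoothing parameter `k` are assumptions chosen for illustration. It is shown in Python, but the same structure should port to C# with a `Dictionary<string, int>`.

```python
import re
from collections import Counter


class BagOfWords:
    """Hypothetical word-count container with optional add-k smoothing."""

    def __init__(self, k=0):
        self.k = k                # smoothing constant; 0 disables smoothing
        self.counts = Counter()   # word -> number of occurrences
        self.total_words = 0      # total tokens processed so far

    def process(self, text):
        """Tokenize the text and add its words to the bag."""
        words = re.findall(r"[a-z]+", text.lower())
        self.counts.update(words)
        self.total_words += len(words)

    def count(self, word):
        return self.counts[word.lower()]

    def contains(self, word):
        return word.lower() in self.counts

    def unique_words(self):
        return len(self.counts)

    def probability(self, word):
        """Smoothed relative frequency of the word within this bag."""
        denominator = self.total_words + self.k * self.unique_words()
        if denominator == 0:
            return 0.0
        return (self.count(word) + self.k) / denominator


# Example text invented for this sketch.
bag = BagOfWords(k=1)
bag.process("Fever and cough. Fever again!")
print(bag.count("fever"), bag.total_words, bag.unique_words())
print(bag.probability("fever"), bag.probability("headache"))
```

One bag per class (for example one for "positive" texts and one for "negative" texts) gives the per-class word probabilities that a Bayesian classifier needs.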

jeroldhaas commented 7 years ago

Wouldn't a SOM (self-organizing map) do well with this, for a start?

Cheers, Jerold Haas
