Open jayneeee opened 7 years ago
I wasn't able to reproduce this error. The collections library is imported in the file so there shouldn't be an import error, and the line indicated in the traceback isn't calling collections. Let me know if you have more details about the nature of the error.
Hi, I just solved the issue by changing "import collections" to "from collections import Counter". However, I have another question regarding the input data format: does it have to be a matrix?
Data points are expected to be represented via sparse encodings. They are assumed to be dictionaries, where keys (any hashable data type) are feature names and the values are the feature values. The training data matrix is expected to be a list of such dictionaries.
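For illustration, here is a minimal sketch of what such a list of dictionaries could look like. The feature names and values are made up, and `is_valid_dataset` is a hypothetical helper written for this example, not part of the library.

```python
# Each data point is a dict: keys are feature names (any hashable type),
# values are the feature values. The feature names below are invented.
training_data = [
    {"color": "red", "height_cm": 12.5, "num_petals": 5},
    {"color": "blue", "height_cm": 9.1},  # assumption: a sparse point may omit features
    {"color": "red", "num_petals": 4},
]


def is_valid_dataset(data):
    """Hypothetical check: the data set is a list whose items are all dicts."""
    return isinstance(data, list) and all(isinstance(point, dict) for point in data)


print(is_valid_dataset(training_data))
```

Whether omitted features are handled the way you expect depends on the distribution used for that feature, so it is worth checking against the code before relying on it.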
Hi ashkonf,
Thanks for your prompt reply. One more question: for the multinomial distribution NB, how does it differ from sklearn's multinomial NB? From what I understand, sklearn takes any discrete number as input. Also, do you apply Laplace smoothing in your NB classifier?
Thanks, Jieyan
The primary differentiators between this implementation and the Scikit Learn implementation are the following:
The Scikit Learn implementation of the Multinomial Naive Bayes model doesn't support continuous features, and models discrete features as unordered categorical variables rather than numerically ordered variables.
This implementation does support Laplace smoothing. See the smoothingFactor parameter of the Multinomial distribution class.
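As a rough illustration of what Laplace (additive) smoothing does, here is a generic sketch. It is not this repository's code: the function name and signature are invented for the example, the `smoothing_factor` argument only mirrors the idea behind the smoothingFactor parameter, and it assumes every observation appears in `vocabulary`.

```python
from collections import Counter


def smoothed_probabilities(observations, vocabulary, smoothing_factor=1.0):
    """Estimate P(value) with additive smoothing.

    Assumption: every item in `observations` is also in `vocabulary`.
    """
    counts = Counter(observations)
    # Each value in the vocabulary gets `smoothing_factor` pseudo-counts.
    total = len(observations) + smoothing_factor * len(vocabulary)
    return {
        value: (counts[value] + smoothing_factor) / total
        for value in vocabulary
    }


probs = smoothed_probabilities(["a", "a", "b"], ["a", "b", "c"])
print(probs)
```

The point of smoothing is that a value never seen in training ("c" here) still gets a small non-zero probability, so a single unseen feature value does not zero out the whole product of likelihoods.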
Hi,
Thanks again for your explanation. However, I'm having trouble converting my data into the required format. I'm new to this area. I've worked with dictionaries before, but how should I convert my CSV data or DataFrame into the format the code expects?
Thanks.
If your data is in matrix format (which data in a CSV file would be), there is a fairly easy way to transform it into dictionary format. Simply assign each column of your matrix a name. Then create one dictionary per row of your matrix, assigning the row's values to keys corresponding to column names. The list of such dictionaries will be your sparse matrix representation.
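The steps above could be sketched roughly as follows using Python's standard `csv` module. The column names and sample rows are made up, `parse_value` and `rows_to_dicts` are hypothetical helpers, and the choices to convert numeric-looking strings to floats and to drop empty cells are assumptions you may want to change for your data.

```python
import csv
import io


def parse_value(text):
    """Turn numeric-looking strings into floats; leave other strings as-is."""
    try:
        return float(text)
    except ValueError:
        return text


def rows_to_dicts(csv_file):
    """Build one dict per CSV row, keyed by the column names in the header row."""
    reader = csv.DictReader(csv_file)
    return [
        # Assumption: empty or missing cells are left out of the dictionary.
        {name: parse_value(value) for name, value in row.items()
         if value not in ("", None)}
        for row in reader
    ]


# Stand-in for open("your_file.csv", newline=""); the contents are invented.
sample_csv = io.StringIO("color,height_cm,num_petals\nred,12.5,5\nblue,9.1,\n")
data = rows_to_dicts(sample_csv)
print(data)
```

If the data is already in a pandas DataFrame, `df.to_dict(orient="records")` produces the same list-of-dictionaries shape, although missing cells then show up as NaN values instead of being omitted.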