I have used magpie for multi-label text classification before and found it to be a powerful tool.
Recently, I tried to use magpie for binary text classification. I have 217 cases in total, split 4:1 into training and test sets. However, I got output like this:
answerICD= tumor positive,,
2
Predict: 0 tumor positive
tumor positive 0.51141936
tumor negative 0.5093596
answerICD= tumor positive,,
2
Predict: 0 tumor positive
tumor positive 0.51141936
tumor negative 0.5093596
answerICD= tumor negative,
1
tumor positive 0.51141936
tumor negative 0.5093596
answerICD= tumor negative,
1
tumor positive 0.51141936
tumor negative 0.5093596
As you can see, it outputs the same probabilities for the two labels in every test case. This result is pretty strange, so I want to ask: is there any suggestion or explanation for this output?
Thanks for your patience in reading this!
Can you describe a bit more about how you generated your test/label cases?
On Tue, Sep 25, 2018 at 9:58 AM Jessica10105009 notifications@github.com wrote:
--
Edan Krolewicz
Thanks for your reply. I found that I had not preprocessed the line breaks in my texts, so only the first line of each text was read. That first line had the same content in every case, which is why every case got the same probabilities. It has been solved now. Sorry for the inconvenience, and thanks for your kind reply.
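For anyone hitting the same symptom, here is a minimal sketch of the kind of line-break preprocessing described above. It is not part of magpie: `flatten_line_breaks` and `flatten_directory` are hypothetical helper names, and the sketch assumes the documents are UTF-8 `.txt` files in a single directory. It only rewrites each text onto one line, so whatever reads the first line sees the whole document.

```python
import re
from pathlib import Path


def flatten_line_breaks(text: str) -> str:
    """Collapse line breaks and other whitespace runs into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def flatten_directory(src_dir: str, dst_dir: str) -> int:
    """Write a single-line copy of every .txt file in src_dir into dst_dir.

    Hypothetical helper: assumes UTF-8 input. Other files (for example
    label files) are not touched. Returns the number of files written.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(src.glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        (dst / path.name).write_text(flatten_line_breaks(text), encoding="utf-8")
        count += 1
    return count
```

A quick sanity check before training is to compare the first line of several documents. If they are all identical, as they were in this case, the model has nothing to distinguish them by.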
This is happening to me as well. I'm trying to classify policy numbers and account numbers, both of which are alphanumeric. I trained the model, and I always get the same probabilities! Since each .txt file contains just one word, I changed the minimum number of words in word2vec to 1. Am I doing something wrong?
As mentioned in #158, Magpie is not of much help if your document contains only one word.