Closed by DecentMakeover 4 years ago
The fastText text classification format expects one instance per line, but your file still has line breaks inside the documents. Your model sees only the text on the lines starting with __label__; everything else (i.e. most of each document) is ignored. Try replacing each \n inside a document with a single space.
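To illustrate the suggestion above, here is a minimal sketch in Python. The helper name to_fasttext_line and the two sample documents are made up for illustration and are not taken from the original file. Note that it collapses every run of whitespace, not only \n, into a single space, which is slightly more than a plain replace.

```python
import re


def to_fasttext_line(label: str, text: str) -> str:
    """Build one training line: '__label__<label> <text>' with no line breaks.

    Hypothetical helper for illustration; not part of the fastText API.
    """
    # Collapse every run of whitespace (\n, \r, \t, repeated spaces) into one space.
    flat = re.sub(r"\s+", " ", text).strip()
    return f"__label__{label} {flat}"


# Made-up sample documents that still contain line breaks.
docs = [
    ("sci.space", "The shuttle launch\nwas delayed.\n\nNew date is unknown."),
    ("rec.autos", "Looking for advice\non a used car."),
]

lines = [to_fasttext_line(label, text) for label, text in docs]
for line in lines:
    print(line)
```

Each document then ends up on exactly one line, so the whole text is attached to its label instead of only the first line. Writing the lines to the training file with "\n".join(lines) should be enough.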
@severinsimmler Okay, thanks, I'll check that.
But isn't it supposed to error out if the file does not have one instance per line?
I think each line that does not start with __label__ is treated as not relevant, e.g. like a comment, so raising an error is probably not the expected behavior.
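Since no error is raised, one option is to check the file yourself before training. The following is only a sketch: the function name find_unlabeled_lines and the sample lines are hypothetical, and it relies on the assumption stated above that every instance should start with __label__.

```python
def find_unlabeled_lines(lines, prefix="__label__"):
    """Return 1-based numbers of non-empty lines that do not start with the label prefix."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if line.strip() and not line.startswith(prefix)
    ]


# Made-up sample: one document accidentally split over several lines.
sample = [
    "__label__sci.space The shuttle launch",
    "was delayed.",
    "",
    "New date is unknown.",
    "__label__rec.autos Looking for advice on a used car.",
]

suspicious = find_unlabeled_lines(sample)
print(suspicious)
```

A non-empty result suggests that some documents still contain line breaks and need to be flattened first.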
Okay, thanks for the help.
Hi, thanks for sharing your work. I am trying to run text classification on the 20newsgroup dataset, but the f-score does not go higher than 60. I just wanted to check whether I have formatted the labels correctly. Below I have posted the first few elements of the dataset from my CSV. Could anyone comment on this?