When I re-run the existing system on 20news, I find that the input text looks like this: "Newsgroups: rec.motorcycles\nPath: cantaloupe.srv.cs.cmu.edu!ro ...etc".
Am I right that you do not discard the header (which often contains the name of the newsgroup label) during data processing?
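
To make clear what I mean by discarding the header, here is a minimal sketch. It assumes each raw post follows the usual news-message layout, where the header block ends at the first blank line. `strip_header` is a hypothetical helper I wrote for illustration, not something from your code, and the sample post is made up.

```python
def strip_header(raw: str) -> str:
    """Return the post body, dropping everything up to the first blank line.

    Assumption: the header block and the body are separated by one blank line.
    """
    head, sep, body = raw.partition("\n\n")
    # No blank line found: assume there is no header block and keep the text unchanged.
    return body if sep else raw


# Made-up sample post, not taken from the dataset.
sample = (
    "Newsgroups: rec.motorcycles\n"
    "Subject: example subject\n"
    "\n"
    "Body text of the post."
)

print(strip_header(sample))
```

With something like this applied, the "Newsgroups:" line would no longer be visible to the classifier.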