Hello, I am confused about what form each sentence should take. For example, take the sentence "It is important for machine learning." After tokenizing and lowercasing, I have the list [ it, is, important, for, machine learning ]. I then want to write this result to a file, one sentence per line. Should the line be "it, is, important, for, machine learning"? In other words, should the words be separated by commas?
The tokens should be separated by spaces, not commas. If you want to form phrases such as 'machine learning', you can join their words with a character such as '_': it is important for machine_learning
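To make the format concrete, here is a minimal Python sketch of one way to produce such a file. Everything in it is an assumption for illustration: the names `PHRASES`, `tokenize`, `join_phrases`, `to_line` and `write_corpus` are hypothetical, the regex tokenizer is a simple stand-in for whatever tokenizer you already use, and the phrase list is hard-coded rather than detected automatically.

```python
import re

# Hypothetical phrase list; in practice it would come from your own
# phrase-detection step or a hand-made vocabulary.
PHRASES = [("machine", "learning")]


def tokenize(sentence):
    """Lowercase the sentence and return its word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def join_phrases(tokens, phrases=PHRASES, sep="_"):
    """Merge each known multi-word phrase into a single token joined by `sep`."""
    out = []
    i = 0
    while i < len(tokens):
        for phrase in phrases:
            n = len(phrase)
            if tuple(tokens[i:i + n]) == phrase:
                out.append(sep.join(phrase))
                i += n
                break
        else:
            # No phrase starts at this position, so keep the token as-is.
            out.append(tokens[i])
            i += 1
    return out


def to_line(sentence):
    """Return the sentence as one line of space-separated tokens."""
    return " ".join(join_phrases(tokenize(sentence)))


def write_corpus(sentences, path):
    """Write one sentence per line, tokens separated by single spaces."""
    with open(path, "w", encoding="utf-8") as f:
        for sentence in sentences:
            f.write(to_line(sentence) + "\n")
```

With this sketch, `to_line("It is important for machine learning.")` gives `it is important for machine_learning`, and `write_corpus(sentences, "corpus.txt")` writes one such line per sentence (the file name is only an example).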