Closed secsilm closed 5 years ago
+1 would also be interested in auto text classifier
I am also interested in a text classifier.
same here!
Thank you for the suggestion. We are working on this feature now. It will be online soon.
Thanks.
TextClassifier
Text classification. The input is a set of strings. Each string is an article, a sentence, or a paragraph in English. The output is a single class label for each string.
Accuracy
Penn Treebank
NA
NA
NA
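To make the task description above concrete, here is a minimal sketch of the input/output contract: a list of strings goes in, and one class label per string comes out. The class `ToyTextClassifier`, its `fit`/`predict` methods, and the tiny dataset are all hypothetical and only for illustration. It uses a small Naive Bayes model in pure Python and is not the AutoKeras TextClassifier or its API.

```python
import math
from collections import Counter, defaultdict


class ToyTextClassifier:
    """Hypothetical stand-in for illustration, NOT the AutoKeras API."""

    def fit(self, texts, labels):
        # Count how often each word appears in each class.
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, texts):
        # One label per input string.
        return [self._predict_one(text) for text in texts]

    def _predict_one(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus log likelihood with add-one smoothing.
            score = math.log(n_docs / total)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, purely for demonstration.
train_texts = [
    "a great and moving film",
    "great acting",
    "a dull and boring film",
    "boring plot",
]
train_labels = ["pos", "pos", "neg", "neg"]

clf = ToyTextClassifier().fit(train_texts, train_labels)
predictions = clf.predict(["great film", "boring film"])
print(predictions)
```

The point is only the shape of the data: `predict` returns a list with exactly one label for each input string.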
@boyuangong Please write a baseline for the TextClassifier. It needs to extend the base class in supervised.py and reuse the ModelTrainer class in utils.py. For now, it doesn't need to search over multiple architectures. Just use a default architecture.
We can have a meeting to discuss if you have questions. Thanks.
Re: TextClassifier benchmarks: I would suggest starting with a document/sentence-level classification model (not word-level/many-to-many), e.g. IMDB sentiment. It is a common benchmark for sentiment classification of short documents and was used to benchmark ULMFiT, amongst others. https://github.com/keras-team/keras/blob/master/keras/datasets/imdb.py http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html
@ddofer Thank you for your suggestions.
When do you expect to release the initial TextClassifier model?
Hi,
The model is expected to be released around next week. I am on vacation in my home country right now. Sorry for any inconvenience.
-- Best Boyuan Gong
On Aug 29, 2018 at 7:26 AM, ziyadi wrote:
When do you expect to release the initial TextClassifier model?
@boyuangong I can help you out with that module if that's okay.
Hi there,
Thanks a lot. I will send the pull request for the newest version tonight when I get back home. The model is pretty much complete now. It only has some out-of-memory problems. It would be great to receive help from you.
-- Best Boyuan Gong
On Sep 15, 2018 at 1:56 PM, Muhammad wrote:
@boyuangong (https://github.com/boyuangong) I can help you out with that module if that's okay.
@boyuangong is this live yet? Is there a branch where the code lives currently? Very excited to see your work!
Thanks for your interest in our package! This work has now moved to [WIP] textClassifier WIP #199. You can switch to the textClassifier branch for the latest code. It’s working now, but you may need to choose the input size carefully (you can customize the input length in constant.py) to prevent out-of-memory issues. I am currently waiting for code review and also working on test coverage before merging it into the master branch.
Best, Boyuan
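As a rough illustration of why capping the input length helps with memory, the sketch below truncates or pads every tokenized text to one fixed length, so a single very long document cannot inflate the batch. The names `MAX_LEN`, `PAD_ID`, and `to_fixed_length` are hypothetical and are not taken from constant.py or the textClassifier branch.

```python
# Hypothetical values for illustration; the real setting lives in constant.py.
MAX_LEN = 8
PAD_ID = 0


def to_fixed_length(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Truncate to max_len tokens, then right-pad with pad_id up to max_len."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


# One short text and one long text, already mapped to integer token ids.
batch = [[5, 3, 9], list(range(1, 21))]
fixed = [to_fixed_length(seq) for seq in batch]

for row in fixed:
    print(row)
```

With every row the same length, the memory needed per batch grows with `batch_size * MAX_LEN` rather than with the longest document in the data.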
Woooooo!
Hi. Great, what is the status?
Some questions:
- What is the maximum length of each text we want to classify? For example, can we process an average email?
- How many classes can we feed into the engine while keeping accuracy reasonable and false positives low? 20, 50? I know many older classifiers lose a lot of accuracy when there are too many classes, while recent ones such as the open-source one from Facebook can handle many more. But I don't want to use anything from Facebook.
- What is the format of the data to feed in for training and then testing? I haven't seen any docs or tutorials yet. Are there any?
Thanks, great job.
@renaudham Hi, thanks for your interest in our work. We are preparing tutorial documents for some modules, including the textClassifier. You can check them directly in the new pull request 341. The text tutorial is in text.md.
Also, if you want to try the textClassifier, you can check the example in examples/text_cnn.
Please feel free to let me know if you have further questions.
I see ImageClassifier in the doc. Is there something like TextClassifier to process text?