Open mratsim opened 5 years ago
Hi Mamy~!
What kind of example are you looking for? I'm pretty interested in helping with this. Could you provide any more details on what you envision out of this?
Also, I made a naive hashing vectorizer implementation for a Nim demo at work, using Arraymancer of course. It might be somewhat related: https://github.com/metasyn/nim-vectorizer-splunk/tree/master/src
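For readers unfamiliar with the idea, here is a minimal sketch of the hashing trick behind a hashing vectorizer. It is written in Python for illustration only and is not the implementation in the linked repository; the function name `hashing_vectorize`, the `n_features` default, and the choice of MD5 as a stable hash are all assumptions made for this example.

```python
import hashlib

def hashing_vectorize(text, n_features=16):
    """Map a text to a fixed-size count vector without storing a vocabulary."""
    vec = [0] * n_features
    for tok in text.lower().split():
        # Use a stable hash: Python's built-in hash() is salted per process.
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "little") % n_features
        vec[idx] += 1  # distinct tokens may collide in the same bucket
    return vec

features = hashing_vectorize("the movie was great the end")
```

The vector length is fixed up front, so unseen words never change the feature size; the cost is that different tokens can share a bucket.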
It could be sentiment analysis on IMDB (positive/negative), like https://www.kaggle.com/c/word2vec-nlp-tutorial.
Or, for example, identifying the author of a short snippet: https://www.kaggle.com/c/spooky-author-identification.
I.e. something short, ideally the tokenizer can just be splitWhitespace.
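To make the "tokenizer can just be whitespace splitting" idea concrete, here is a rough sketch of the preprocessing such an example would need before an embedding layer: split on whitespace, build a vocabulary, and map each text to padded integer ids. It is in Python for illustration only, not Arraymancer code; the helper names and the `<pad>`/`<unk>` conventions are assumptions for this sketch.

```python
from collections import Counter

def tokenize(text):
    # Lowercase, then split on any run of whitespace.
    return text.lower().split()

def build_vocab(texts, min_count=1):
    counts = Counter(tok for t in texts for tok in tokenize(t))
    vocab = {"<pad>": 0, "<unk>": 1}  # reserved ids (assumed convention)
    for tok, c in sorted(counts.items()):  # sorted for deterministic ids
        if c >= min_count:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab, max_len):
    # Unknown tokens map to <unk>; sequences are truncated or right-padded.
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(text)][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))

reviews = ["a great movie", "a dull movie"]
vocab = build_vocab(reviews)
ids = encode("a great film", vocab, max_len=5)
```

The resulting fixed-length id sequences are what an embedding layer followed by a recurrent layer would typically consume.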
On the tasks to implement this:
nn folder
nn_dsl
Dataset + Downloader = https://github.com/mratsim/Arraymancer/pull/317
We have an embedding layer (#312) and a GRU with sequence support (#283).
We are missing a dataset and an NLP example. The IMDB dataset is probably the one to have first: http://ai.stanford.edu/~amaas/data/sentiment/
Alternatively, we can use a character-level RNN instead of a word-level RNN, which avoids the tokenizer issue (#316).
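As a rough illustration of why the character-level route sidesteps tokenization, the input encoding reduces to a per-character lookup. The sketch below is Python for illustration only, not Arraymancer code; the function names and the choice of id 0 for unseen characters are assumptions.

```python
def build_char_vocab(texts):
    # One id per distinct character; 0 is reserved for unseen characters.
    chars = sorted(set("".join(texts)))
    return {ch: i + 1 for i, ch in enumerate(chars)}

def encode_chars(text, char_to_id):
    return [char_to_id.get(ch, 0) for ch in text]

char_to_id = build_char_vocab(["ab", "ba c"])
encoded = encode_chars("cab z", char_to_id)
```

The trade-off is longer sequences: a review becomes thousands of steps for the RNN instead of a few hundred word ids.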